Cross Reference: /netbsd-current/sys/kern/sys

History log of /netbsd-current/sys/kern/sys_futex.c
Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# 1.20	11-Apr-2024	riastradh	sys_futex.c: Fix illustration of futex(2). In this illustration, we need to _set_ bit 1 to claim ownership, not _clear_ bit 1 to claim ownership. No functional change intended -- comment only.
Revision tags: thorpej-ifq-base thorpej-altq-separation-base
# 1.19	24-Feb-2023	riastradh	kern: Eliminate most __HAVE_ATOMIC_AS_MEMBAR conditionals. I'm leaving in the conditional around the legacy membar_enters (store-before-load, store-before-store) in kern_mutex.c and in kern_lock.c because they may still matter: store-before-load barriers tend to be the most expensive kind, so eliding them is probably worthwhile on x86. (It also may not matter; I just don't care to do measurements right now, and it's a single valid and potentially justifiable use case in the whole tree.) However, membar_release/acquire can be mere instruction barriers on all TSO platforms including x86, so there's no need to go out of our way with a bad API to conditionalize them. If the procedure call overhead is measurable we just could change them to be macros on x86 that expand into __insn_barrier. Discussed on tech-kern: https://mail-index.netbsd.org/tech-kern/2023/02/23/msg028729.html
Revision tags: netbsd-10-0-RELEASE netbsd-10-0-RC6 netbsd-10-0-RC5 netbsd-10-0-RC4 netbsd-10-0-RC3 netbsd-10-0-RC2 netbsd-10-0-RC1 netbsd-10-base bouyer-sunxi-drm-base
# 1.18	21-Apr-2022	riastradh	futex(9): Convert membar_enter/exit to membar_acquire/release. No functional change -- this is just in an illustrative comment!
# 1.17	09-Apr-2022	riastradh	sys: Use membar_release/acquire around reference drop. This just goes through my recent reference count membar audit and changes membar_exit to membar_release and membar_enter to membar_acquire -- this should make everything cheaper on most CPUs without hurting correctness, because membar_acquire is generally cheaper than membar_enter.
# 1.16	12-Mar-2022	riastradh	sys: Membar audit around reference count releases. If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object. Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing. Consider: Thread A Thread B obj->foo = 42; obj->baz = 73; mumble(&obj->bar); grumble(&obj->quux); /* membar_exit(); / / membar_exit(); / atomic_dec -- not last atomic_dec -- last / membar_enter(); / KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); The memory barriers ensure that obj->foo = 42; mumble(&obj->bar); in thread A happens before KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); in thread B. Without them, this ordering is not guaranteed. So in general it is necessary to do membar_exit(); if (atomic_dec_uint_nv(&obj->refcnt) != 0) return; membar_enter(); to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).) I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability. I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_, making the naming annoying) and get rid of all the ifdefs.
# 1.15	01-Nov-2021	chs	fix a typo in compare_futex_key().
# 1.14	21-Oct-2021	andvar	fix various typos, mainly in comments, but also in man pages and log messages.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.19	24-Feb-2023	riastradh	kern: Eliminate most __HAVE_ATOMIC_AS_MEMBAR conditionals. I'm leaving in the conditional around the legacy membar_enters (store-before-load, store-before-store) in kern_mutex.c and in kern_lock.c because they may still matter: store-before-load barriers tend to be the most expensive kind, so eliding them is probably worthwhile on x86. (It also may not matter; I just don't care to do measurements right now, and it's a single valid and potentially justifiable use case in the whole tree.) However, membar_release/acquire can be mere instruction barriers on all TSO platforms including x86, so there's no need to go out of our way with a bad API to conditionalize them. If the procedure call overhead is measurable we just could change them to be macros on x86 that expand into __insn_barrier. Discussed on tech-kern: https://mail-index.netbsd.org/tech-kern/2023/02/23/msg028729.html
Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.18	21-Apr-2022	riastradh	futex(9): Convert membar_enter/exit to membar_acquire/release. No functional change -- this is just in an illustrative comment!
# 1.17	09-Apr-2022	riastradh	sys: Use membar_release/acquire around reference drop. This just goes through my recent reference count membar audit and changes membar_exit to membar_release and membar_enter to membar_acquire -- this should make everything cheaper on most CPUs without hurting correctness, because membar_acquire is generally cheaper than membar_enter.
# 1.16	12-Mar-2022	riastradh	sys: Membar audit around reference count releases. If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object. Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing. Consider: Thread A Thread B obj->foo = 42; obj->baz = 73; mumble(&obj->bar); grumble(&obj->quux); /* membar_exit(); / / membar_exit(); / atomic_dec -- not last atomic_dec -- last / membar_enter(); / KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); The memory barriers ensure that obj->foo = 42; mumble(&obj->bar); in thread A happens before KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); in thread B. Without them, this ordering is not guaranteed. So in general it is necessary to do membar_exit(); if (atomic_dec_uint_nv(&obj->refcnt) != 0) return; membar_enter(); to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).) I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability. I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_, making the naming annoying) and get rid of all the ifdefs.
# 1.15	01-Nov-2021	chs	fix a typo in compare_futex_key().
# 1.14	21-Oct-2021	andvar	fix various typos, mainly in comments, but also in man pages and log messages.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.18	21-Apr-2022	riastradh	futex(9): Convert membar_enter/exit to membar_acquire/release. No functional change -- this is just in an illustrative comment!
# 1.17	09-Apr-2022	riastradh	sys: Use membar_release/acquire around reference drop. This just goes through my recent reference count membar audit and changes membar_exit to membar_release and membar_enter to membar_acquire -- this should make everything cheaper on most CPUs without hurting correctness, because membar_acquire is generally cheaper than membar_enter.
# 1.16	12-Mar-2022	riastradh	sys: Membar audit around reference count releases. If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object. Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing. Consider: Thread A Thread B obj->foo = 42; obj->baz = 73; mumble(&obj->bar); grumble(&obj->quux); /* membar_exit(); / / membar_exit(); / atomic_dec -- not last atomic_dec -- last / membar_enter(); / KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); The memory barriers ensure that obj->foo = 42; mumble(&obj->bar); in thread A happens before KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); in thread B. Without them, this ordering is not guaranteed. So in general it is necessary to do membar_exit(); if (atomic_dec_uint_nv(&obj->refcnt) != 0) return; membar_enter(); to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).) I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability. I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_, making the naming annoying) and get rid of all the ifdefs.
# 1.15	01-Nov-2021	chs	fix a typo in compare_futex_key().
# 1.14	21-Oct-2021	andvar	fix various typos, mainly in comments, but also in man pages and log messages.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.17	09-Apr-2022	riastradh	sys: Use membar_release/acquire around reference drop. This just goes through my recent reference count membar audit and changes membar_exit to membar_release and membar_enter to membar_acquire -- this should make everything cheaper on most CPUs without hurting correctness, because membar_acquire is generally cheaper than membar_enter.
# 1.16	12-Mar-2022	riastradh	sys: Membar audit around reference count releases. If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object. Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing. Consider: Thread A Thread B obj->foo = 42; obj->baz = 73; mumble(&obj->bar); grumble(&obj->quux); /* membar_exit(); / / membar_exit(); / atomic_dec -- not last atomic_dec -- last / membar_enter(); / KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); The memory barriers ensure that obj->foo = 42; mumble(&obj->bar); in thread A happens before KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); in thread B. Without them, this ordering is not guaranteed. So in general it is necessary to do membar_exit(); if (atomic_dec_uint_nv(&obj->refcnt) != 0) return; membar_enter(); to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).) I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability. I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_, making the naming annoying) and get rid of all the ifdefs.
# 1.15	01-Nov-2021	chs	fix a typo in compare_futex_key().
# 1.14	21-Oct-2021	andvar	fix various typos, mainly in comments, but also in man pages and log messages.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.16	12-Mar-2022	riastradh	sys: Membar audit around reference count releases. If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object. Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing. Consider: Thread A Thread B obj->foo = 42; obj->baz = 73; mumble(&obj->bar); grumble(&obj->quux); /* membar_exit(); / / membar_exit(); / atomic_dec -- not last atomic_dec -- last / membar_enter(); / KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); The memory barriers ensure that obj->foo = 42; mumble(&obj->bar); in thread A happens before KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); in thread B. Without them, this ordering is not guaranteed. So in general it is necessary to do membar_exit(); if (atomic_dec_uint_nv(&obj->refcnt) != 0) return; membar_enter(); to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).) I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability. I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_, making the naming annoying) and get rid of all the ifdefs.
# 1.15	01-Nov-2021	chs	fix a typo in compare_futex_key().
# 1.14	21-Oct-2021	andvar	fix various typos, mainly in comments, but also in man pages and log messages.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.15	01-Nov-2021	chs	fix a typo in compare_futex_key().
# 1.14	21-Oct-2021	andvar	fix various typos, mainly in comments, but also in man pages and log messages.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.14	21-Oct-2021	andvar	fix various typos, mainly in comments, but also in man pages and log messages.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.13	28-Sep-2021	thorpej	futex_release_all_lwp(): No need to pass the "tid" argument separately; that is a vestige of an older version of the code. Also, move a KASSERT() that both futex_release_all_lwp() call sites had inside of futex_release_all_lwp() itself.
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.12	21-Jul-2021	skrll	branches: 1.12.4; need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; 1.11.6; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.12	21-Jul-2021	skrll	need <sys/param.h> for COHERENCY_UNIT Minor KNF along the way.
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.11	05-May-2020	riastradh	branches: 1.11.2; Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.11	05-May-2020	riastradh	Revert "Use cv_timedwaitclock_sig in futex." Turned out to break things; we'll do this another way.
# 1.10	05-May-2020	riastradh	Revert "Make sure futex waits never return ERESTART." Part of redoing the timedwaitclock changes, which were buggy and committed a little too fast.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.9	03-May-2020	riastradh	Make sure futex waits never return ERESTART. If the user had passed in a relative timeout, this would have the effect of waiting for the full relative time repeatedly, without regard for how much time had elapsed during the wait before a signal. In principle this may not be necessary for absolute timeouts or indefinite timeouts, but it's not clear there's an advantage; we do the same for various other syscalls like nanosleep. Perhaps in the future we can arrange to keep the state of how much time had elapsed when we restart like Linux does, but that's a much more ambitious change.
# 1.8	03-May-2020	riastradh	Use cv_timedwaitclock_sig in futex. Possible fix for hangs observed with Java under Linux emulation.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.7	28-Apr-2020	riastradh	Make FUTEX_WAIT_BITSET(bitset=0) fail with EINVAL to match Linux.
# 1.6	28-Apr-2020	riastradh	Fix waiting on a zero bitset. The logic in futex_wait assumes there are two paths out: 1. Error (signal or timeout), in which case we take ourselves off the queue. 2. Wakeup, in which case the waker takes us off the queue. But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the futex_wait_pointless_bitset test, then we will never even go to sleep, so there will be nobody to wake us as in (2), but it's not an error as in (1) either. As a result, we're left on the queue. Instead, don't bother with any of the wait machinery in that case. This does not actually match Linux semantics -- Linux returns EINVAL if bitset is zero. But let's make sure this passes the releng test rig as the tests are written now, and then fix both the logic and the tests -- this is a candidate fix for: lib/libc/sys/t_futex_ops (277/847): 20 test cases futex_basic_wait_wake_private: [6.645189s] Passed. futex_basic_wait_wake_shared: [6.572692s] Passed. futex_cmp_requeue: [4.624082s] Passed. futex_requeue: [4.427191s] Passed. futex_wait_pointless_bitset: [0.202865s] Passed. futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826 [ 9074.4340691] cpu0: Begin traceback... [ 9074.4340691] trace fp ffffc0004ceffb40 [ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160 [ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44 [ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138 [ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c [ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0 [ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.
# 1.5	28-Apr-2020	riastradh	Rename futex_get -> futex_lookup_create. Remove futex_put. Just use futex_rele instead of futex_put. There may once have been a method to the madness this alias in an early draft but there is no longer. No functional change; all names are private to sys_futex.c.
# 1.4	27-Apr-2020	riastradh	Fix races in aborted futex waits. - Re-check the wake condition in futex_wait in the event of error. => Otherwise, if futex_wait times out in cv_timedwait_sig but futex_wake wakes it while cv_timedwait_sig is still trying to reacquire fw_lock, the wake would be incorrectly accounted. - Fold futex_wait_abort into futex_wait so it happens atomically. => Otherwise, if futex_wait times out and release fw_lock, then, before futex_wait_abort reacquires the lock and removes it from the queue, the waiter could be woken by futex_wake. But once we enter futex_wait_abort, the decision to abort is final, so the wake would incorrectly accounted. - In futex_wait_abort, mark each waiter aborting while we do the lock dance, and skip over aborting waiters in futex_wake and futex_requeue. => Otherwise, futex_wake might move it to a new futex while futex_wait_abort has released all the locks -- but futex_wait_abort still has the old futex, so TAILQ_REMOVE will cross the streams and bad things will happen. - In futex_wait_abort, release the futex we moved the waiter off. => Otherwise, we would leak the futex reference acquired by futex_func_wait, in the event of aborting. (For normal wakeups, futex_wake releases the reference on our behalf.) - Consistently use futex_wait_dequeue rather than TAILQ_REMOVE so that all changes to fw_futex and the waiter queue are isolated to futex_wait_enqueue/dequeue and happen together. Patch developed with and tested by thorpej@.
# 1.3	27-Apr-2020	thorpej	We would have bigger problems if PAGE_SIZE were < sizeof(int). Remove a CTASSERT() that can't be evaluated at compile-time on all platforms.
# 1.2	26-Apr-2020	mlelstv	fix DIAGNOSTIC build
# 1.1	26-Apr-2020	thorpej	Add a NetBSD native futex implementation, mostly written by riastradh@. Map the COMPAT_LINUX futex calls to the native ones.