
History log of /openbsd-current/sys/kern/kern_srp.c
Revision	Date	Author	Comments
# 1.13	06-Dec-2020	cheloha	srp_finalize(9): tsleep(9) -> tsleep_nsec(9)

srp_finalize(9) spins until the refcount hits zero. Blocking for at least 1ms each iteration instead of blocking for at most 1 tick is sufficient. Discussed with mpi@. ok claudio@ jmatthew@
Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.12	08-Sep-2017	deraadt	If you use sys/param.h, you don't need sys/types.h
Revision tags: OPENBSD_6_1_BASE
# 1.11	15-Sep-2016	dlg	all pools have their ipl set via pool_setipl, so fold it into pool_init. the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl. most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand. the manpage and subr_pool.c bits i did myself. ok tedu@ jmatthew@

@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
Revision tags: OPENBSD_6_0_BASE
# 1.10	01-Jun-2016	dlg	add support for using SRPs without the garbage collection machinery. the gc machinery may sleep during srp_update, which makes it hard to use from an interrupt context. srp_swap simply swaps the references in an srp and relies on the caller to schedule work in a process context, where it may sleep with srp_finalize until the reference is no longer in use. our network stack currently modifies routing tables in an interrupt context, so this is built to support rtable updates in our current stack while supporting concurrent lookups. ok jmatthew@ mpi@
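The split described in revision 1.10 (swap immediately, finalize later where sleeping is allowed) can be sketched in userland C as follows. This is a minimal illustration, not the kernel code: `struct swap_sketch`, `struct deferred`, `sketch_swap` and `sketch_update` are hypothetical names, and C11 `atomic_exchange` stands in for whatever primitive the kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct srp: one atomically updated pointer. */
struct swap_sketch {
	_Atomic(void *) ref;
};

/* Hypothetical record of work the caller must finish in process context. */
struct deferred {
	void *old;	/* value that readers may still be referencing */
	int pending;	/* nonzero until the caller has finalized old */
};

/* Swap in nv and hand back the previous value; never sleeps. */
static void *
sketch_swap(struct swap_sketch *srp, void *nv)
{
	return atomic_exchange(&srp->ref, nv);
}

/* Interrupt-side update: swap now, remember what to finalize later. */
static void
sketch_update(struct swap_sketch *srp, void *nv, struct deferred *d)
{
	d->old = sketch_swap(srp, nv);
	d->pending = (d->old != NULL);
}
```

The point of the design is that the update path does no waiting at all; a later task running in process context would wait for readers of `d->old` to drain and only then free it.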
# 1.9	18-May-2016	dlg	rename srp_finalize to srp_gc_finalize
# 1.8	18-May-2016	dlg	rework the srp api so it takes an srp_ref struct that the caller provides. the srp_ref struct is used to track the location of the caller's hazard pointer so later calls to srp_follow and srp_enter already know what to clear. this in turn means most of the caveats around using srps go away. specifically, you can now:

- switch cpus while holding an srp ref
  - ie, you can sleep while holding an srp ref
- you can take and release srp refs in any order

the original intent was to simplify use of the api when dealing with complicated data structures. the caller now no longer has to track the location of the srp a value was fetched from, the srp_ref effectively does that for you.

srp lists have been refactored to use srp_refs instead of srpl_iter structs.

this is in preparation of using srps inside the ART code. ART is a complicated data structure, and lookups require overlapping holds of srp references. ok mpi@ jmatthew@
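The idea behind revision 1.8, a caller-provided handle that remembers which hazard slot it occupies, can be sketched as below. This is a simplified userland model under stated assumptions: `struct slot`, `struct ref_sketch`, `ref_enter`, `ref_leave` and `NSLOTS` are hypothetical names, and the sketch ignores concurrency entirely.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 4	/* hypothetical number of hazard slots */

struct slot {
	void *val;	/* guarded value; NULL when the slot is free */
};

/* Caller-provided handle, standing in for struct srp_ref. */
struct ref_sketch {
	struct slot *slot;	/* the slot this reference occupies */
};

/* Claim the first free slot and record it in the caller's handle. */
static void *
ref_enter(struct slot *slots, struct ref_sketch *ref, void *val)
{
	for (size_t i = 0; i < NSLOTS; i++) {
		if (slots[i].val == NULL) {
			slots[i].val = val;
			ref->slot = &slots[i];
			return val;
		}
	}
	ref->slot = NULL;	/* out of slots */
	return NULL;
}

/* Release without searching: the handle already knows what to clear. */
static void
ref_leave(struct ref_sketch *ref)
{
	assert(ref->slot != NULL);
	ref->slot->val = NULL;
	ref->slot = NULL;
}
```

Because `ref_leave` clears exactly the slot stored in the handle, two references can be released in either order without disturbing each other, which is the property the commit message lists.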
Revision tags: OPENBSD_5_9_BASE
# 1.7	23-Nov-2015	mpi	Do not include <sys/atomic.h> inside <sys/refcnt.h>. Prevent lazy developers, like David and me, from using atomic operations without including <sys/atomic.h>. ok dlg@
# 1.6	11-Sep-2015	dlg	unbreak build on UP kernels. found by deraadt@
# 1.5	11-Sep-2015	dlg	make srp use refcnts so it can use refcnt_finalize instead of sleep_setup/sleep_finish.
# 1.4	11-Sep-2015	dlg	remove some bits of srp.h i had pasted in here by accident
# 1.3	09-Sep-2015	dlg	implement a singly linked list built with SRPs. this allows us to build lists of things that can be followed by multiple cpus. ok mpi@ claudio@
# 1.2	01-Sep-2015	dlg	mattieu baptiste reported a problem with bpf+srps where the per cpu hazard pointers were becoming corrupt, leading to panics.

the problem turned out to be that bridge_input calls if_input on behalf of a hardware interface, which then calls bpf_mtap at splsoftnet, while the actual hardware nic calls if_input and bpf_mtap at splnet. the hardware interrupts ran in the middle of the bpf calls that bridge runs at softnet. this means the same srps are being entered and left on the same cpu at different ipls, which led to races because of the order of operations on the per cpu hazard pointers.

after a lot of experimentation, jmatthew@ figured out how to deal with this problem without introducing per cpu critical sections (ie, splhigh calls) in srp_enter and srp_leave, and without introducing atomic operations.

the solution is to iterate forward through the array of hazard pointers in srp_enter, and backward in srp_leave to clear. if you guarantee that you leave srps in the reverse order to entering them, then you can use the same set of SRPs at different IPLs on the same CPU.

the ordering requirement is a problem if we want to build linked data structures out of srps, because you need to hold a ref to the current element containing the next srp to use it, before giving up the current ref. we're adding srp_follow() to support taking the next ref and giving up the current one while preserving the structure of the hazard pointer list. srp_follow() does this by reusing the hazard pointer for the current reference for the next ref.

both mattieu baptiste and jmatthew@ have been hitting this pretty hard with a tweaked version of srp+bpf that uses srp_follow instead of interleaved srp_enter/srp_leave sequences. neither can reproduce the panics anymore.

thanks to mattieu for the report and tests. ok jmatthew@
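The forward/backward scan and the slot reuse in srp_follow() described in revision 1.2 can be modelled with a small single-threaded sketch. It is an illustration only: `struct hazard`, `struct srp_sketch`, `sketch_enter`, `sketch_leave`, `sketch_follow` and `NHAZARDS` are hypothetical names, and the real per-CPU layout and matching rules may differ.

```c
#include <assert.h>
#include <stddef.h>

#define NHAZARDS 4	/* hypothetical per-CPU slot count */

struct srp_sketch {
	void *ref;	/* the value published through this srp */
};

struct hazard {
	const struct srp_sketch *h_srp;	/* srp guarded; NULL when free */
	void *h_val;			/* value read through that srp */
};

/* Enter: scan forward and claim the first free slot. */
static struct hazard *
sketch_enter(struct hazard *hzrd, const struct srp_sketch *srp)
{
	for (size_t i = 0; i < NHAZARDS; i++) {
		if (hzrd[i].h_srp == NULL) {
			hzrd[i].h_srp = srp;
			hzrd[i].h_val = srp->ref;
			return &hzrd[i];
		}
	}
	return NULL;	/* out of slots */
}

/* Leave: scan backward and clear the most recent slot for this srp. */
static int
sketch_leave(struct hazard *hzrd, const struct srp_sketch *srp)
{
	for (size_t i = NHAZARDS; i-- > 0; ) {
		if (hzrd[i].h_srp == srp) {
			hzrd[i].h_srp = NULL;
			hzrd[i].h_val = NULL;
			return (int)i;
		}
	}
	return -1;	/* no matching slot */
}

/* Follow: reuse the slot held for cur to guard next instead. */
static int
sketch_follow(struct hazard *hzrd, const struct srp_sketch *cur,
    const struct srp_sketch *next)
{
	for (size_t i = NHAZARDS; i-- > 0; ) {
		if (hzrd[i].h_srp == cur) {
			hzrd[i].h_srp = next;
			hzrd[i].h_val = next->ref;
			return (int)i;
		}
	}
	return -1;	/* cur was not entered */
}
```

In this model, entering the same srp twice (as an interrupt nested inside a lower-IPL caller would) fills slots 0 and 1; the nested leave clears slot 1 first, so the outer caller's slot 0 is untouched. Following from one srp to the next rewrites a slot in place, so the stack-like order of occupied slots is preserved.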
Revision tags: OPENBSD_5_8_BASE
# 1.1	02-Jul-2015	dlg	introduce srp, which according to the manpage i wrote is short for "shared reference pointers". srp allows concurrent access to a data structure by multiple cpus while avoiding interlocking cpu opcodes. it manages its own reference counts and the garbage collection of those data structures to avoid use after frees. internally srp is a twisted version of hazard pointers, which are a relative of RCU. jmatthew wrote the bulk of a hazard pointer implementation and changed bpf to use it to allow mpsafe access to bpfilters. however, at s2k15 we were trying to apply it to other data structures but the memory overhead of every hazard pointer would have blown out significantly in several use cases. the bulk of our time at s2k15 was spent reworking hazard pointers into srp. this diff adds the srp api and adds the necessary metadata to struct cpuinfo on our MP architectures. srp on uniprocessor platforms has alternate code that is optimised because it knows there'll be no concurrent access to data by multiple cpus. srp is made available to the system via param.h, so it should be available everywhere in the kernel. the docs likely need improvement cos i'm too close to the implementation. ok mpi@