History log of /freebsd-11-stable/sys/kern/subr_taskqueue.c
Revision	Date	Author	Comments
# 354406	06-Nov-2019	mav	MFC r354241: Some more taskqueue optimizations. - Optimize enqueue for two task priority values by adding new tq_hint field, pointing to the last task inserted into the middle of the list. In case of more than two priority values it should halve average search. - Move tq_active insert/remove out of the taskqueue_run_locked loop. Instead of dirtying a few shared cache lines per task, introduce a different mechanism to drain active tasks, based on a task sequence number counter, that uses only cache lines already present in cache. Since the new mechanism does not need ordering, switch tq_active from TAILQ to LIST. - Move static and dynamic struct taskqueue fields into different cache lines. Move lock into its own cache line, so that heavy lock spinning by multiple waiting threads would not affect the running thread. - While there, correct some TQ_SLEEP() wait messages. This change fixes certain ZFS write workloads, causing huge congestion on taskqueue lock. Those workloads combine some large block writes to saturate the pool and trigger allocation throttling, which uses higher priority tasks to requeue the delayed I/Os, with many small blocks to generate deep queue of small tasks for taskqueue to sort. Sponsored by: iXsystems, Inc.
# 354405	06-Nov-2019	mav	MFC r349220: Add wakeup_any(), cheaper wakeup_one() for taskqueue(9). wakeup_one() and underlying sleepq_signal() spend additional time trying to be fair, waking thread with highest priority, sleeping longest time. But in case of taskqueue there are many absolutely identical threads, and any fairness between them is quite pointless. It makes things even worse, since round-robin wakeups not only make previous CPU affinity in scheduler quite useless, but also hide from user chance to see CPU bottlenecks, when sequential workload with one request at a time looks evenly distributed between multiple threads. This change adds new SLEEPQ_UNFAIR flag to sleepq_signal(), making it wake up the thread that went to sleep last, but no longer in context switch (to avoid immediate spinning on the thread lock). On top of that new wakeup_any() function is added, equivalent to wakeup_one(), but setting the flag. On top of that taskqueue(9) is switched to wakeup_any() to wake up its threads. As a result, on 72-core Xeon v4 machine sequential ZFS write to 12 ZVOLs with 16KB block size spends 34% less time in wakeup_any() and descendants than it was spending in wakeup_one(), and total write throughput increased by ~10% with the same as before CPU usage.
# 341154	28-Nov-2018	markj	MFC r340730, r340731: Add taskqueue_quiesce(9) and use it to implement taskq_wait(). PR: 227784
# 328392	25-Jan-2018	pkelsey	MFC of r305169: _taskqueue_start_threads() now fails if it doesn't actually start any threads. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D7701
# 323447	11-Sep-2017	ian	MFC r320901-r320902, r320996-r320997, r321002, r321048, r321400, r321743, r321745 r320901: Protect access to the AT realtime clock with its own mutex. The mutex protecting access to the registered realtime clock should not be overloaded to protect access to the atrtc hardware, which might not even be the registered rtc. More importantly, the resettodr mutex needs to be eliminated to remove locking/sleeping restrictions on clock drivers, and that can't happen if MD code for amd64 depends on it. This change moves the protection into what's really being protected: access to the atrtc date and time registers. This change also adds protection when the clock is accessed from xentimer_settime(), which bypasses the resettodr locking. Differential Revision: https://reviews.freebsd.org/D11483 r320902: Support multiple realtime clocks, and remove locking/sleeping restrictions on clock drivers. This tracks multiple concurrent realtime clock drivers in a list sorted by clock resolution. When system time changes (and periodically) the clock_settime() methods of all registered clocks are invoked. To initialize system time, each driver is tried in turn from best to worst resolution, until one successfully returns a valid time. The code no longer holds a mutex while calling the clock_settime() and clock_gettime() methods of the registered clocks. This allows clock drivers to do whatever kind of locking or sleeping is necessary (this is especially important for i2c clock chips since i2c drivers often need to sleep). A new clock_register_flags() function allows the clock driver to pass flags. The flags currently defined help support drivers that use their own techniques to avoid roundoff errors (prevents the 4/5 rounding done by the subr_rtc code).
A driver which may need to wait for resources (such as bus ownership) may pass a flag to indicate that it will obtain system time for itself after waiting for resources; this is merely an optimization to avoid the common code retrieving a timespec that will never get used. Relnotes: yes Differential Revision: https://reviews.freebsd.org/D11484 r320996: Allow setting debug.clocktime as a tunable. Print 64-bit time_t correctly on 32-bit systems. r320997: Minor optimization: instead of converting between days and years using loops that start in 1970, assume most conversions are going to be for recent dates and use a precomputed number of days through the end of 2016. r321002: Revert r320997. There are reports of it getting the wrong results, so clearly my testing was insufficient, and it's best to just revert it until I get it straightened out. r321048: Minor optimization: instead of converting between days and years using loops that start in 1970, assume most conversions are going to be for recent dates and use a precomputed number of days through the end of 2016. This is a do-over of r320997, hopefully this time with 100% more workiness. The first attempt had an off-by-one error, but instead of just adding another mysterious +1 adjustment, this rearranges the relationship between recent_base_year and recent_base_days so that the latter is the number of days that occurred before the start of the associated year (instead of the count thru the end of that year). This makes the recent_base stuff work more like the original loop logic that didn't need any +1 adjustments. r321400: Add common code to support realtime clocks that store year without century. Most realtime clocks store the year as 2 BCD digits. Some add a century bit to extend the range another hundred years. Every clock driver has its own code to determine the century and pass a full year value to clock_ct_to_ts().
Now clock drivers can just convert BCD to bin and store the result in the clocktime struct and let the common code figure out the century. Clocks with a century bit can just add 100 to year if the century bit is on. r321743: Add taskqueue_enqueue_timeout_sbt(), because sometimes you want more control over the scheduling precision than 'ticks' can offer, and because sometimes you're already working with sbintime_t units and it's dumb to convert them to ticks just so they can get converted back to sbintime_t under the hood. r321745: Add clock_schedule(), a feature that allows realtime clock drivers to request that their clock_settime() methods be called at a given offset from top-of-second. This adds a timeout_task to the rtc_instance so that each clock can be separately added to taskqueue_thread with the scheduling it prefers, instead of looping through all the clocks at once with a single task on taskqueue_thread. If a driver doesn't call clock_schedule() the default is the old behavior: clock_settime() is queued immediately.
# 315267	14-Mar-2017	hselasky	MFC r314553: Implement taskqueue_poll_is_busy() for use by the LinuxKPI. Refer to comment above function for a detailed description. Discussed with: kib @ Sponsored by: Mellanox Technologies
# 306946	10-Oct-2016	hselasky	MFC r306441 and r306634: While draining a timeout task prevent the taskqueue_enqueue_timeout() function from restarting the timer. Commonly taskqueue_enqueue_timeout() is called from within the task function itself without any checks for teardown. Then it can happen that the timer stays active after the return of taskqueue_drain_timeout(), because the timeout and task are drained separately. This patch factors out the teardown flag into the timeout task itself, allowing existing code to stay as-is instead of applying a teardown flag to each and every one of the timeout task consumers. Add assert to taskqueue_drain_timeout() which prevents parallel execution on the same timeout task. Update manual page documenting the return value of taskqueue_enqueue_timeout(). Differential Revision: https://reviews.freebsd.org/D8012 Reviewed by: kib, trasz
# 304704	23-Aug-2016	shurd	MFC r304021: Update iflib to support more NIC designs - Move group task queue into kern/subr_gtaskqueue.c - Change intr_enable to return an int so it can be detected if it's not implemented - Allow different TX/RX queues per set to be different sizes - Don't split up TX mbufs before transmit - Allow a completion queue for TX as well as RX - Pass the RX budget to isc_rxd_available() to allow an earlier return and avoid multiple calls Approved by: sbruno
# 302408	07-Jul-2016	gjb	Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here. Additional commits post-branch will follow. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
# 302372	06-Jul-2016	nwhitehorn	Replace a number of conflations of mp_ncpus and mp_maxid with either mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try, for example, to run tasks on CPUs that did not exist or to allocate too few buffers on systems with sparse CPU IDs in which there are holes in the range and mp_maxid > mp_ncpus. Such circumstances generally occur on systems with SMT, but on which SMT is disabled. This patch restores system operation at least on POWER8 systems configured in this way. There are a number of other places in the kernel with potential problems in these situations, but where sparse CPU IDs are not currently known to occur, mostly in the ARM machine-dependent code. These will be fixed in a follow-up commit after the stable/11 branch. PR: kern/210106 Reviewed by: jhb Approved by: re (glebius)
# 301208	02-Jun-2016	mjg	taskqueue: plug a leak in _taskqueue_create While here make some style fixes and postpone the sprintf so that it is only done when the function can no longer fail. CID: 1356041
# 300372	21-May-2016	avg	fix loss of taskqueue wakeups (introduced in r300113) Submitted by: kmacy Tested by: dchagin
# 300219	19-May-2016	scottl	Adjust the creation of tq_name so it can be freed correctly Reviewed by: jhb, allanjude Differential Revision: D6454
# 300113	18-May-2016	scottl	Import the 'iflib' API library for network drivers. From the author: "iflib is a library to eliminate the need for frequently duplicated device independent logic propagated (poorly) across many network drivers." Participation is purely optional. The IFLIB kernel config option is provided for drivers that want to transition between legacy and iflib modes of operation. ixl and ixgbe driver conversions will be committed shortly. We hope to see participation from the Broadcom and maybe Chelsio drivers in the near future. Submitted by: mmacy@nextbsd.org Reviewed by: gallatin Differential Revision: D5211
# 296272	01-Mar-2016	jhb	Remove taskqueue_enqueue_fast(). taskqueue_enqueue() was changed to support both fast and non-fast taskqueues 10 years ago in r154167. It has been a compat shim ever since. It's time for the compat shim to go. Submitted by: Howard Su <howard0su@gmail.com> Reviewed by: sephe Differential Revision: https://reviews.freebsd.org/D5131
# 290805	13-Nov-2015	rrs	This fixes several places where callout_stop's return value is examined. The new return codes of -1 were mistakenly being considered "true". Callout_stop now returns -1 to indicate the callout had either already completed or was not running and 0 to indicate it could not be stopped. Also update the manual page to make it more consistent no non-zero in the callout_stop or callout_reset descriptions. MFC after: 1 Month with associated callout change.
# 283551	25-May-2015	delphij	MFuser/delphij/zfs-arc-rebase@r281754: In r256613, taskqueue_enqueue_locked() was modified to release the task queue lock before returning. In r276665, taskqueue_drain_all() calls taskqueue_enqueue_locked() to insert the barrier task into the queue, but does not reacquire the lock afterwards, although later code expects the lock to still be held (e.g. TQ_SLEEP()). The barrier task is special and if we release then reacquire the lock, there would be a small race window where a high priority task could sneak into the queue. Looking more closely, the race seems to be tolerable but is undesirable from a semantics standpoint. To solve this, in taskqueue_drain_tq_queue(), instead of directly calling taskqueue_enqueue_locked(), insert the barrier task directly without releasing the lock.
# 279300	25-Feb-2015	adrian	Remove taskqueue_start_threads_pinned(); there's now a generic cpuset version of this. Sponsored by: Norse Corp, Inc.
# 278879	17-Feb-2015	adrian	Implement taskqueue_start_threads_cpuset(). This is a more generic version of taskqueue_start_threads_pinned() which only supports a single cpuid. This originally came from John Baldwin <jhb@> who implemented it as part of a push towards NUMA awareness in drivers. I started implementing something similar for RSS and NUMA, then found he already did it. I'd like to axe taskqueue_start_threads_pinned() so it doesn't become part of a longer-term API. (Read: hps@ wants to MFC things, and if I don't do this soon, he'll MFC what's here. :-) I have a follow-up commit which converts the intel drivers over to using the cpuset version of this function, so we can eventually nuke the pinned version. Tested: * igb, ixgbe Obtained from: jhbbsd
# 276665	04-Jan-2015	gibbs	Prevent live-lock and access of destroyed data in taskqueue_drain_all(). Phabric: https://reviews.freebsd.org/D1247 Reviewed by: jhb, avg Sponsored by: Spectra Logic Corporation sys/kern/subr_taskqueue.c: Modify taskqueue_drain_all() processing to use a temporary "barrier task", rather than rely on a user task that may be destroyed during taskqueue_drain_all()'s execution. The barrier task is queued behind all previously queued tasks and then has its priority elevated so that future tasks cannot pass it in the queue. Use a similar barrier scheme to drain threads processing current tasks. This requires taskqueue_run_locked() to insert and remove the taskqueue_busy object for the running thread for every task processed. share/man/man9/taskqueue.9: Remove warning about live-lock issues with taskqueue_drain_all() and indicate that it does not wait for tasks queued after it begins processing.
# 275345	30-Nov-2014	gibbs	Remove trailing whitespace.
# 269666	07-Aug-2014	ae	Temporary revert r269661, it looks like the patch isn't complete.
# 269661	07-Aug-2014	ae	Use cpuset_setithread() to apply cpu mask to taskq threads. Sponsored by: Yandex LLC
# 266939	01-Jun-2014	adrian	Pin the right thread. This _was_ right, a last minute suggestion and not enough testing makes Adrian a bad boy. Tested: * igb(4) with RSS patches, by hand verifying each igb(4) taskqueue tid from procstat -ka using cpuset -g -t <tid>.
# 266629	24-May-2014	adrian	Add a new taskqueue setup method that takes a cpuid to pin the taskqueue worker thread(s) to. For now it isn't a taskqueue/taskthread error to fail to pin to the given cpuid. Thanks to rpaulo@, kib@ and jhb@ for feedback. Tested: * igb(4), with local RSS patches to pin taskqueues. TODO: * ask the doc team for help in documenting the new API call. * add a taskqueue_start_threads_cpuset() method which takes a cpuset_t - but this may require a bunch of surgery to bring cpuset_t into scope.
# 258713	28-Nov-2013	avg	add taskqueue_drain_all This API has semantics similar to that of taskqueue_drain but acts on all tasks that might be queued or running on a taskqueue. A caller must ensure that no new tasks are being enqueued otherwise this call would be totally meaningless. For example, if the tasks are enqueued by an interrupt filter then its interrupt must be disabled. MFC after: 10 days
# 258354	19-Nov-2013	avg	taskqueue_cancel: garbage collect a write-only variable MFC after: 3 days
# 256862	21-Oct-2013	mav	Add comments that taskqueue_enqueue_locked() returns without the lock.
# 256730	18-Oct-2013	glebius	Revert r256587. Requested by: zec
# 256613	16-Oct-2013	mav	MFprojects/camlock r254763: Move tq_enqueue() call out of the queue lock for known handlers (actually I have found no others in the base system). This reduces queue lock hold time and congestion spinning under active multithreaded enqueuing.
# 256612	16-Oct-2013	mav	MFprojects/camlock r254685: Remove TQ_FLAGS_PENDING flag, softly duplicating queue emptiness status.
# 256587	16-Oct-2013	glebius	For VIMAGE kernels store vnet in the struct task, and set vnet context during task processing. Reported & tested by: mm
# 254787	24-Aug-2013	mav	MFprojects/camlock r254460: Remove locking from taskqueue_member(). The list of threads is static during the taskqueue life cycle, so there is no need to protect it, taking quite congested lock several more times for each ZFS I/O.
# 248649	23-Mar-2013	will	Extend taskqueue(9) to enable per-taskqueue callbacks. The scope of these callbacks is primarily to support actions that affect the taskqueue's thread environments. They are entirely optional, and consequently are introduced as a new API: taskqueue_set_callback(). This interface allows the caller to specify that a taskqueue requires a callback and optional context pointer for a given callback type. The callback types included in this commit can be used to register a constructor and destructor for thread-local storage using osd(9). This allows a particular taskqueue to define that its threads require a specific type of TLS, without the need for a specially-orchestrated task-based mechanism for startup and shutdown in order to accomplish it. Two callback types are supported at this point: - TASKQUEUE_CALLBACK_TYPE_INIT, called by every thread when it starts, prior to processing any tasks. - TASKQUEUE_CALLBACK_TYPE_SHUTDOWN, called by every thread when it exits, after it has processed its last task but before the taskqueue is reclaimed. While I'm here: - Add two new macros, TQ_ASSERT_LOCKED and TQ_ASSERT_UNLOCKED, and use them in appropriate locations. - Fix taskqueue.9 to mention taskqueue_start_threads(), which is a required interface for all consumers of taskqueue(9). Reviewed by: kib (all), eadler (taskqueue.9), brd (taskqueue.9) Approved by: ken (mentor) Sponsored by: Spectra Logic MFC after: 1 month
# 243341	20-Nov-2012	kib	Add a special meaning to the negative ticks argument for taskqueue_enqueue_timeout(). Do not rearm the callout if it is already armed and the ticks is negative. Otherwise rearm it to fire in abs(ticks) ticks in the future. The intended use is to call taskqueue_enqueue_timeout() for the given timeout_task with the same negative ticks argument. As result, the task is scheduled to execute not further than abs(ticks) ticks in future, and the consequent enqueues are coalesced until the already scheduled task is finished. Reviewed by: rwatson Tested by: Markus Gebert <markus.gebert@hostpoint.ch> MFC after: 2 weeks
# 239779	28-Aug-2012	jhb	Shorten the name of the fast SWI taskqueue to "fast taskq" so that it fits. Reported by: lev MFC after: 1 week
# 225570	15-Sep-2011	adrian	Ensure that ta_pending doesn't overflow u_short by capping its value at USHRT_MAX. If it overflows before the taskqueue can run, the task will be re-added to the taskqueue and cause a loop in the task list. Reported by: Arnaud Lacombe <lacombar@gmail.com> Submitted by: Ryan Stone <rysto32@gmail.com> Reviewed by: jhb Approved by: re (kib) MFC after: 1 day
# 221059	26-Apr-2011	kib	Implement the delayed task execution extension to the taskqueue mechanism. The caller may specify a timeout in ticks after which the task will be scheduled. Sponsored by: The FreeBSD Foundation Reviewed by: jeff, jhb MFC after: 1 month
# 215750	23-Nov-2010	avg	taskqueue: drop unused tq_name field tq_name was used write-only and besides it was just a pointer, so it could point to some garbage in a temporary buffer that's gone. This change shouldn't change KPI/KBI as struct taskqueue is private to subr_taskqueue.c. If we find a need for tq_name it can be resurrected at any moment. taskqueue_create() interface is preserved for this purpose. Suggested by: jhb MFC after: 10 days
# 215021	08-Nov-2010	jmallett	Use macros rather than inline functions to lock and unlock mutexes, so that line number information is preserved in witness. Reviewed by: jhb
# 215011	08-Nov-2010	mdf	Add a taskqueue_cancel(9) to cancel a pending task without waiting for it to run as taskqueue_drain(9) does. Requested by: hselasky Original code: jeff Reviewed by: jhb MFC after: 2 weeks
# 213813	13-Oct-2010	mdf	Use a safer mechanism for determining if a task is currently running, that does not rely on the lifetime of pointers being the same. This also restores the task KBI. Suggested by: jhb MFC after: 1 month
# 213739	12-Oct-2010	mdf	Re-expose and briefly document taskqueue_run(9). The function is used in at least one 3rd party driver. Requested by: jhb
# 211928	28-Aug-2010	pjd	Run all tasks from a proper context, with proper priority, etc. Reviewed by: jhb MFC after: 1 month
# 211284	13-Aug-2010	pjd	Simplify taskqueue_drain() by using provided macros.
# 210380	22-Jul-2010	mdf	Remove unused variable that snuck in during development. Approved by: zml (mentor)
# 210377	22-Jul-2010	mdf	Fix taskqueue_drain(9) to not have false negatives. For threaded taskqueues, more than one task can be running simultaneously. Also make taskqueue_run(9) static to the file, since there are no consumers in the base kernel and the function signature needs to change with this fix. Remove mention of taskqueue_run(9) and taskqueue_run_fast(9) from the taskqueue(9) man page. Reviewed by: jhb Approved by: zml (mentor)
# 209062	11-Jun-2010	avg	fix a few cases where a string is passed via format argument instead of via %s Most of the cases looked harmless, but this is done for the sake of correctness. In one case it even allowed to drop an intermediate buffer. Found by: clang MFC after: 2 weeks
# 208715	01-Jun-2010	zml	Revert taskqueue(9) related commits until mdf@ is approved and can resolve issues. This reverts commits r207439, r208623, r208624
# 208624	28-May-2010	zml	Avoid a wakeup(9) if we can be sure no one is waiting on the task. Submitted by: Matthew Fleming <matthew.fleming@isilon.com> Reviewed by: zml, jhb
# 208623	28-May-2010	zml	Revert r207439 and solve the problem differently. The task handler ta_func may free the task structure, so no references to its members are valid after the handler has been called. Using a per-queue member and having waits longer than strictly necessary was suggested by jhb. Submitted by: Matthew Fleming <matthew.fleming@isilon.com> Reviewed by: zml, jhb
# 207439	30-Apr-2010	zml	Handle taskqueue_drain(9) correctly on a threaded taskqueue: taskqueue_drain(9) will not correctly detect whether a task is currently running. The check is against a field in the taskqueue struct, but for a threaded queue with more than one thread, multiple threads can simultaneously be running a task, thus stomping over the tq_running field. Submitted by: Matthew Fleming <matthew.fleming@isilon.com> Reviewed by: jhb Approved by: dfr (mentor)
# 198411	23-Oct-2009	jhb	- Fix several off-by-one errors when using MAXCOMLEN. The p_comm[] and td_name[] arrays are actually MAXCOMLEN + 1 in size and a few places that created shadow copies of these arrays were just using MAXCOMLEN. - Prefer using sizeof() of an array type to explicit constants for the array length in a few places. - Ensure that all of p_comm[] and td_name[] is always zero'd during execve() to guard against any possible information leaks. Previously trailing garbage in p_comm[] could be leaked to userland in ktrace record headers via td_name[]. Reviewed by: bde
# 196358	18-Aug-2009	pjd	Remove unused taskqueue_find() function. Reviewed by: dfr Approved by: re (kib)
# 196295	17-Aug-2009	pjd	Remove OpenSolaris taskq port (it performs very poorly in our kernel) and replace it with wrappers around our taskqueue(9). To make it possible, implement the taskqueue_member() function, which returns 1 if the given thread was created by the given taskqueue. Approved by: re (kib)
# 196293	17-Aug-2009	pjd	Because taskqueue_run() can drop tq_mutex, we need to check if the TQ_FLAGS_ACTIVE flag wasn't removed in the meantime, which means we missed a wakeup. Approved by: re (kib)
# 188592	13-Feb-2009	thompsa	Remove semicolon left in the last commit Spotted by: csjp
# 188548	12-Feb-2009	thompsa	Check the exit flag at the start of the taskqueue loop rather than the end. It is possible to tear down the taskqueue before the thread has run and the taskqueue loop would sleep forever. Reviewed by: sam MFC after: 1 week
# 188058	03-Feb-2009	imp	Use NULL in preference to 0 for pointers.
# 180588	18-Jul-2008	kmacy	revert local change
# 180583	18-Jul-2008	kmacy	import vendor fixes to cxgb
# 178123	11-Apr-2008	jhb	Use kthread_exit() to terminate a taskqueue thread rather than kproc_exit() now that the taskqueue threads are kthreads rather than kprocs. Reported by: kris
# 178015	08-Apr-2008	sam	change taskqueue_start_threads to create threads instead of proc's Reviewed by: jhb
# 177621	25-Mar-2008	scottl	Implement taskqueue_block() and taskqueue_unblock(). These functions allow the owner of a queue to block and unblock execution of the tasks in the queue while allowing tasks to continue to be added to the queue. Combining this with taskqueue_drain() allows a queue to be safely disabled. The unblock function may run (or schedule to run) the queue when it is called, just as calling taskqueue_enqueue() would. Reviewed by: jhb, sam
# 172836	20-Oct-2007	julian	Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
# 170307	04-Jun-2007	jeff	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
# 166188	23-Jan-2007	jeff	- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.
# 158904	24-May-2006	sam	When starting up threads in taskqueue_start_threads create them stopped before adjusting their priority and setting them on the run q so they cannot race for resources (pointed out by njl). While here add a console printf when thread creation fails; otherwise no one may notice (e.g. return value is always 0 and caller has no way to verify). Reviewed by: jhb, scottl MFC after: 2 weeks
# 157815	17-Apr-2006	jhb	Change msleep() and tsleep() to not alter the calling thread's priority if the specified priority is zero. This avoids a race where the calling thread could read a snapshot of its current priority, then a different thread could change the first thread's priority, then the original thread would call sched_prio() inside msleep() undoing the change made by the second thread. I used a priority of zero as no thread that calls msleep() or tsleep() should be specifying a priority of zero anyway. The various places that passed 'curthread->td_priority' or some variant as the priority now pass 0.
# 157314	30-Mar-2006	sam	fixup error handling in taskqueue_start_threads: check for kthread_create failing, print a message when we fail for some reason as most callers do not check the return value (e.g. 'cuz they're called from SYSINIT) Reviewed by: scottl MFC after: 1 week
# 154333	13-Jan-2006	scottl	Add the following to the taskqueue api: taskqueue_start_threads(struct taskqueue **, int count, int pri, const char *name, ...); This allows the creation of 1 or more threads that will service a single taskqueue. Also rework the taskqueue_create() API to remove the API change that was introduced a while back. Creating a taskqueue doesn't rely on the presence of a process structure, and the proc mechanics are much better encapsulated in taskqueue_start_threads(). Also clean up the taskqueue_terminate() and taskqueue_free() functions to safely drain pending tasks and remove all associated threads. The TASKQUEUE_DEFINE and TASKQUEUE_DEFINE_THREAD macros have been changed to use the new API, but drivers compiled against the old definitions will still work. Thus, recompiling drivers is not a strict requirement.
# 154205	10-Jan-2006	scottl	The interlock in taskqueue_terminate() is completely wrong for taskqueues that use spinlocks. Remove it for now.
# 154167	10-Jan-2006	scottl	Add functions and macros and refactor code to make it easier to manage fast taskqueues. The following have been added: TASKQUEUE_FAST_DEFINE() - create a global task queue. an arbitrary execution context. TASKQUEUE_FAST_DEFINE_THREAD() - create a global taskqueue that uses a dedicated kthread. taskqueue_create_fast() - create a local/private taskqueue. These are all complementary to the standard taskqueue functions. They are primarily useful for fast interrupt handlers that can only use spinlocks for synchronization. I personally think that the taskqueue API is starting to get too narrow and hairy, but fixing it will require a major redesign of the API. Such a redesign would be good but would break compatibility with FreeBSD 6.x, so it really isn't desirable at this time. Submitted by: sam
# 153676	23-Dec-2005	scottl	Create the taskqueue_fast handler with INTR_MPSAFE so that it doesn't run with Giant. MFC After: 3 days
# 151656	25-Oct-2005	jhb	Use shorter names for the Giant and fast taskqueues so that their names actually fit.
# 151624	24-Oct-2005	jhb	Revert previous change to this file. I accidentally committed while fixing spelling in a comment.
# 151623	24-Oct-2005	jhb	Spell hierarchy correctly in comments. Submitted by: Wojciech A. Koszek dunstan at freebsd dot czest dot pl
# 145729	30-Apr-2005	sam	o enable shutdown of taskqueue threads; the thread servicing the queue checks a new entry in the taskqueue struct each time it wakes up to see if it should terminate o adjust TASKQUEUE_DEFINE_THREAD & co. to record the thread/proc identity for the shutdown rendezvous o replace wakeup after adding a task to a queue with wakeup_one; this helps queues where multiple threads are used to service tasks (e.g. acpi) o remove NULL check of tq_enqueue method; it should never be NULL Reviewed by: dfr, njl
# 145473	24-Apr-2005	sam	o eliminate modification of task structures after they are run to avoid modify-after-free races when the task structure is malloc'd o shrink task structure by removing ta_flags (no longer needed with avoid fix) and combining ta_pending and ta_priority Reviewed by: dwhite, dfr MFC after: 4 days
# 136131	05-Oct-2004	imp	Add taskqueue_drain. This waits for the specified task to finish, if running, or returns. The calling program is responsible for making sure that nothing new is enqueued. # man page coming soon.
# 133305	08-Aug-2004	jmg	Rearrange some code that handles the thread taskqueue so that it is more generic. Introduce a new define TASKQUEUE_DEFINE_THREAD that takes a single arg, which is the name of the queue. Document these changes.
# 131246	28-Jun-2004	jhb	- Execute all of the tasks on the taskqueue during taskqueue_free() after the queue has been removed from the global taskqueue_queues list. This removes the need for the draining queue hack. - Allow taskqueue_run() to be called with the taskqueue mutex held. It can still be called without the lock for API compatibility. In that case it will acquire the lock internally. - Don't lock the individual queue mutex in taskqueue_find() until after the strcmp as the global queues mutex is sufficient for the strcmp. - Simplify taskqueue_thread_loop() now that it can hold the lock across taskqueue_run(). Submitted by: bde (mostly)
# 126027	19-Feb-2004	jhb	Tidy up the thread taskqueue implementation and close a lost wakeup race. Instead of creating a mutex that we msleep on but don't actually lock when doing the corresponding wakeup(), in the kthread, lock the mutex associated with our taskqueue and msleep while the queue is empty. Assert that the queue is locked when the callback function is called to wake the kthread.
# 123614	17-Dec-2003	jhb	Various style fixes. Submitted by: bde (mostly, if not all)
# 122436	10-Nov-2003	alfred	Fix a bug where the taskqueue kproc was being parented by init because RFNOWAIT was being passed to kproc_create. The result was that shutdown took quite a bit longer because this errant "child" would not respond to termination signals from init at system shutdown. RFNOWAIT disassociates itself from the caller by attaching to init as a parent proc. We could have had the taskqueue proc listen for SIGKILL, but being able to SIGKILL a potentially critical system process doesn't seem like a good idea.
# 119812	06-Sep-2003	sam	correct fast swi taskqueue spinlock name to be different from the sleep lock Submitted by: Tor Egge <Tor.Egge@cvsup.no.freebsd.org>
# 119789	05-Sep-2003	sam	"fast swi" taskqueue support. This is a taskqueue that uses spinlocks making it useful for dispatching swi tasks from fast interrupt handlers. Sponsored by: FreeBSD Foundation
# 119708	03-Sep-2003	ken	Move dynamic sysctl(8) variable creation for the cd(4) and da(4) drivers out of cdregister() and daregister(), which are run from interrupt context. The sysctl code does blocking mallocs (M_WAITOK), which causes problems if malloc(9) actually needs to sleep. The eventual fix for this issue will involve moving the CAM probe process inside a kernel thread. For now, though, I have fixed the issue by moving dynamic sysctl variable creation for these two drivers to a task queue running in a kernel thread. The existing task queues (taskqueue_swi and taskqueue_swi_giant) run in software interrupt handlers, which wouldn't fix the problem at hand. So I have created a new task queue, taskqueue_thread, that runs inside a kernel thread. (It also runs outside of Giant -- clients must explicitly acquire and release Giant in their taskqueue functions.) scsi_cd.c: Remove sysctl variable creation code from cdregister(), and move it to a new function, cdsysctlinit(). Queue cdsysctlinit() to the taskqueue_thread taskqueue once we have fully registered the cd(4) driver instance. scsi_da.c: Remove sysctl variable creation code from daregister(), and move it to a new function, dasysctlinit(). Queue dasysctlinit() to the taskqueue_thread taskqueue once we have fully registered the da(4) instance. taskqueue.h: Declare the new taskqueue_thread taskqueue, update some comments. subr_taskqueue.c: Create the new kernel thread taskqueue. This taskqueue runs outside of Giant, so any functions queued to it would need to explicitly acquire/release Giant if they need it. cd.4: Update the cd(4) man page to talk about the minimum command size sysctl/loader tunable. Also note that the changer variables are available as loader tunables as well. da.4: Update the da(4) man page to cover the retry_count, default_timeout and minimum_cmd_size sysctl variables/loader tunables. Remove references to /dev/r???, they aren't used any longer. 
cd.9: Update the cd(9) man page to describe the CD_Q_10_BYTE_ONLY quirk. taskqueue.9: Update the taskqueue(9) man page to describe the new thread task queue, and the taskqueue_swi_giant queue. MFC after: 3 days
# 116182	10-Jun-2003	obrien	Use __FBSDID().
# 111528	26-Feb-2003	scottl	Introduce a new taskqueue that runs completely free of Giant, and in turns runs its tasks free of Giant too. It is intended that as drivers become locked down, they will move out of the old, Giant-bound taskqueue and into this new one. The old taskqueue has been renamed to taskqueue_swi_giant, and the new one keeps the name taskqueue_swi.
# 101154	01-Aug-2002	jhb	Forced commit to note that the previous log was incorrect. The previous commit added an assertion that a taskqueue being free'd wasn't being drained at the same time.
# 101153	01-Aug-2002	jhb	If we fail to write to a vnode during a ktrace write, then we drop all other references to that vnode as a trace vnode in other processes as well as in any pending requests on the todo list. Thus, it is possible for a ktrace request structure to have a NULL ktr_vp when it is destroyed in ktr_freerequest(). We shouldn't call vrele() on the vnode in that case. Reported by: bde
# 93818	04-Apr-2002	jhb	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64
# 88900	05-Jan-2002	jhb	Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed: The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex release and swi schedule, respectively, when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.) I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly. Reviewed by: peter Tested on: i386, alpha
# 85560	26-Oct-2001	jhb	- Change the taskqueue locking to protect the necessary parts of a task while it is on a queue with the queue lock and remove the per-task locks. - Remove TASK_DESTROY now that it is no longer needed. - Go back to inlining TASK_INIT now that it is short again. Inspired by: dfr
# 85521	26-Oct-2001	jhb	Add locking to taskqueues. There is one mutex per task, one mutex per queue, and a mutex to protect the global list of taskqueues. The only visible change is that a TASK_DESTROY() macro has been added to mirror the TASK_INIT() macro to destroy a task before it is free'd. Submitted by: Andrew Reiter <awr@watson.org>
# 76666	16-May-2001	alfred	remove include of ipl.h because it no longer exists
# 72238	09-Feb-2001	jhb	- Catch up to the new swi API changes: - Use swi_* function names. - Use void * to hold cookies to handlers instead of struct intrhand *. - In sio.c, use 'driver_name' instead of "sio" as the name of the driver lock to minimize diffs with cy(4).
# 69774	08-Dec-2000	phk	Staticize some malloc M_ instances.
# 67551	25-Oct-2000	jhb	- Overhaul the software interrupt code to use interrupt threads for each type of software interrupt. Roughly, what used to be a bit in spending now maps to a swi thread. Each thread can have multiple handlers, just like a hardware interrupt thread. - Instead of using a bitmask of pending interrupts, we schedule the specific software interrupt thread to run, so spending, NSWI, and the shandlers array are no longer needed. We can now have an arbitrary number of software interrupt threads. When you register a software interrupt thread via sinthand_add(), you get back a struct intrhand that you pass to sched_swi() when you wish to schedule your swi thread to run. - Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit more intuitive. Also, prefix all the members of struct intrhand with 'ih_'. - Make swi_net() a MI function since there is now no point in it being MD. Submitted by: cp
# 66698	05-Oct-2000	jhb	- Heavyweight interrupt threads on the alpha for device I/O interrupts. - Make softinterrupts (SWI's) almost completely MI, and divorce them completely from the x86 hardware interrupt code. - The ihandlers array is now gone. Instead, there is a MI shandlers array that just contains SWI handlers. - Most of the former machine/ipl.h files have moved to a new sys/ipl.h. - Stub out all the spl*() functions on all architectures. Submitted by: dfr
# 65822	13-Sep-2000	jhb	- Remove the inthand2_t type and use the equivalent driver_intr_t type from newbus for referencing device interrupt handlers. - Move the 'struct intrec' type which describes interrupt sources into sys/interrupt.h instead of making it just be a x86 structure. - Don't create 'ithd' and 'intrec' typedefs, instead, just use 'struct ithd' and 'struct intrec' - Move the code to translate new-bus interrupt flags into an interrupt thread priority out of the x86 nexus code and into a MI ithread_priority() function in sys/kern/kern_intr.c. - Remove now-unneeded x86-specific headers from sys/dev/ata/ata-all.c and sys/pci/pci_compat.c.
# 64199	03-Aug-2000	hsu	Modify to use fixed STAILQ_LAST(). Reviewed by: dfr
# 61033	28-May-2000	dfr	Add taskqueue system for easy-to-use SWIs among other things. Reviewed by: arch