#
e4b7bbd6 |
|
13-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
lio_listio(2): add LIO_FOFFSET flag to ignore aiocb aio_offset and use the current file offset instead. Requested by: Vinícius dos Santos Oliveira <vini.ipsmaker@gmail.com> Reviewed by: jhb Discussed with: asomers Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43448
|
#
61cc4830 |
|
18-Jan-2024 |
Alfredo Mazzinghi <am2419@cl.cam.ac.uk> |
Abstract UIO allocation and deallocation. Introduce the allocuio() and freeuio() functions to allocate and deallocate struct uio. This hides the actual allocator interface, so it is easier to modify the sub-allocation layout of struct uio and the corresponding iovec array. Obtained from: CheriBSD Reviewed by: kib, markj MFC after: 2 weeks Sponsored by: CHaOS, EPSRC grant EP/V000292/1 Differential Revision: https://reviews.freebsd.org/D43711
|
#
b068bb09 |
|
07-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
Add vnode_pager_clean_{a,}sync(9) Bump __FreeBSD_version for ZFS use. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43356
|
#
fdafd315 |
|
24-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Automated cleanup of cdefs and other formatting Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/ Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/ Remove /\n+#if.*\n#endif.*\n+/ Remove /^#if.*\n#endif.*\n/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/ Sponsored by: Netflix
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
98844e99 |
|
15-Feb-2023 |
John Baldwin <jhb@FreeBSD.org> |
aio: Fix more synchronization issues in aio_biowakeup. - Use atomic_store to set job->error. atomic_set does an or operation, not assignment. - Use refcount_* to manage job->nbio. This ensures proper memory barriers are present so that the last bio won't see a possibly stale value of job->error. - Don't re-read job->error after reading it via atomic_load. Reported by: markj (1) Reviewed by: mjg, markj Differential Revision: https://reviews.freebsd.org/D38611
|
#
cca6d616 |
|
15-Feb-2023 |
John Baldwin <jhb@FreeBSD.org> |
aio_biowakeup: Various style fixes.
|
#
40734fc5 |
|
15-Feb-2023 |
Keith Reynolds <keith.reynolds@hpe.com> |
aio: Fix a test and set race in aio_biowakeup. Use atomic_fetchadd in place of separate atomic_subtract / atomic_load. Reviewed by: markj Sponsored by: HPE TidalScale Differential Revision: https://reviews.freebsd.org/D38559
|
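A minimal sketch of why a single fetch-and-add closes this kind of race, written with C11 `<stdatomic.h>` instead of the kernel's atomic(9) API. The `struct job` layout and both function names are hypothetical and only stand in for the real bookkeeping in aio_biowakeup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical job that tracks how many bios are still outstanding. */
struct job {
	atomic_uint nbio;
};

/*
 * Racy pattern: the decrement and the later load are two separate
 * operations, so two completing threads can both observe zero (or the
 * job can be freed between the two steps).
 */
bool
bio_done_racy(struct job *j)
{
	atomic_fetch_sub(&j->nbio, 1);
	return (atomic_load(&j->nbio) == 0);
}

/*
 * Fixed pattern: fetch-and-subtract returns the value seen before the
 * decrement, so exactly one caller sees 1 and becomes the last one.
 */
bool
bio_done(struct job *j)
{
	unsigned int old = atomic_fetch_sub(&j->nbio, 1);

	assert(old > 0);	/* a completion without an outstanding bio is a bug */
	return (old == 1);
}
```

With two outstanding bios, the first call to `bio_done()` returns false and the second returns true, regardless of how the callers interleave.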
#
a75d1ddd |
|
17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce V_PCATCH to stop abusing PCATCH
|
#
9553bc89 |
|
19-Jun-2022 |
Mark Johnston <markj@FreeBSD.org> |
aio: Improve UMA usage - Remove the AIO proc zone. This zone gets one allocation per AIO daemon process, which isn't enough to warrant a dedicated zone. Plus, unlike other AIO structures, aiops are small (32 bytes with LP64), so UMA doesn't provide better space efficiency than malloc(9). Change one of the malloc types in vfs_aio.c to make it more general. - Don't set the NOFREE flag on the other AIO zones. This flag means that memory allocated to the AIO subsystem is never freed back to the VM, so it's always preferable to avoid using it when possible. NOFREE was set without explanation when AIO was converted to use UMA 20 years ago, but it does not appear to be required; all of the structures allocated from UMA (per-process kaioinfo, kaiocb, and aioliojob) keep track of references and get freed only when none exist. Plus, these structures will contain dangling pointers after they're freed (e.g., the "cred", "fd_file" and "uiop" fields of struct kaiocb), so use-after-frees are dangerous even when the structures themselves are type-stable. Reviewed by: asomers MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D35493
|
#
31d1b816 |
|
28-May-2022 |
Dmitry Chagin <dchagin@FreeBSD.org> |
sysent: Get rid of bogus sys/sysent.h include. Where appropriate hide sysent.h under proper condition. MFC after: 2 weeks
|
#
e9c7ec22 |
|
14-Nov-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
aio: whack "set but not used" warnings
|
#
45c2c7c4 |
|
23-Sep-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
aio_aqueue(): avoid ucred leak on failure path PR: 258698 Submitted by: sigsys@gmail.com MFC after: 1 week
|
#
2933a7ca |
|
19-Sep-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
aio_fsync_vnode: handle ERELOOKUP after VOP_FSYNC() Reported by: tmunro Reviewed by: jhb, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32023
|
#
922bee44 |
|
19-Sep-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
aio_fsync_vnode: use for(;;) loop instead of label Reviewed by: jhb, tmunro Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32023
|
#
2884918c |
|
10-Sep-2021 |
Mark Johnston <markj@FreeBSD.org> |
aio: Fix up the opcode in aiocb32_copyin() With lio_listio(2), the opcode is specified by userspace rather than being hard-coded by the system call (e.g., aio_readv() -> LIO_READV). kern_lio_listio() calls aio_aqueue() with an opcode of LIO_NOP, which gets fixed up when the aiocb is copied in. When copying in a job request for vectored I/O, we need to dynamically allocate a uio to wrap an iovec. So aiocb_copyin() needs to get the opcode from the aiocb and then decide whether an allocation is required. We failed to do this in the COMPAT_FREEBSD32 case. Fix it. Reported by: syzbot+27eab6f2c2162f2885ee@syzkaller.appspotmail.com Reviewed by: kib, asomers Fixes: f30a1ae8d529 ("lio_listio(2): Allow LIO_READV and LIO_WRITEV.") Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31914
|
#
f30a1ae8 |
|
22-Aug-2021 |
Thomas Munro <tmunro@FreeBSD.org> |
lio_listio(2): Allow LIO_READV and LIO_WRITEV. Allow multiple vector IOs to be started with one system call. aio_readv() and aio_writev() already used these opcodes under the covers. This commit makes them available to user space. Being non-standard extensions, they're only visible if __BSD_VISIBLE is defined, like the functions. Reviewed by: asomers, kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D31627
|
#
2e5f6152 |
|
15-Jul-2021 |
Mark Johnston <markj@FreeBSD.org> |
lio_listio: Don't post a completion notification if none was requested One is allowed to use LIO_NOWAIT without specifying a sigevent. In this case, lj->lioj_signal is left uninitialized, but several code paths examine lioj_signal.sigev_notify to figure out which notification to post. Unconditionally initialize that field to SIGEV_NONE. Add a dumb test case which triggers the bug. Reported by: KMSAN+syzkaller Reviewed by: asomers MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31197
|
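The fix described above amounts to giving the notification field a safe default before any optional caller-supplied value is copied in. A hedged sketch in portable POSIX C follows; `struct lio_job`, its `sigev` member, and `lio_job_init()` are hypothetical names, not the kernel's actual structure or code path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the per-list job that carries the sigevent. */
struct lio_job {
	struct sigevent sigev;
};

/*
 * Always start from SIGEV_NONE so that later code reading sigev_notify
 * never sees uninitialized memory; overwrite it only when the caller
 * actually supplied a sigevent.
 */
void
lio_job_init(struct lio_job *lj, const struct sigevent *sig)
{
	assert(lj != NULL);
	memset(&lj->sigev, 0, sizeof(lj->sigev));
	lj->sigev.sigev_notify = SIGEV_NONE;
	if (sig != NULL)
		lj->sigev = *sig;
}
```

Initializing unconditionally, rather than only in the "no sigevent" branch, keeps every later reader of `sigev_notify` safe even if new call paths are added.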
#
8d9ed174 |
|
17-Mar-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
open(2): Implement O_PATH Reviewed by: markj Tested by: pho Discussed with: walker.aj325_gmail.com, wulf Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323
|
#
2247f489 |
|
02-Jan-2021 |
Alan Somers <asomers@FreeBSD.org> |
aio: micro-optimize the lio_opcode assignments This allows slightly more efficient opcode testing in-kernel. It is transparent to userland, except to applications that sneakily submit aio fsync or aio mlock operations via lio_listio, which has never been documented, requires the use of deliberately undefined constants (LIO_SYNC and LIO_MLOCK), and is arguably a bug. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D27942
|
#
ff1a3078 |
|
09-Jan-2021 |
Alan Somers <asomers@FreeBSD.org> |
lio_listio: validate aio_lio_opcode Previously, we would accept any kind of LIO_* opcode, including ones that were intended for in-kernel use only like LIO_SYNC (which is not defined in userland). The situation became more serious with 022ca2fc7fe08d51f33a1d23a9be49e6d132914e. After that revision, setting aio_lio_opcode to LIO_WRITEV or LIO_READV would trigger an assertion. Note that POSIX does not specify what should happen if aio_lio_opcode is invalid. MFC-with: 022ca2fc7fe08d51f33a1d23a9be49e6d132914e Reviewed by: jhb, tmunro, 0mp Differential Revision: https://reviews.freebsd.org/D28078
|
#
801ac943 |
|
07-Jan-2021 |
Thomas Munro <tmunro@FreeBSD.org> |
aio_fsync(2): Support O_DSYNC. aio_fsync(O_DSYNC, ...) is the asynchronous version of fdatasync(2). Reviewed by: kib, asomers, jhb Differential Review: https://reviews.freebsd.org/D25071
|
#
022ca2fc |
|
02-Jan-2021 |
Alan Somers <asomers@FreeBSD.org> |
Add aio_writev and aio_readv POSIX AIO is great, but it lacks vectored I/O functions. This commit fixes that shortcoming by adding aio_writev and aio_readv. They aren't part of the standard, but they're an obvious extension. They work just like their synchronous equivalents pwritev and preadv. It isn't yet possible to use vectored aiocbs with lio_listio, but that could be added in the future. Reviewed by: jhb, kib, bcr Relnotes: yes Differential Revision: https://reviews.freebsd.org/D27743
|
#
01206038 |
|
21-Dec-2020 |
Alan Somers <asomers@FreeBSD.org> |
AIO: remove the kaiocb->bio linkage Vectored aio will require each aiocb to be associated with multiple bios, so we can't store a link to the latter from the former. But we don't really need to. aio_biowakeup already knows the bio it's using, and the other fields can be stored within the bio and/or buf itself. Also, remove the unused kaiocb.backend2 field. Reviewed By: kib Differential Revision: https://reviews.freebsd.org/D27682
|
#
6814c2da |
|
01-Dec-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
lio_listio(2): send signal even if number of jobs is zero. Right now, if lio registered zero jobs, the syscall frees the lio job structure, cleaning up the queued ksi. As a result, the realtime signal is dequeued and never delivered. Fix it by allowing sendsig() to copy ksi when job count is zero. PR: 220398 Reported and reviewed by: asomers Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27421
|
#
29331656 |
|
01-Dec-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
vfs_aio.c: style. Mostly re-wrap conditions to split after binary ops. Reviewed by: asomers Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27421
|
#
5c5005ec |
|
01-Dec-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
vfs_aio.c: correct comment. Reviewed by: asomers Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D27421
|
#
a9d4fe97 |
|
29-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
bio aio: Destroy ephemeral mapping before unwiring page. Apparently some architectures, like ppc in its hashed page tables variants, account mappings by pmap_qenter() in the response from pmap_is_page_mapped(). While there, eliminate useless userp variable. Noted and reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27409
|
#
cd853791 |
|
27-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Make MAXPHYS tunable. Bump MAXPHYS to 1M. Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allow several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initialize maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embeded structures, get private MAXPHYS-like constant; their convertion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, where either submitted by, or based on changes by mav. Suggested by: mav (*) Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225
|
#
f7db0c95 |
|
04-Nov-2020 |
Mark Johnston <markj@FreeBSD.org> |
vmspace: Convert to refcount(9) This is mostly mechanical except for vmspace_exit(). There, use the new refcount_release_if_last() to avoid switching to vmspace0 unless other processes are sharing the vmspace. In that case, upon switching to vmspace0 we can unconditionally release the reference. Remove the volatile qualifier from vm_refcnt now that accesses are protected using refcount(9) KPIs. Reviewed by: alc, kib, mmel MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27057
|
#
8128c65b |
|
30-Sep-2020 |
John Baldwin <jhb@FreeBSD.org> |
Avoid a dubious assignment to bio_data in aio_qbio(). A user pointer is not a suitable value for bio_data and the next block of code always overwrites bio_data anyway. Just use cb->aio_buf directly in the call to vm_fault_quick_hold_pages(). Reviewed by: kib Obtained from: CheriBSD MFC after: 1 month Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D26595
|
#
7ad2a82d |
|
18-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error Most consumers pass NULL.
|
#
7029da5c |
|
26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
849aef49 |
|
21-Nov-2019 |
Andrew Turner <andrew@FreeBSD.org> |
Port the NetBSD KCSAN runtime to FreeBSD. Update the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime to work in the FreeBSD kernel. It is a useful tool for finding data races between threads executing on different CPUs. This can be enabled by enabling KCSAN in the kernel config, or by using the GENERIC-KCSAN amd64 kernel. It works on amd64 and arm64, however the latter needs a compiler change to allow -fsanitize=thread that KCSAN uses. Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22315
|
#
756a5412 |
|
14-Jan-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Allocate pager bufs from UMA instead of 80-ish mutex protected linked list. o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many pbufs are we going to have set. In various subsystems that are going to utilize pbufs create private zones via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(), and sets a limit on created zone. After startup preallocate pbufs according to requirements of all pbuf zones. Subsystems that used to have a private limit with old allocator now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS, swap, vnode pager. The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9), aio(4). They should have their private limits, but changing that is out of scope of this commit. o Fetch tunable value of kern.nswbuf from init_param2() and while here move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only this option. Default values aren't touched by this commit, but they probably should be reviewed wrt to modern hardware. This change removes a tight bottleneck from sendfile(2) operation, that uses pbufs in vnode pager. Other pagers also would benefit from faster allocation. Together with: gallatin Tested by: pho
|
#
72bce9ff |
|
26-Nov-2018 |
Alan Somers <asomers@FreeBSD.org> |
vfs_aio.c: rename "physio" symbols to "bio". aio has two paths: an asynchronous "physio" path and a synchronous path. Confusingly, physio(9) isn't actually used by the "physio" path, and never has been. In fact, it may even be called by the synchronous path! Rename the "physio" path to the "bio" path to reflect what it actually does: directly compose BIOs and send them to character devices. MFC after: 2 weeks
|
#
792843c3 |
|
24-Nov-2018 |
Mark Johnston <markj@FreeBSD.org> |
Pass malloc flags directly through kevent(2) subroutines. Some kevent functions have a boolean "waitok" parameter for use when calling malloc(9). Replace them with the corresponding malloc() flags: the desired behaviour is known at compile-time, so this eliminates a couple of conditional branches, and makes the code easier to read. No functional change intended. Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18318
|
#
36c4960e |
|
24-Nov-2018 |
Mark Johnston <markj@FreeBSD.org> |
Plug some kernel memory disclosures via kevent(2). The kernel may register for events on behalf of a userspace process, in which case it must be careful to zero the kevent struct that will be copied out to userspace. Reviewed by: kib MFC after: 3 days Security: kernel stack memory disclosure Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18317
|
#
cbd92ce6 |
|
09-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Eliminate the overhead of gratuitous repeated reinitialization of cap_rights - Add macros to allow preinitialization of cap_rights_t. - Convert most commonly used code paths to use preinitialized cap_rights_t. A 3.6% speedup in fstat was measured with this change. Reported by: mjg Reviewed by: oshogbo Approved by: sbruno MFC after: 1 month
|
#
52c09831 |
|
16-Apr-2018 |
Alan Somers <asomers@FreeBSD.org> |
lio_listio: return EAGAIN instead of EIO when out of resources This behavior is already documented by the man page, and suggested by POSIX. Reviewed by: jhb MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D15099
|
#
6469bdcd |
|
06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941
|
#
86bbef43 |
|
10-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Don't store shadow copies of per-process AIO limits. Previously the AIO subsystem would save a snapshot of the currently configured per-process limits the first time a process used AIO. The process would continue to use the snapshotted limits ignoring any changes to the global limits during the rest of its lifetime. This change removes the snapshotted values and changes the AIO code to always check the global values which can be toggled at runtime. This means an administrator can now change the effective limits of existing processes. This is more consistent with how other limits configured via sysctl work in FreeBSD. Reviewed by: asomers, kib MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D13819
|
#
f54c5606 |
|
09-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Allow the fast-path for disk AIO requests to fail requests. - If aio_qphysio() returns a non-zero error code, fail the request rather than queueing it to the AIO kproc pool to be retried via the slow path. Currently this means that if vm_fault_quick_hold_pages() reports an error, EFAULT is returned from the fast-path rather than retrying the request in the slow path where it will still fail with EFAULT. - If aio_qphysio() wishes to use the fast path for a device that doesn't support unmapped I/O but there are already the maximum number of such requests in flight, fail with EAGAIN as we do for other AIO resource limits rather than queueing the request to the AIO kproc pool. - Move the opcode check for aio_qphysio() out of the caller and into aio_qphysio() to simplify some logic and remove two goto's while here. It also uses a whitelist (only supported for LIO_READ / LIO_WRITE) rather than a blacklist (skipped for LIO_SYNC). PR: 217261 Submitted by: jkim (an earlier version) MFC after: 2 weeks Sponsored by: Chelsio Communications
|
#
7e409184 |
|
09-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Simplify some logic by merging an if test with a subsequent switch. Specifically, in aio_queue_file() the code was doing this: if (opcode == LIO_SYNC) { ... } switch (opcode) { ... case LIO_SYNC: ... } This moves the body of the if statement into the LIO_SYNC case of the switch statement. MFC after: 2 weeks Sponsored by: Chelsio Communications
|
#
8091e52b |
|
09-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Add a counter to track in-flight AIO requests using unmapped I/O. MFC after: 2 weeks Sponsored by: Chelsio Communications
|
#
151ba793 |
|
24-Dec-2017 |
Alexander Kabaev <kan@FreeBSD.org> |
Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385
|
#
8a36da99 |
|
27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/kern: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
|
#
df485bdb |
|
26-Oct-2017 |
Alan Somers <asomers@FreeBSD.org> |
Fix aio_suspend in 32-bit emulation An off-by-one error has been present since the system call was first present in 185878. It additionally became a memory corruption bug after change 324941. The failure is actually revealed by our existing AIO tests. However, apparently nobody's been running those in 32-bit emulation mode. Reported by: Coverity, cem CID: 1382114 MFC after: 18 days X-MFC-With: 324941 Sponsored by: Spectra Logic Corp
|
#
913b9329 |
|
23-Oct-2017 |
Alan Somers <asomers@FreeBSD.org> |
Remove artificial restriction on lio_listio's operation count In r322258 I made p1003_1b.aio_listio_max a tunable. However, further investigation shows that there was never any good reason for that limit to exist in the first place. It's used in two completely different ways: * To size a UMA zone, which globally limits the number of concurrent aio_suspend calls. * To artificially limit the number of operations in a single lio_listio call. There doesn't seem to be any memory allocation associated with this limit. This change does two things: * Properly names aio_suspend's UMA zone, and sizes it based on a new constant. * Eliminates the artificial restriction on lio_listio. Instead, lio_listio calls will now be limited by the more generous max_aio_queue_per_proc. The old p1003_1b.aio_listio_max is now an alias for vfs.aio.max_aio_queue_per_proc, so sysconf(3) will still work with _SC_AIO_LISTIO_MAX. Reported by: bde Reviewed by: jhb MFC after: 3 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D12120
|
#
c45796d5 |
|
08-Aug-2017 |
Alan Somers <asomers@FreeBSD.org> |
Make p1003_1b.aio_listio_max a tunable p1003_1b.aio_listio_max is now a tunable. Its value is reflected in the sysctl of the same name, and the sysconf(3) variable _SC_AIO_LISTIO_MAX. Its value will be bounded from below by the compile-time constant AIO_LISTIO_MAX and from above by the compile-time constant MAX_AIO_QUEUE_PER_PROC and the tunable vfs.aio.max_aio_queue. Reviewed by: jhb, kib MFC after: 3 weeks Relnotes: yes Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D11601
|
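The bounding rule in the message above (a floor from one constant, a ceiling from the smaller of two other limits) can be sketched as a simple clamp. The function and parameter names are hypothetical, and the sketch assumes the floor never exceeds the effective ceiling, which the commit message does not state.

```c
#include <assert.h>

/* Smaller of two limits. */
static int
min_int(int a, int b)
{
	return (a < b ? a : b);
}

/*
 * Clamp a requested tunable value: never below `lower`, never above the
 * smaller of the two upper limits. Assumes lower <= min(upper_a, upper_b).
 */
int
clamp_listio_max(int requested, int lower, int upper_a, int upper_b)
{
	int ceiling = min_int(upper_a, upper_b);

	assert(lower <= ceiling);
	if (requested < lower)
		return (lower);
	if (requested > ceiling)
		return (ceiling);
	return (requested);
}
```

For example, with a floor of 16 and upper limits of 256 and 1024, a request of 4 is raised to 16 and a request of 5000 is lowered to 256.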
#
711dba24 |
|
19-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Allow negative aio_offset only for the read and write LIO ops on device nodes. Otherwise, the current check of aio_offset == -1LL makes it possible to pass negative file offsets down to the filesystems. This trips assertions and is even unsafe for e.g. FFS which keeps metadata at negative offsets. Reported and tested by: pho Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D11266
|
#
2b34e843 |
|
16-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Add abstime kqueue(2) timers and expand struct kevent members. This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which specifies that the data field contains absolute time to fire the event. To make this useful, data member of the struct kevent must be extended to 64bit. Using the opportunity, I also added ext members. This changes struct kevent almost to Apple struct kevent64, except I did not change the type of ident and udata; the latter would cause serious API incompatibilities. The type of ident was kept uintptr_t since EVFILT_AIO returns a pointer in this field, and e.g. CHERI is sensitive to the type (discussed with brooks, jhb). Unlike Apple kevent64, symbol versioning allows us to claim ABI compatibility and still name the new syscall kevent(2). Compat shims are provided for both host native and compat32. Requested by: bapt Reviewed by: bapt, brooks, ngie (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D11025
|
#
496ab053 |
|
13-Feb-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Rework r313352. Rename kern_vm_* functions to kern_*. Move the prototypes to syscallsubr.h. Also change Mach VM types to uintptr_t/size_t as needed, to avoid headers pollution. Requested by: alc, jhb Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D9535
|
#
e2a18110 |
|
17-Aug-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove duplicated code. aio_aqueue() calls aio_init_aioinfo() as the first action. There is no need to duplicate the code in kern_aio_fsync(). Also fix indent for aio_aqueue() definition. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D7523
|
#
005ce8e4 |
|
29-Jul-2016 |
John Baldwin <jhb@FreeBSD.org> |
Fix locking issues with aio_fsync(). - Use correct lock in aio_cancel_sync when dequeueing job. - Add _locked variants of aio_set/clear_cancel_function and use those to avoid lock recursion when adding and removing fsync jobs to the per-process sync queue. - While here, add a basic test for aio_fsync(). PR: 211390 Reported by: Randy Westlund <rwestlun@gmail.com> MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7339
|
#
b9a53e16 |
|
27-Jul-2016 |
John Baldwin <jhb@FreeBSD.org> |
Adjust tests in fsync job scheduling loop to reduce indentation.
|
#
9c20dc99 |
|
21-Jul-2016 |
John Baldwin <jhb@FreeBSD.org> |
Add more documentation regarding unsafe AIO requests. The asynchronous I/O changes made previously result in different behavior out of the box. Previously all AIO requests failed with ENOSYS / SIGSYS unless aio.ko was explicitly loaded. Now, some AIO requests complete and others ("unsafe" requests) fail with EOPNOTSUPP. Reword the introductory paragraph in aio(4) to add a general description of AIO before describing the vfs.aio.enable_unsafe sysctl. Remove the ENOSYS error description from aio_fsync(2), aio_read(2), and aio_write(2) and replace it with a description of EOPNOTSUPP. Remove the ENOSYS error description from aio_mlock(2). Log a message to the system log the first time a process requests an "unsafe" AIO request that fails with EOPNOTSUPP. This is modeled on the log message used for processes using the legacy pty devices. Reviewed by: kib (earlier version) MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7151
|
#
9fe297bb |
|
21-Jul-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Declare aio requests on files from local filesystems safe. Two notes: - I allow AIO on reclaimed vnodes, since it is deterministically terminated fast. - devfs mounts are marked as MNT_LOCAL, but device vnodes have type VCHR, so the slow device io is not allowed. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D7273
|
#
b1012d80 |
|
21-Jun-2016 |
John Baldwin <jhb@FreeBSD.org> |
Account for AIO socket operations in thread/process resource usage. File and disk-backed I/O requests store counts of read/written disk blocks in each AIO job so that they can be charged to the thread that completes an AIO request via aio_return() or aio_waitcomplete(). This change extends AIO jobs to store counts of received/sent messages and updates socket backends to set these counts accordingly. Note that the socket backends are careful to only charge a single message for each AIO request even though a single request on a blocking socket might invoke sosend or soreceive multiple times. This is to mimic the resource accounting of synchronous read/write. Adjust the UNIX socketpair AIO test to verify that the message resource usage counts update accordingly for aio_read and aio_write. Approved by: re (hrs) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D6911
|
#
fe0bdd1d |
|
15-Jun-2016 |
John Baldwin <jhb@FreeBSD.org> |
Move backend-specific fields of kaiocb into a union. This reduces the size of kaiocb slightly. I've also added some generic fields that other backends can use in place of the BIO-specific fields. Change the socket and Chelsio DDP backends to use 'backend3' instead of abusing _aiocb_private.status directly. This confines the use of _aiocb_private to the AIO internals in vfs_aio.c. Reviewed by: kib (earlier version) Approved by: re (gjb) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D6547
|
#
f0ec1740 |
|
20-May-2016 |
John Baldwin <jhb@FreeBSD.org> |
Consistently set status to -1 when completing an AIO request with an error. Sponsored by: Chelsio Communications
|
#
4d805eac |
|
31-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Tidy up the unmapped I/O code in qphysio. - Move some blocks around to reduce the number of 'if (unmap)' checks. - Use 'pbuf == NULL' instead of 'unmap'. - Use nitems. - Pull an assignment out of an if expression. Reviewed by: kib Sponsored by: Chelsio Communications
|
#
bb430bc7 |
|
21-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Fully handle size_t lengths in AIO requests. First, update the return types of aio_return() and aio_waitcomplete() to ssize_t. POSIX requires aio_return() to return a ssize_t so that it can represent all return values from read() and write(). aio_waitcomplete() should use ssize_t for the same reason. aio_return() has used ssize_t in <aio.h> since r31620 but the manpage and system call entry were not updated. aio_waitcomplete() has always returned int. Note that this does not require new system call stubs as this is effectively only an API change in how the compiler interprets the return value. Second, allow aio_nbytes values up to IOSIZE_MAX instead of just INT_MAX. aio_read/write should now honor the same length limits as normal read/write. Third, use longs instead of ints in the aio_return() and aio_waitcomplete() system call functions so that the 64-bit size_t in the in-kernel aiocb isn't truncated to 32-bits before being copied out to userland or being returned. Finally, a simple test has been added to verify the bounds checking on the maximum read size from a file.
|
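The truncation problem described in bb430bc7 can be illustrated with a small stand-alone sketch. It is not kernel code: `through_32bit()` and `survives_32bit()` are hypothetical helpers that only model what happens when a 64-bit byte count is squeezed through a 32-bit field on its way back to userland.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: copy a 64-bit byte count through a 32-bit field.
 * The cast to uint32_t keeps only the low 32 bits.
 */
static uint64_t
through_32bit(uint64_t nbytes)
{
	return ((uint32_t)nbytes);
}

/* True when the byte count is unchanged by the 32-bit round trip. */
static bool
survives_32bit(uint64_t nbytes)
{
	return (through_32bit(nbytes) == nbytes);
}
```

A 4096-byte transfer survives the round trip, while a transfer of 4 GiB plus 512 bytes comes back as 512, which is the kind of silent loss that a wider return type avoids.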
#
5166fdde |
|
18-Mar-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
aio_qphysio(): Avoid uninitialized pointer read on error. For the !unmap case it may happen that pbuf gets called unreferenced when vm_fault_quick_hold_pages() fails. Initialize it so it doesn't cause trouble. CID: 1352776 Reviewed by: jhb MFC after: 1 week
|
#
399e8c17 |
|
09-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Simplify AIO initialization now that it is standard. - Mark AIO system calls as STD and remove the helpers to dynamically register them. - Use COMPAT6 for the old system calls with the older sigevent instead of an 'o' prefix. - Simplify the POSIX configuration to note that AIO is always available. - Handle AIO in the default VOP_PATHCONF instead of special casing it in the pathconf() system call. fpathconf() is still hackish. - Remove freebsd32_aio_cancel() as it just called the native one directly. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5589
|
#
f3215338 |
|
01-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Refactor the AIO subsystem to permit file-type-specific handling and improve cancellation robustness. Introduce a new file operation, fo_aio_queue, which is responsible for queueing and completing an asynchronous I/O request for a given file. The AIO subsystem now exports a library of routines to manipulate AIO requests as well as the ability to run a handler function in the "default" pool of AIO daemons to service a request. A default implementation for file types which do not include an fo_aio_queue method queues requests to the "default" pool invoking the fo_read or fo_write methods as before. The AIO subsystem permits file types to install a private "cancel" routine when a request is queued to permit safe dequeueing and cleanup of cancelled requests. Sockets now use their own pool of AIO daemons and service per-socket requests in FIFO order. Socket requests will not block indefinitely permitting timely cancellation of all requests. Due to the now-tight coupling of the AIO subsystem with file types, the AIO subsystem is now a standard part of all kernels. The VFS_AIO kernel option and aio.ko module are gone. Many file types may block indefinitely in their fo_read or fo_write callbacks resulting in a hung AIO daemon. This can result in hung user processes (when processes attempt to cancel all outstanding requests during exit) or a hung system. To protect against this, AIO requests are only permitted for known "safe" files by default. AIO requests for all file types can be enabled by setting the new vfs.aio.enable_unsafe sysctl to a non-zero value. The AIO tests have been updated to skip operations on unsafe file types if the sysctl is zero. Currently, AIO requests on sockets and raw disks are considered safe and are enabled by default. aio_mlock() is also enabled by default. Reviewed by: cem, jilles Discussed with: kib (earlier version) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5289
|
#
5652770d |
|
05-Feb-2016 |
John Baldwin <jhb@FreeBSD.org> |
Rename aiocblist to kaiocb and use consistent variable names. Typically <foo>list is used for a structure that holds a list head in FreeBSD, not for members of a list. As such, rename 'struct aiocblist' to 'struct kaiocb' (the kernel version of 'struct aiocb'). While here, use more consistent variable names for AIO control blocks: - Use 'job' instead of 'aiocbe', 'cb', 'cbe', or 'iocb' for kernel job objects. - Use 'jobn' instead of 'cbn' for use with TAILQ_FOREACH_SAFE(). - Use 'sjob' and 'sjobn' instead of 'scb' and 'scbn' for fsync jobs. - Use 'ujob' instead of 'aiocbp', 'job', 'uaiocb', or 'uuaiocb' to hold a user pointer to a 'struct aiocb'. - Use 'ujobp' instead of 'aiocbp' for a user pointer to a 'struct aiocb *'. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5125
|
#
0dd6c035 |
|
26-Jan-2016 |
John Baldwin <jhb@FreeBSD.org> |
Various style fixes. - Wrap long lines. - Fix indentation. - Remove excessive parens. - Whitespace fixes in struct definitions. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D5025
|
#
39314b7d |
|
20-Jan-2016 |
John Baldwin <jhb@FreeBSD.org> |
AIO daemons have always been kernel processes to facilitate switching to user VM spaces while servicing jobs. Update various comments and data structures that refer to AIO daemons as threads to refer to processes instead. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4999
|
#
4429f0e2 |
|
20-Jan-2016 |
John Baldwin <jhb@FreeBSD.org> |
Remove unused variables for socket AIO. In r55943, a per-process queue of pending socket AIO requests (requests waiting for the socket to become ready) was added so that they could be cancelled during process rundown. In r154765, the rundown code was changed to handle jobs in this state (JOBST_JOBQSOCK) directly removing the need for the extra queue. However, the per-process queue head and global lock were never removed. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4997
|
#
8a4dc40f |
|
19-Jan-2016 |
John Baldwin <jhb@FreeBSD.org> |
Various cleanups to the main function for AIO kernel processes: - Pull the vmspace logic out into helper functions and reduce duplication. Operations on the vmspace are all isolated to vm_map.c, but it now exports a new 'vmspace_switch_aio' for use by AIO kernel processes. - When an AIO kernel process wants to exit, break out of the main loop and perform cleanup after the loop end. This reduces a lot of indentation and allows cleanup to more closely mirror setup actions before the loop starts. - Convert a DIAGNOSTIC to KASSERT(). - Replace mycp with more typical 'p'. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4990
|
#
f2e7f06a |
|
19-Jan-2016 |
John Baldwin <jhb@FreeBSD.org> |
Don't create a dedicated session for each AIO kernel process. This code dates back to the initial AIO support and the commit log does not explain why it is needed. However, I cannot find anything in the AIO code or the various file methods (fo_read/fo_write) that would change behavior due to using a private session instead of proc0's session. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4988
|
#
6c8fd022 |
|
14-Jan-2016 |
John Baldwin <jhb@FreeBSD.org> |
Remove aiod_timeout. It hasn't been used since the AIO code was made MPSAFE 10 years ago. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4946
|
#
c85650ca |
|
14-Jan-2016 |
John Baldwin <jhb@FreeBSD.org> |
Rename aiod_bio taskqueue to aiod_kick. This taskqueue is not used to handle bio requests. It is only used to run aio_kick_nowait() to spin up new aio daemon processes. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D4904
|
#
38d68e2d |
|
25-Oct-2015 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
The aio_waitcomplete(2) syscall should not sleep when the given timeout is 0. Without this change it was sleeping for one tick. Maybe not a big deal, but it makes the share/dtrace/blocking script report it. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D3814 Sponsored by: Wheel Systems, http://wheelsystems.com
|
#
9889bbac |
|
06-Jul-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Mutex memory is not zeroed, add MTX_NEW. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
f131759f |
|
05-Jul-2015 |
Mateusz Guzik <mjg@FreeBSD.org> |
fd: make 'rights' a mandatory argument to fget* functions
|
#
f743d981 |
|
22-Apr-2015 |
Alexander Motin <mav@FreeBSD.org> |
Make AIO not allocate pbufs for unmapped I/O, like r281825. While there, make a few more performance optimizations. On a 40-core system doing many 512-byte AIO reads from an array of raw SSDs, this change removes lock congestion inside the pbuf allocator and devfs, and a bottleneck on the single AIO completion taskqueue thread. It improves peak AIO performance from ~600K to ~1.3M IOPS. MFC after: 2 weeks
|
#
e015b1ab |
|
26-Oct-2014 |
Mateusz Guzik <mjg@FreeBSD.org> |
Avoid dynamic syscall overhead for statically compiled modules. The kernel tracks syscall users so that modules can safely unregister them. But if the module is not unloadable or was compiled into the kernel, there is no need to do this. Achieve this by adding SY_THR_STATIC_KLD macro which expands to SY_THR_STATIC during kernel build and 0 otherwise. Reviewed by: kib (previous version) MFC after: 2 weeks
|
#
4a144410 |
|
16-Mar-2014 |
Robert Watson <rwatson@FreeBSD.org> |
Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks
|
#
96a62209 |
|
05-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
The fget() function now takes pointer to cap_rights_t, so change 0 to NULL.
|
#
7008be5b |
|
04-Sep-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

    struct cap_rights {
        uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
    };

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements.

The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future.

To define a new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg.

    #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine a few rights, but the rights have to belong to the same array element, eg:

    #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
    #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
    #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

    cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
    void cap_rights_set(cap_rights_t *rights, ...);
    void cap_rights_clear(cap_rights_t *rights, ...);
    bool cap_rights_is_set(const cap_rights_t *rights, ...);
    bool cap_rights_is_valid(const cap_rights_t *rights);
    void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
    void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
    bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg:

    cap_rights_t rights;
    cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg:

    #define cap_rights_set(rights, ...) \
        __cap_rights_set((rights), __VA_ARGS__, 0ULL)
    void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1:

    cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belong to the same array element this way is correct, but is not advised. It should only be used for alias definitions.

This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x.

Sponsored by: The FreeBSD Foundation
|
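The index-bit encoding described in 7008be5b can be sketched in stand-alone C. This is a model built only from the commit text, not a copy of the system header: the shift of 57 is an assumption derived from "top two bits, then five index bits" of a 64-bit word, and `right_index()` is a hypothetical helper. The `CAP_*` values are the ones quoted in the message, redefined locally.

```c
#include <stdint.h>

/* Assumed layout: bits 63-62 hold the version, bits 61-57 the array index. */
#define	CAPRIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define	INDEX_MASK		(0x1FULL << 57)

/* Values quoted in the commit message. */
#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)
#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

/*
 * Hypothetical helper: return the array index encoded in a right, or -1
 * when zero or several index bits are set, which is how rights from
 * different array elements OR-ed together can be detected.
 */
static int
right_index(uint64_t right)
{
	uint64_t idx = (right & INDEX_MASK) >> 57;
	int n = 0;

	if (idx == 0 || (idx & (idx - 1)) != 0)
		return (-1);
	while ((idx & 1) == 0) {
		idx >>= 1;
		n++;
	}
	return (n);
}
```

Under these assumptions, `CAP_LOOKUP | CAP_FCHMOD` still decodes to element 0, whereas `CAP_LOOKUP | CAP_PDKILL` sets two index bits and is rejected.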
#
ce625ec7 |
|
15-Aug-2013 |
Kenneth D. Merry <ken@FreeBSD.org> |
Change the way that unmapped I/O capability is advertised. The previous method was to set the D_UNMAPPED_IO flag in the cdevsw for the driver. The problem with this is that in many cases (e.g. sa(4)) there may be some instances of the driver that can handle unmapped I/O and some that can't. The isp(4) driver can handle unmapped I/O, but the esp(4) driver currently cannot. The cdevsw is shared among all driver instances. So instead of setting a flag on the cdevsw, set a flag on the cdev. This allows drivers to indicate support for unmapped I/O on a per-instance basis. sys/conf.h: Remove the D_UNMAPPED_IO cdevsw flag and replace it with an SI_UNMAPPED cdev flag. kern_physio.c: Look at the cdev SI_UNMAPPED flag to determine whether or not a particular driver can handle unmapped I/O. geom_dev.c: Set the SI_UNMAPPED flag for all GEOM cdevs. Since GEOM will create a temporary mapping when needed, setting SI_UNMAPPED unconditionally will work. Remove the D_UNMAPPED_IO flag. nvme_ns.c: Set the SI_UNMAPPED flag on cdevs created here if NVME_UNMAPPED_BIO_SUPPORT is enabled. vfs_aio.c: In aio_qphysio(), check the SI_UNMAPPED flag on a cdev instead of the D_UNMAPPED_IO flag on the cdevsw. sys/param.h: Bump __FreeBSD_version to 1000045 for the switch from setting the D_UNMAPPED_IO flag in the cdevsw to setting SI_UNMAPPED in the cdev. Reviewed by: kib, jimharris MFC after: 1 week Sponsored by: Spectra Logic
|
#
977c7043 |
|
02-Aug-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Remove extra zeroing after M_ZERO allocation.
|
#
97319989 |
|
21-Jul-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Move the convert_sigevent32() utility function into freebsd32_misc.c for consumption outside the vfs_aio.c. For SIGEV_THREAD_ID and SIGEV_SIGNAL notification delivery methods, also copy in the sigev_value, since librt event pumping loop compares note generation number with the value passed through sigev_value. Tested by: Petr Salinger <Petr.Salinger@seznam.cz> Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
6160e12c |
|
08-Jun-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Add new system call - aio_mlock(). The name speaks for itself. It allows the mlock(2) operation, which can consume a lot of time, to be performed under the control of aio(4). Reviewed by: kib, jilles Sponsored by: Nginx, Inc.
|
#
f95c13db |
|
08-Jun-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Separate LIO_SYNC processing into a separate function aio_process_sync(), and rename aio_process() into aio_process_rw(). Reviewed by: kib Sponsored by: Nginx, Inc.
|
#
f3215a60 |
|
27-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix a race with the vnode reclamation in the aio_qphysio(). Obtain the thread reference on the vp->v_rdev and use the returned struct cdev *dev instead of using vp->v_rdev. Call dev_strategy_csw() instead of dev_strategy(), since we now own the reference. Since the csw was already calculated, test d_flags to avoid mapping the buffer if the driver supports unmapped requests [*]. Suggested by: kan [*] Reviewed by: kan (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
e81ff91e |
|
19-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not remap usermode pages into KVA for physio. Sponsored by: The FreeBSD Foundation Tested by: pho
|
#
89f6b863 |
|
08-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Switch the vm_object mutex to be a rwlock. This will enable further optimizations in the future, where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages is accessed for reading purposes.

The change is mostly mechanical, but a few notes are reported:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
  - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing consumers to include sys/mutex.h directly to cater for its inline functions using VM_OBJECT_LOCK()) means that all vm/vm_pager.h consumers must now also include sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.

The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho
|
#
2609222a |
|
01-Mar-2013 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
Merge Capsicum overhaul:
- Capability is no longer a separate descriptor type. Now every descriptor has a set of its own capability rights.
- The cap_new(2) system call is left, but it is no longer documented and should not be used in new code.
- The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one.
- The cap_getrights(2) syscall is renamed to cap_rights_get(2).
- If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrieved with cap_ioctls_get(2) syscall.
- If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrieve them with cap_fcntls_get(2).
- To support ioctl and fcntl white-listing the filedesc structure was heavily modified.
- The audit subsystem, kdump and procstat tools were updated to recognize new syscalls.
- Capability rights were revised and even though I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below:

CAP_CREATE old behaviour:
- Allow for openat(2)+O_CREAT.
- Allow for linkat(2).
- Allow for symlinkat(2).

CAP_CREATE new behaviour:
- Allow for openat(2)+O_CREAT.

Added CAP_LINKAT:
- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
- Allow to be target for renameat(2).

Added CAP_SYMLINKAT:
- Allow for symlinkat(2).

Removed CAP_DELETE. Old behaviour:
- Allow for unlinkat(2) when removing non-directory object.
- Allow to be source for renameat(2).

Removed CAP_RMDIR. Old behaviour:
- Allow for unlinkat(2) when removing directory.

Added CAP_RENAMEAT:
- Required for source directory for the renameat(2) syscall.

Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
- Allow for unlinkat(2) on any object.
- Required if target of renameat(2) exists and will be removed by this call.

Removed CAP_MAPEXEC.

CAP_MMAP old behaviour:
- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE.

CAP_MMAP new behaviour:
- Allow for mmap(2)+PROT_NONE.

Added CAP_MMAP_R:
- Allow for mmap(PROT_READ).
Added CAP_MMAP_W:
- Allow for mmap(PROT_WRITE).
Added CAP_MMAP_X:
- Allow for mmap(PROT_EXEC).
Added CAP_MMAP_RW:
- Allow for mmap(PROT_READ | PROT_WRITE).
Added CAP_MMAP_RX:
- Allow for mmap(PROT_READ | PROT_EXEC).
Added CAP_MMAP_WX:
- Allow for mmap(PROT_WRITE | PROT_EXEC).
Added CAP_MMAP_RWX:
- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

Renamed CAP_MKDIR to CAP_MKDIRAT.
Renamed CAP_MKFIFO to CAP_MKFIFOAT.
Renamed CAP_MKNOD to CAP_MKNODAT.

CAP_READ old behaviour:
- Allow pread(2).
- Disallow read(2), readv(2) (if there is no CAP_SEEK).

CAP_READ new behaviour:
- Allow read(2), readv(2).
- Disallow pread(2) (CAP_SEEK was also required).

CAP_WRITE old behaviour:
- Allow pwrite(2).
- Disallow write(2), writev(2) (if there is no CAP_SEEK).

CAP_WRITE new behaviour:
- Allow write(2), writev(2).
- Disallow pwrite(2) (CAP_SEEK was also required).

Added convenient defines:

    #define CAP_PREAD (CAP_SEEK | CAP_READ)
    #define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
    #define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
    #define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
    #define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
    #define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
    #define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
    #define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
    #define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
    #define CAP_RECV CAP_READ
    #define CAP_SEND CAP_WRITE
    #define CAP_SOCK_CLIENT \
        (CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
         CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
    #define CAP_SOCK_SERVER \
        (CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
         CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
         CAP_SETSOCKOPT | CAP_SHUTDOWN)

Added defines for backward API compatibility:

    #define CAP_MAPEXEC CAP_MMAP_X
    #define CAP_DELETE CAP_UNLINKAT
    #define CAP_MKDIR CAP_MKDIRAT
    #define CAP_RMDIR CAP_UNLINKAT
    #define CAP_MKFIFO CAP_MKFIFOAT
    #define CAP_MKNOD CAP_MKNODAT
    #define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by: The FreeBSD Foundation
Reviewed by: Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with: rwatson, benl, jonathan
ABI compatibility discussed with: kib
|
#
5050aa86 |
|
22-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
|
#
d56e058a |
|
04-Feb-2012 |
David Xu <davidxu@FreeBSD.org> |
Add 32-bit compat code for AIO kevent flags introduced in revision 230857.
|
#
fde80935 |
|
31-Jan-2012 |
David Xu <davidxu@FreeBSD.org> |
If multiple threads call kevent() to get AIO events on the same kqueue fd, it is possible that a single AIO event will be reported to multiple threads. This is not threading friendly, and the existing API cannot control this behavior. Allocate a kevent flags field, sigev_notify_kevent_flags, for AIO event notification in sigevent, and allow the user to pass EV_CLEAR, EV_DISPATCH or EV_ONESHOT to the AIO kernel code, so that the user can control whether the event should be cleared once it is retrieved by a thread. This change should be compatible with existing applications, because the field should have already been zero-filled, and no additional action will be taken by the kernel. PR: kern/156567
|
#
8e9fc278 |
|
30-Jan-2012 |
Doug Ambrisko <ambrisko@FreeBSD.org> |
When detaching AIO or LIO requests, grab the lock and tell knlist_remove that we now hold it. This fixes a locking assertion panic that occurs when knlist_empty is called without a lock and INVARIANTS etc. are turned on. Reviewed by: kib jhb MFC after: 1 week
|
#
94fce847 |
|
27-Jan-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix size check, that prevents getting negative after casting to a signed type Reviewed by: bde
|
#
434ea137 |
|
26-Jan-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Although aio_nbytes is size_t, it is later cast to signed types: to ssize_t in filesystem code and to int in buf code. Supplying a value that becomes negative after the cast thus leads to a kernel panic later. To fix that, check the user-supplied argument at the beginning of the syscall. Submitted by: Maxim Dounin <mdounin mdounin.ru>, maxim@
|
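The kind of early bounds check that 434ea137 and 94fce847 describe can be sketched as follows. This is a minimal user-space illustration, not the kernel code: `check_aio_nbytes()` is a hypothetical name, and the INT_MAX bound and the EINVAL return are assumptions (a later entry in this log mentions that the limit was INT_MAX before it was raised to IOSIZE_MAX).

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical validation helper: reject any unsigned length that would
 * turn negative once it is cast to a signed int further down the stack.
 * The comparison is done on the unsigned value, before any cast.
 */
static int
check_aio_nbytes(size_t nbytes)
{
	if (nbytes > INT_MAX)
		return (EINVAL);
	return (0);
}
```

Checking the unsigned value against the largest positive signed value is what makes the test robust; comparing the already-cast value against zero would depend on how the conversion behaves.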
#
8451d0dd |
|
16-Sep-2011 |
Kip Macy <kmacy@FreeBSD.org> |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
|
#
a9d2f8d8 |
|
10-Aug-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0: Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op. Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions. In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit. Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent. Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
cf7d9a8c |
|
08-Oct-2010 |
David Xu <davidxu@FreeBSD.org> |
Create a global thread hash table to speed up thread lookup, use rwlock to protect the table. In old code, thread lookup is done with process lock held, to find a thread, kernel has to iterate through process and thread list, this is quite inefficient. With this change, test shows in extreme case performance is dramatically improved. Earlier patch was reviewed by: jhb, julian
|
#
36131b48 |
|
07-Apr-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r205326: Convert aio syscall registration to SYSCALL_INIT_HELPER.
|
#
4ccf64eb |
|
06-Apr-2010 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
MFC r205014,205015: Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms. This MFC is required for MFCs of later changes to the freebsd32 compatibility from HEAD. Requested by: kib
|
#
723d37c0 |
|
19-Mar-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Convert aio syscall registration to SYSCALL_INIT_HELPER. Reviewed by: jhb MFC after: 2 weeks
|
#
841c0c7e |
|
11-Mar-2010 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms. Reviewed by: kib, jhb
|
#
e76d823b |
|
12-Sep-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Use C99 initialization for struct filterops. Obtained from: Mac OS X Sponsored by: Apple Inc. MFC after: 3 weeks
|
#
d8b0556c |
|
10-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock. Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock. Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly. Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization. Submitted by: jhb [1] Noted by: jhb [2] Reviewed by: jhb Tested by: rnoland
|
#
74fb0ba7 |
|
01-Jun-2009 |
John Baldwin <jhb@FreeBSD.org> |
Rework socket upcalls to close some races with setup/teardown of upcalls.
- Each socket upcall is now invoked with the appropriate socket buffer locked. It is not permissible to call soisconnected() with this lock held, however, so socket upcalls now return an integer value. The two possible values are SU_OK and SU_ISCONNECTED. If an upcall returns SU_ISCONNECTED, then soisconnected() will be invoked on the socket after the socket buffer lock is dropped.
- A new API is provided for setting and clearing socket upcalls. The API consists of soupcall_set() and soupcall_clear().
- To simplify locking, each socket buffer now has a separate upcall.
- When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from the receive socket buffer automatically. Note that a SO_SND upcall should never return SU_ISCONNECTED.
- All this means that accept filters should now return SU_ISCONNECTED instead of calling soisconnected() directly. They also no longer need to explicitly clear the upcall on the new socket.
- The HTTP accept filter still uses soupcall_set() to manage its internal state machine, but other accept filters no longer have any explicit knowledge of socket upcall internals aside from their return value.
- The various RPC client upcalls currently drop the socket buffer lock while invoking soreceive() as a temporary band-aid. The plan for the future is to add a new flag to allow soreceive() to be called with the socket buffer locked.
- The AIO callback for socket I/O is now also invoked with the socket buffer locked. Previously sowakeup() would drop the socket buffer lock only to call aio_swake() which immediately re-acquired the socket buffer lock for the duration of the function call.
Discussed with: rwatson, rmacklem
|
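The upcall contract described in the entry above can be sketched in userland C. This is only an illustration under stated assumptions: `mock_sockbuf`, `mock_sowakeup()`, `mock_soisconnected()` and `mock_filter_upcall()` are hypothetical stand-ins, the lock is modelled as a plain flag, and the numeric values of SU_OK and SU_ISCONNECTED are assumed. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return values named in the commit message; the numeric values are assumed. */
enum { SU_OK = 0, SU_ISCONNECTED = 1 };

/* Hypothetical userland stand-in for a socket buffer and its upcall slot. */
struct mock_sockbuf {
	bool locked;		/* models the socket buffer lock */
	bool connected;		/* set by mock_soisconnected() */
	int (*upcall)(struct mock_sockbuf *, void *);
	void *upcall_arg;
};

/* Must never run with the buffer lock held, as the commit requires. */
static void
mock_soisconnected(struct mock_sockbuf *sb)
{
	assert(!sb->locked);
	sb->connected = true;
}

/* Accept-filter style upcall: report readiness instead of acting on it. */
static int
mock_filter_upcall(struct mock_sockbuf *sb, void *arg)
{
	int *bytes_missing = arg;

	assert(sb->locked);	/* upcalls run with the lock held */
	return (*bytes_missing == 0 ? SU_ISCONNECTED : SU_OK);
}

/* Invoke the upcall under the lock, then finish the work after unlocking. */
static void
mock_sowakeup(struct mock_sockbuf *sb)
{
	int ret = SU_OK;

	sb->locked = true;
	if (sb->upcall != NULL) {
		ret = sb->upcall(sb, sb->upcall_arg);
		if (ret == SU_ISCONNECTED) {
			/* The upcall is cleared automatically. */
			sb->upcall = NULL;
			sb->upcall_arg = NULL;
		}
	}
	sb->locked = false;
	if (ret == SU_ISCONNECTED)
		mock_soisconnected(sb);
}
```

Returning a status code lets the caller defer the soisconnected() step until the lock is released, which is the reason the commit gives for the integer return value.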
#
e588eeb1 |
|
23-Jan-2009 |
John Baldwin <jhb@FreeBSD.org> |
Use the correct type for the timeout parameter to the 32-bit compat version aio_waitcomplete(). Reminded by: bz Submitted by: jamie MFC after: 3 days
|
#
3858a1f4 |
|
10-Dec-2008 |
John Baldwin <jhb@FreeBSD.org> |
- Add 32-bit compat system calls for VFS_AIO. The system calls live in the aio code and are registered via the recently added SYSCALL32_*() helpers.
- Since the aio code likes to invoke fuword and suword a lot down in the "bowels" of system calls, add a structure holding a set of operations for things like storing errors, copying in the aiocb structure, storing status, etc. The 32-bit system calls use a separate operations vector to handle fuword32 vs fuword, etc. Also, the oldsigevent handling is now done by having separate operation vectors with different aiocb copyin routines.
- Split out kern_foo() functions for the various AIO system calls so the 32-bit front ends can manage things like copying in and converting timespec structures, etc.
- For both the native and 32-bit aio_suspend() and lio_listio() calls, just use copyin() to read the array of aiocb pointers instead of using a for loop that iterated over fuword/fuword32. The error handling in the old case was incomplete (lio_listio() just ignored any aiocbs that it got an EFAULT trying to read rather than reporting an error), and possibly slower.
MFC after: 1 month
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
22035f47 |
|
21-Jun-2008 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
Use the minimum of max_aio_procs and target_aio_procs when spawning new aiod, since there should be no more than max_aio_procs processes.
|
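A minimal sketch of the clamping rule from the entry above. The helper `aiod_spawn_budget()` and its `num_aio_procs` parameter are hypothetical and only illustrate the arithmetic; the commit does not show how the kernel code is structured.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many more aio daemons may be spawned, given the
 * current count and the two tunables named in the commit message.
 */
static int
aiod_spawn_budget(int num_aio_procs, int max_aio_procs, int target_aio_procs)
{
	/* Never aim past the hard maximum, even if the target is larger. */
	int limit = (max_aio_procs < target_aio_procs) ?
	    max_aio_procs : target_aio_procs;

	assert(num_aio_procs >= 0);
	return ((num_aio_procs < limit) ? limit - num_aio_procs : 0);
}
```

With a target of 4 and a maximum of 2, the budget from zero running daemons is 2, not 4.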
#
e603be7a |
|
01-Feb-2008 |
Robert Watson <rwatson@FreeBSD.org> |
Use FEATURE() macro to advertise aio availability.
|
#
a8afa221 |
|
24-Jan-2008 |
Jean-Sébastien Pédron <dumbbell@FreeBSD.org> |
When asked to use kqueue, AIO stores its internal state in the `kn_sdata' member of the newly registered knote. The problem is that this member is overwritten by a call to kevent(2) with the EV_ADD flag, targeted at the same kevent/knote. For instance, a userland application may set the pointer to NULL, leading to a panic. A testcase was provided by the submitter. PR: kern/118911 Submitted by: MOROHOSHI Akihiko <moro@remus.dti.ne.jp> MFC after: 1 day
|
#
22db15c0 |
|
13-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with a 'thread' argument that is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower-layer functions when necessary. This change breaks the KPI, which should affect several ports, so a version bump and manpage update will be committed separately. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
cb05b60a |
|
09-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
vn_lock() is currently only used with 'curthread' passed as its thread argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and, in particular, removes an annoying dependency, which helps the upcoming lockmgr() cleanup. The KPI obviously changes as a result. The manpage and FreeBSD_version will be updated in further commits. As a side note, upcoming commits will apply a similar cleanup to the VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
3745c395 |
|
20-Oct-2007 |
Julian Elischer <julian@FreeBSD.org> |
Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
|
#
5114048b |
|
20-Aug-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Destroy the kaio_mtx when freeing the struct kaioinfo in aio_proc_rundown. Do not allow a zero-length read to be passed to the fo_read file method by aio. Reported and tested by: Peter Holm Approved by: re (kensmith)
|
#
a659386c |
|
09-Jun-2007 |
Matt Jacob <mjacob@FreeBSD.org> |
Remove unused variable.
|
#
1c4bcd05 |
|
31-May-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock which is going away. All modifications to rusage are now done in the context of the owning thread. Reads proceed without locks.
- Aggregate an exiting thread's rusage in thread_exit() such that the exiting thread's rusage is not lost.
- Provide a new routine, rufetch(), to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to its exit. The exited process's rusage is still available via p_ru.
- Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits.
Initial patch by: attilio Reviewed by: attilio, bde (some objections), arch (mostly silent)
|
#
873fbcd7 |
|
05-Mar-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Further system call comment cleanup:
- Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
- Remove extra blank lines in some cases.
- Add extra blank lines in some cases.
- Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name.
- Add punctuation.
- Re-wrap some comments.
|
#
6aeb05d7 |
|
11-Nov-2006 |
Tom Rhodes <trhodes@FreeBSD.org> |
Merge posix4/* into normal kernel hierarchy. Reviewed by: glanced at by jhb Approved by: silence on -arch@ and -standards@
|
#
6a1162d4 |
|
15-Oct-2006 |
Alexander Leidinger <netchild@FreeBSD.org> |
MFP4 (with some minor changes): Implement the linux_io_* syscalls (AIO). They are only enabled if the native AIO code is available (either compiled in to the kernel or as a module) at the time the functions are used. If the AIO stuff is not available there will be an ENOSYS.
From the submitter:
---snip---
DESIGN NOTES:
1. Linux permits a process to own multiple AIO queues (distinguished by "context"), but FreeBSD creates only one single AIO queue per process. My code maintains a request queue (STAILQ of queue(3)) per "context", and throws all AIO requests of all contexts owned by a process into the single FreeBSD per-process AIO queue. When the process calls io_destroy(2), io_getevents(2), io_submit(2) and io_cancel(2), my code can pick out requests owned by the specified context from the single FreeBSD per-process AIO queue according to the per-context request queues maintained by my code.
2. The request queue maintained by my code stores contrast information between Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks (struct aiocb). FreeBSD IO control block actually exists in userland memory space, required by FreeBSD native aio_XXXXXX(2).
3. It is quite troubling that the function io_getevents() of libaio-0.3.105 needs to use Linux-specific "struct aio_ring", which is a partial mirror of context in user space. I would rather take the address of context in kernel as the context ID, but the io_getevents() of libaio forces me to take the address of the "ring" in user space as the context ID. To my surprise, one comment line in the file "io_getevents.c" of libaio-0.3.105 reads: Ben will hate me for this
REFERENCE:
1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/ (include/linux/aio_abi.h, fs/aio.c)
2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/ (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2))
3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html The design notes: http://lse.sourceforge.net/io/aionotes.txt
4. The package libaio, both source and binary: http://rpmfind.net/linux/rpm2html/search.php?query=libaio Simple transparent interface to Linux AIO system calls.
5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/ POSIX AIO implementation based on Linux AIO system calls (depending on libaio).
---snip---
Submitted by: Li, Xiao <intron@intron.ac>
|
#
4db71d27 |
|
23-Sep-2006 |
John-Mark Gurney <jmg@FreeBSD.org> |
hide kqueue_register from public view, and replace it w/ kqfd_register... this eliminates a possible race in aio registering a kevent..
|
#
f6d004d5 |
|
06-Sep-2006 |
Mark Peek <mp@FreeBSD.org> |
Remove call to fdfree() for the AIO daemons to prevent kernel panics with linprocfs. This call is not needed since file descriptor sharing was removed in v1.125. Reviewed by: alc, davidxu, ambrisko MFC after: 3 days
|
#
993182e5 |
|
14-Aug-2006 |
Alexander Leidinger <netchild@FreeBSD.org> |
- Change process_exec function handlers prototype to include struct image_params arg.
- Change struct image_params to include struct sysentvec pointer and initialize it.
- Change all consumers of process_exit/process_exec eventhandlers to new prototypes (includes splitting up into distinct exec/exit functions).
- Add eventhandler to userret.
Sponsored by: Google SoC 2006 Submitted by: rdivacky Parts suggested by: jhb (on hackers@)
|
#
51e37c7f |
|
02-Jun-2006 |
Doug Ambrisko <ambrisko@FreeBSD.org> |
Make lio ident more consistent with aio ident.
|
#
759cccca |
|
08-May-2006 |
David Xu <davidxu@FreeBSD.org> |
Use a dedicated mutex to protect aio queues; the motivation is to reduce lock contention with other parts.
|
#
dbbccfe9 |
|
23-Mar-2006 |
David Xu <davidxu@FreeBSD.org> |
1. Move code for scanning pending I/O from aio_fsync to aio_aqueue, it has less overhead. 2. Avoid scheduling task if maximum number of I/O threads is reached.
|
#
99eee864 |
|
23-Mar-2006 |
David Xu <davidxu@FreeBSD.org> |
Implement aio_fsync() syscall.
|
#
27b8220d |
|
25-Feb-2006 |
David Xu <davidxu@FreeBSD.org> |
1. Remove aio entry from lists earlier in aio_free_entry, so other threads can not see it if we unlock the proc lock (this can happen in knlist_delete). Don't do wakeup, it is not necessary.
2. Decrease kaio_buffer_count in biohelper rather than doing it in aio_bio_done_notify.
3. In aio_bio_done_notify, don't send notification if KAIO_RUNDOWN was set, because the process is already in single thread mode.
4. Use assignment to initialize aiothreadflags.
5. AIOCBLIST_RUNDOWN is not useful, axe the code using it.
6. Use LIO_NOP instead of zero.
|
#
ad8de0f2 |
|
21-Feb-2006 |
David Xu <davidxu@FreeBSD.org> |
If block size is zero, use normal file operations to do I/O; this eliminates a divide-by-zero fault. Recommended by: phk
|
#
6d53aa62 |
|
27-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Just like dofilewrite(), call bwillwrite before fo_write.
|
#
03d66b36 |
|
26-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
return final error code in aio_return rather than a hardcoded 0.
|
#
55a122bf |
|
26-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
in aio_aqueue, store same return code into job->_aiocb_private.error. in aio_return, unlock proc lock before suword.
|
#
1aa4c324 |
|
24-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Add locking annotation and comments about socket, pipe, fifo problem. Temporarily fix a locking problem for socket I/O.
|
#
e6bdc05f |
|
23-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Er, rescue a deleted comment line.
|
#
bd793be3 |
|
23-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
More cleanup for aio code: 1) unregister kqueue filter for EVFILT_LIO. 2) free uma_zones. 3) call setsid directly to enter another session rather than implementing by itself. Submitted by: jhb
|
#
7f34b521 |
|
23-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Add bracket.
|
#
68d71118 |
|
23-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Verify all supported notification types.
|
#
a9bf5e37 |
|
22-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
1) Merge _aio_aqueue and aio_aqueue, check quota in aio_aqueue, so that lio_listio won't exceed the quota. 2) Remove lio_ref_count, it is no longer used.
|
#
8c0d9af5 |
|
22-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Fix a bogus panic.
|
#
9b84335c |
|
22-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Decrease kaio_active_count first, because user process may go away after we notified it.
|
#
1ce91824 |
|
21-Jan-2006 |
David Xu <davidxu@FreeBSD.org> |
Make aio code MP safe.
|
#
8213baf0 |
|
14-Jan-2006 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Initialize ki to p->p_aioinfo after we know it's going to be referencing a valid kaioinfo structure. This avoids a potential NULL pointer dereference. Found with: Coverity Prevent(tm) MFC after: 2 weeks
|
#
af56abaa |
|
06-Jan-2006 |
John Baldwin <jhb@FreeBSD.org> |
Return error from fget_write() rather than hardcoding EBADF now that fget_write() DTRT. Requested by: bde
|
#
323fe565 |
|
08-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
In aio_waitcomplete, do not return EAGAIN if no other threads have started aio; instead, initialize the aio management structure if that hasn't been done yet. The reason for adjusting this behavior is to make it friendlier to threaded programs. Consider two threads: one submits aio_write, and the other just calls aio_waitcomplete to wait for any I/O to complete and recycle the aio requests. The recycler may want to wait in the kernel before the submitter has done any I/O. This also fixes an inconsistency with other aio syscalls.
|
#
2a522eb9 |
|
08-Nov-2005 |
John Baldwin <jhb@FreeBSD.org> |
Various and sundry cleanups: - Use curthread for calls to knlist_delete() and add a big comment explaining why as well as appropriate assertions. - Use TAILQ_FOREACH and TAILQ_FOREACH_SAFE instead of handrolling them. - Use fget() family of functions to lookup file objects instead of grovelling around in file descriptor tables. - Destroy the aio_freeproc mutex if we are unloaded. Tested on: i386
|
#
8f0371f1 |
|
04-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Fix a naming incompatibility with the POSIX standard: sigval_ptr and sigval_int really should be sival_ptr and sival_int. Also, sigev_notify_function accepts a union sigval value, not a pointer.
|
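The POSIX spellings named in the entry above can be illustrated with a short userland sketch. It assumes a current POSIX `<signal.h>` that provides SIGEV_THREAD and the `sigev_notify_function` and `sigev_notify_attributes` members; `on_complete()`, `make_thread_event()` and `last_seen` are hypothetical names used only here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static int last_seen;

/* The notify function takes a union sigval by value, not a pointer to one. */
static void
on_complete(union sigval value)
{
	last_seen = *(int *)value.sival_ptr;
}

/* Build a sigevent that carries a caller-supplied cookie in sival_ptr. */
static struct sigevent
make_thread_event(void *cookie)
{
	struct sigevent sev;

	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_THREAD;
	sev.sigev_notify_function = on_complete;
	sev.sigev_notify_attributes = NULL;
	sev.sigev_value.sival_ptr = cookie;	/* sival_ptr, not sigval_ptr */
	return (sev);
}
```

A structure built this way is what a caller would typically place in an aiocb's aio_sigevent member; the sketch stops short of issuing any I/O.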
#
4c0fb2cf |
|
02-Nov-2005 |
David Xu <davidxu@FreeBSD.org> |
Support sending realtime signal information via signal queue, realtime signal memory is pre-allocated, so kernel can always notify user code.
|
#
68a17869 |
|
01-Nov-2005 |
John Baldwin <jhb@FreeBSD.org> |
Push down Giant into fdfree() and remove it from two of the callers. Other callers such as some rfork() cases weren't locking Giant anyway. Reviewed by: csjp MFC after: 1 week
|
#
0972628a |
|
29-Oct-2005 |
David Xu <davidxu@FreeBSD.org> |
Fix sigevent's POSIX incompatibility by adding member fields sigev_notify_function and sigev_notify_attributes. AIO syscalls use sigevent, so they have to be adjusted. Reviewed by: alc
|
#
db43cd04 |
|
12-Oct-2005 |
Doug Ambrisko <ambrisko@FreeBSD.org> |
Fix tinderbox by removing incomplete/bad spl usage. Proper Giant-free locking is required for aio. Pointed out by: imp
|
#
69cd28da |
|
12-Oct-2005 |
Doug Ambrisko <ambrisko@FreeBSD.org> |
Add kqueue support to LIO event notification and fix how notifications were handled when LIO operations completed. These were the problems with LIO event completion notification:
- Move all LIO/AIO event notification into one general function so we don't have bugs in different data paths. This unification got rid of several notification bugs, one of which could send a SIGILL to the process when kqueue was used.
- Change the LIO event accounting to count all AIO requests, which could have been split across the fast path and daemon mode. The prior accounting only kept track of AIO ops in that mode and not the entire list of operations. This could cause a bogus LIO event completion notification to occur when all of the fast path AIO ops had completed but the AIO ops queued for the daemon had not.
Suggestions from: alc
|
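The accounting fix in the entry above amounts to keeping one counter per lio_listio() batch, regardless of which path services each request. A minimal sketch, assuming hypothetical names (`lio_group`, `lio_group_add()`, `lio_group_done()`, `enum io_path`) that do not come from the kernel source:

```c
#include <assert.h>
#include <stdbool.h>

/* The two ways a request may be serviced, per the commit message. */
enum io_path { PATH_FAST, PATH_DAEMON };

/* Hypothetical per-batch bookkeeping. */
struct lio_group {
	int outstanding;	/* requests not yet completed, on any path */
	int notifications;	/* completion events delivered so far */
};

/* Count every request in the batch, whichever path it takes. */
static void
lio_group_add(struct lio_group *g, enum io_path path)
{
	(void)path;		/* the path deliberately does not matter */
	g->outstanding++;
}

/* Returns true only when the last request of the batch completes. */
static bool
lio_group_done(struct lio_group *g)
{
	assert(g->outstanding > 0);
	if (--g->outstanding == 0) {
		g->notifications++;
		return (true);
	}
	return (false);
}
```

Because the counter ignores the path, finishing all fast-path requests cannot trigger the notification while a daemon-queued request is still pending.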
#
ec9c9e73 |
|
20-Jul-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate inconsistency in the setting of the B_DONE flag. Specifically, make the b_iodone callback responsible for setting it if it is needed. Previously, it was set unconditionally by bufdone() without holding whichever lock is shared by the b_iodone callback and the corresponding top-half function. Consequently, in a race, the top-half function could conclude that operation was done before the b_iodone callback finished. See, for example, aio_physwakeup() and aio_fphysio(). Note: I don't believe that the other, more widely-used b_iodone callbacks are affected. Discussed with: jeff Reviewed by: phk MFC after: 2 weeks
|
#
571dcd15 |
|
01-Jul-2005 |
Suleiman Souhlal <ssouhlal@FreeBSD.org> |
Fix the recent panics/LORs/hangs created by my kqueue commit by:
- Introducing the possibility of using locks different than mutexes for the knlist locking. In order to do this, we add three arguments to knlist_init() to specify the functions to use to lock, unlock and check if the lock is owned. If these arguments are NULL, we assume mtx_lock, mtx_unlock and mtx_owned, respectively.
- Using the vnode lock for the knlist locking, when doing kqueue operations on a vnode. This way, we don't have to lock the vnode while holding a mutex, in filt_vfsread.
Reviewed by: jmg Approved by: re (scottl), scottl (mentor override) Pointyhat to: ssouhlal Will be happy: everyone
|
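The pluggable locking described in the entry above can be sketched as a small vector of function pointers with mutex defaults. Everything prefixed `mock_` is a hypothetical userland stand-in; the field names and the argument order of `mock_knlist_init()` are assumptions and are not claimed to match the kernel's knlist_init().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a mutex; the kernel type is not available in userland. */
struct mock_mtx { int owned; };

/* Lock vector: three operations plus the object they act on. */
struct mock_knlist {
	void (*kl_lock)(void *);
	void (*kl_unlock)(void *);
	int (*kl_locked)(void *);
	void *kl_lockarg;
};

static void mock_mtx_lock(void *arg) { ((struct mock_mtx *)arg)->owned = 1; }
static void mock_mtx_unlock(void *arg) { ((struct mock_mtx *)arg)->owned = 0; }
static int mock_mtx_owned(void *arg) { return (((struct mock_mtx *)arg)->owned); }

/* NULL function arguments fall back to the mutex operations. */
static void
mock_knlist_init(struct mock_knlist *kl, void *lockarg,
    void (*lock)(void *), void (*unlock)(void *), int (*locked)(void *))
{
	assert(kl != NULL);
	kl->kl_lockarg = lockarg;
	kl->kl_lock = (lock != NULL) ? lock : mock_mtx_lock;
	kl->kl_unlock = (unlock != NULL) ? unlock : mock_mtx_unlock;
	kl->kl_locked = (locked != NULL) ? locked : mock_mtx_owned;
}

/* A second lock type, standing in for the vnode lock. */
struct mock_vnode { int lock_depth; };

static void mock_vop_lock(void *arg) { ((struct mock_vnode *)arg)->lock_depth++; }
static void mock_vop_unlock(void *arg) { ((struct mock_vnode *)arg)->lock_depth--; }
static int mock_vop_islocked(void *arg) { return (((struct mock_vnode *)arg)->lock_depth > 0); }
```

Callers that are happy with a mutex pass NULL three times, while a vnode-backed list passes its own three functions, so the list code never needs to know which lock type it is using.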
#
b490cc72 |
|
06-Jun-2005 |
Alan Cox <alc@FreeBSD.org> |
In lio_listio(2) change jobref from an int to a long so that lio_listio(LIO_WAIT, ...) works correctly on 64-bit architectures. Reviewed by: tegge
|
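Why an int can be too narrow on 64-bit architectures can be shown with a small round-trip check. The commit does not say what jobref holds, so this sketch assumes it carries a pointer-sized value; the helper names are hypothetical, and 8-bit bytes are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Does a pointer-sized value survive a round trip through int? */
static int
survives_int(uintptr_t v)
{
	return ((uintptr_t)(unsigned int)v == v);
}

/* Does it survive a round trip through long? */
static int
survives_long(uintptr_t v)
{
	return ((uintptr_t)(unsigned long)v == v);
}

/* A value with the top bit of a pointer-sized word set. */
static uintptr_t
high_value(void)
{
	return ((uintptr_t)1 << (sizeof(uintptr_t) * 8 - 1));
}
```

On an LP64 platform, where long and pointers are 64 bits wide and int is 32 bits, the value survives the trip through long but is truncated by int. On a 32-bit platform both conversions are lossless, which would explain why such a bug could stay hidden there.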
#
67b95a95 |
|
04-Jun-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate an unused field from struct aio_liojob.
|
#
bbe7bbdf |
|
04-Jun-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the original method of requesting notification of aio_read(2) and aio_write(2) completion through kevent(2). This method does not work on 64-bit architectures. It was deprecated in FreeBSD 4.4. See revisions 1.87 and 1.70.2.7. Change aio_physwakeup() to call psignal(9) directly rather than indirectly through a timeout(9). Discussed with: bde Correct a bug introduced in revision 1.65 that could result in premature delivery of a signal if an lio_listio(2) consisted of a mixture of direct/raw and queued I/O operations. Observed by: tegge Eliminate a field from struct kaioinfo that is now unused. Reviewed by: tegge
|
#
3769f562 |
|
02-Jun-2005 |
Alan Cox <alc@FreeBSD.org> |
Synchronize access to the per process aiocb lists in many of the functions.
|
#
e293dc86 |
|
02-Jun-2005 |
Alan Cox <alc@FreeBSD.org> |
In aio_waitcomplete() correct two cases of using an aiocb after freeing it.
|
#
3148c2c9 |
|
30-May-2005 |
Alan Cox <alc@FreeBSD.org> |
Synchronize access to aio_freeproc with a mutex. Eliminate related spl calls. Reduce the scope of Giant in aio_daemon().
|
#
3999ebe3 |
|
30-May-2005 |
Alan Cox <alc@FreeBSD.org> |
Use the proc mtx to prevent simultaneous changes to p_aioinfo.
|
#
82851350 |
|
30-May-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary calls to wakeup(); no one sleeps on &aio_freeproc. Eliminate an unused flag, AIOP_SCHED; it's cleared but never set.
|
#
95eca142 |
|
29-May-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate aio_activeproc; it's unused.
|
#
8484b5e6 |
|
29-May-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate aio_bufjobs; it's unused.
|
#
a230c79b |
|
30-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Acquire Giant in AIO's iodone routine. VFS will no longer do it for us soon. Sponsored by: Isilon Systems, Inc.
|
#
c4c44d29 |
|
17-Mar-2005 |
John-Mark Gurney <jmg@FreeBSD.org> |
fix aio+kq... I've been running ambrisko's test program for much longer w/o problems than I was before... This simply brings back the knote_delete as knlist_delete which will also drop the knote's, instead of just clearing the list and seeing _ONESHOT... Fix a race where if a note was _INFLUX and _DETACHED, it could end up being modified... whoopse.. MFC after: 1 week Prodded by: ambrisko and dwhite
|
#
5ece08f5 |
|
09-Feb-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make a SYSCTL_NODE static
|
#
9454b2d8 |
|
06-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
#
c5690651 |
|
04-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove buf->b_dev field.
|
#
6afb3b1c |
|
29-Oct-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Give dev_strategy() an explicit cdev argument in preparation for removing buf->b_dev. Put a bio between the buf passed to dev_strategy() and the device driver strategy routine in order to not clobber fields in the buf. Assert copyright on vfs_bio.c and update copyright message to canonical text. There is no legal difference between John Dyson's two-clause abbreviated BSD license and the canonical text.
|
#
5d9d81e7 |
|
26-Oct-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Put the I/O block size in bufobj->bo_bsize. We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably moot when filesystems sit on GEOM, so don't bother for now.
|
#
576c004f |
|
30-Sep-2004 |
Alfred Perlstein <alfred@FreeBSD.org> |
cover soreadable and sowriteable with the corresponding socket buffer locks.
|
#
1a52a73d |
|
23-Sep-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Eliminate DEV_STRATEGY() macro: call dev_strategy() directly. Make dev_strategy() handle errors and departing devices properly.
|
#
b6ac5828 |
|
02-Sep-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Tag AIO as requiring Giant over the network stack using NET_NEEDS_GIANT(). RELENG_5 candidate.
|
#
ad3b9257 |
|
15-Aug-2004 |
John-Mark Gurney <jmg@FreeBSD.org> |
Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)
|
#
ac77164d |
|
13-Aug-2004 |
John-Mark Gurney <jmg@FreeBSD.org> |
clean up whitespace...
|
#
1a276a3f |
|
26-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
- Use atomic ops for updating the vmspace's refcnt and exitingcnt.
- Push down Giant into shmexit(). (Giant is acquired only if the vmspace contains shm segments.)
- Eliminate the acquisition of Giant from proc_rwmem().
- Reduce the scope of Giant in exit1(), uncovering the destruction of the address space.
|
#
9535efc0 |
|
17-Jun-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Merge additional socket buffer locking from rwatson_netperf:
- Lock down low hanging fruit use of sb_flags with socket buffer lock.
- Lock down low hanging fruit use of so_state with socket lock.
- Lock down low hanging fruit use of so_options.
- Lock down low-hanging fruit use of sb_lowwat and sb_hiwat with socket buffer lock.
- Annotate situations in which we unlock the socket lock and then grab the receive socket buffer lock, which are currently actually the same lock. Depending on how we want to play our cards, we may want to coalesce these lock uses to reduce overhead.
- Convert a if()->panic() into a KASSERT relating to so_state in soaccept().
- Remove a number of splnet()/splx() references.
More complex merging of socket and socket buffer locking to follow.
|
#
77409fe1 |
|
30-May-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add missing #include <sys/module.h>
|
#
a5bdcb2a |
|
13-Mar-2004 |
Peter Wemm <peter@FreeBSD.org> |
Make the process_exit eventhandler run without Giant. Add Giant hooks in the two consumers that need it.. processes using AIO and netncp. Update docs. Say that process_exec is called with Giant, but not to depend on it. All our consumers can handle it without Giant.
|
#
00cbe31b |
|
15-Nov-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Send B_PHYS out to pasture, it no longer serves any function.
|
#
0eb3b7bb |
|
24-Oct-2003 |
John-Mark Gurney <jmg@FreeBSD.org> |
don't allow reading from files that haven't been open'd for reading.
|
#
a44ca4f0 |
|
21-Oct-2003 |
Hidetoshi Shimokawa <simokawa@FreeBSD.org> |
We need to initialize bp->b_offset and bp->b_iooffset because bp->b_blkno is ignored now.
|
#
8edbaf85 |
|
10-Sep-2003 |
Hidetoshi Shimokawa <simokawa@FreeBSD.org> |
Fix asynchronous physio breakage introduced in rev 1.163. We cannot use bp->b_caller2 because DEV_STRATEGY will overwrite it.
|
#
3b6d9652 |
|
22-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Add a f_vnode field to struct file. Several of the subtypes have an associated vnode which is used for stuff like the f*() functions. By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use. At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.
|
#
e725c18c |
|
16-Jun-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Get rid of the b_spc specialty field in struct buf by using an already available caller private field.
|
#
677b542e |
|
10-Jun-2003 |
David E. O'Brien <obrien@FreeBSD.org> |
Use __FBSDID().
|
#
104a9b7e |
|
29-Apr-2003 |
Alexander Kabaev <kan@FreeBSD.org> |
Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
#
cd4ed3b5 |
|
17-Apr-2003 |
John Baldwin <jhb@FreeBSD.org> |
- kthread's don't have p_textvp set to anything, so replace code that dealt with that possibility with a KASSERT(). - No need to set P_SYSTEM, kthread_create() does that for us.
|
#
ef38cda1 |
|
05-Apr-2003 |
Alan Cox <alc@FreeBSD.org> |
Don't reinitialize fields that are already initialized by getpbuf().
|
#
06363906 |
|
03-Apr-2003 |
Alan Cox <alc@FreeBSD.org> |
o Remove useracc() calls from aio_qphysio(); they are redundant given the checks performed by vmapbuf(). Reviewed by: tegge
|
#
75b8b3b2 |
|
24-Mar-2003 |
John Baldwin <jhb@FreeBSD.org> |
Replace the at_fork, at_exec, and at_exit functions with the slightly more flexible process_fork, process_exec, and process_exit eventhandlers. This reduces code duplication and also means that I don't have to go duplicate the eventhandler locking three more times for each of at_fork, at_exec, and at_exit. Reviewed by: phk, jake, almost complete silence on arch@
|
#
a163d034 |
|
18-Feb-2003 |
Warner Losh <imp@FreeBSD.org> |
Back out M_* changes, per decision of the TRB. Approved by: trb
|
#
44956c98 |
|
21-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
2d5c7e45 |
|
20-Jan-2003 |
Matthew Dillon <dillon@FreeBSD.org> |
Close the remaining user address mapping races for physical I/O, CAM, and AIO. Still TODO: streamline useracc() checks. Reviewed by: alc, tegge MFC after: 7 days
|
#
ac41f2ef |
|
13-Jan-2003 |
Alfred Perlstein <alfred@FreeBSD.org> |
style(9) fixes, mostly add parens around return arguments.
|
#
48e3128b |
|
12-Jan-2003 |
Matthew Dillon <dillon@FreeBSD.org> |
Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.
|
#
ae3b195f |
|
12-Jan-2003 |
Tim J. Robbins <tjr@FreeBSD.org> |
Allowing nent < 0 in aio_suspend() and lio_listio() is just asking for trouble. Return EINVAL instead.
|
#
44a2c818 |
|
12-Jan-2003 |
Tim J. Robbins <tjr@FreeBSD.org> |
Remove "XXX undocumented" comment from lio_listio().
|
#
cd72f218 |
|
11-Jan-2003 |
Matthew Dillon <dillon@FreeBSD.org> |
Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.
|
#
e2a3ea1c |
|
02-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove unused second argument from DEV_STRATEGY().
|
#
5590e7fd |
|
27-Dec-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Lock filedesc while performing a range check on the file descriptor. Reviewed by: alc
|
#
f51c1e89 |
|
16-Nov-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Rework the sysconf(3) interaction with aio:
sysconf.c: Use 'break' rather than 'goto yesno' in sysconf.c so that we report a '0' return value from the kernel sysctl.
vfs_aio.c: Make aio reset its configuration parameters to -1 after unloading instead of 0.
posix4_mib.c: Initialize the aio configuration parameters to -1 to indicate that it is not loaded. Add a facility (p31b_iscfg()) to determine if a posix4 facility has been initialized to avoid having to re-order the SYSINITs. Use p31b_iscfg() to determine if aio has had a chance to run yet which is likely if it is compiled into the kernel and avoid spamming its values. Introduce a macro P31B_VALID() instead of doing the same comparison over and over.
posix4.h: Prototype p31b_iscfg().
|
#
86d52125 |
|
15-Nov-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Export the values for _SC_AIO_MAX and _SC_AIO_PRIO_DELTA_MAX via the p1003b sysctl interface.
|
#
c844abc9 |
|
15-Nov-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Call 'p31b_setcfg(CTL_P1003_1B_AIO_LISTIO_MAX, AIO_LISTIO_MAX)' when AIO is initialized so that sysconf() gives correct results. Reported by: Craig Rodrigues <rodrigc@attbi.com>
|
#
f8f750c5 |
|
07-Nov-2002 |
Robert Watson <rwatson@FreeBSD.org> |
Do a bit more work in the aio code to simulate the credential environment of the original AIO request: save and restore the active thread credential as well as using the file credential, since MAC (and some other bits of the system) rely on the thread credential instead of/as well as the file credential. In brief: cache td->td_ucred when the AIO operation is queued, temporarily set and restore the kernel thread credential, and release the credential when done. Similar to ktrace credential management. Reviewed by: alc Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
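The credential handling summarized in the entry above (cache at queue time, temporarily set, restore, release) can be sketched in userland C. The `mock_` types and functions are hypothetical stand-ins for the kernel's thread and credential structures, and the reference count is a plain integer rather than the kernel's credential reference helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a credential, a thread and a queued AIO job. */
struct mock_cred { int uid; int refs; };
struct mock_thread { struct mock_cred *td_ucred; };
struct mock_job { struct mock_cred *cred; };

/* At queue time: cache the submitter's credential and take a reference. */
static void
mock_job_queue(struct mock_job *job, struct mock_thread *submitter)
{
	job->cred = submitter->td_ucred;
	job->cred->refs++;
}

/* Example operation that consults the running thread's credential. */
static int
mock_effective_uid(struct mock_thread *td)
{
	return (td->td_ucred->uid);
}

/* In the worker: borrow the cached credential only for the operation. */
static int
mock_job_run(struct mock_job *job, struct mock_thread *worker,
    int (*op)(struct mock_thread *))
{
	struct mock_cred *saved = worker->td_ucred;
	int result;

	assert(job->cred != NULL);
	worker->td_ucred = job->cred;	/* temporarily assume cached cred */
	result = op(worker);
	worker->td_ucred = saved;	/* restore the worker's own cred */
	job->cred->refs--;		/* release the queue-time reference */
	job->cred = NULL;
	return (result);
}
```

Code that checks the thread credential during the operation sees the original submitter's identity, and the worker's own credential is back in place before the function returns.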
#
c7047e52 |
|
27-Oct-2002 |
Garrett Wollman <wollman@FreeBSD.org> |
Change the way support for asynchronous I/O is indicated to applications to conform to 1003.1-2001. Make it possible for applications to actually tell whether or not asynchronous I/O is supported. Since FreeBSD's aio implementation works on all descriptor types, don't call down into file or vnode ops when [f]pathconf() is asked about _PC_ASYNC_IO; this avoids the need for every file and vnode op to know about it.
|
#
6d345e2a |
|
18-Oct-2002 |
John Baldwin <jhb@FreeBSD.org> |
fdfree() clears p_fd for us, no need to do it again.
|
#
4d752b01 |
|
13-Oct-2002 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the unnecessary clearing of flag bits that are already clear in lio_listio(2).
|
#
316ec49a |
|
02-Oct-2002 |
Scott Long <scottl@FreeBSD.org> |
Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates. Reviewed by: jake, peter, jhb
|
#
4a6a94d8 |
|
22-Aug-2002 |
Archie Cobbs <archie@FreeBSD.org> |
Replace (ab)uses of "NULL" where "0" is really meant.
|
#
0a179f80 |
|
22-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Remove the AIOCBLIST_ASYNCFREE flag and related code. It's never set. Submitted by: Romer Gil <rgil@cs.rice.edu>
|
#
ad49abc0 |
|
11-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Make a correction to the last change: In aio_cancel(2) return AIO_ALLDONE instead of EINVAL if p->p_aioinfo is NULL.
|
#
b6c1f1ef |
|
10-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o In aio_cancel(2), make sure that p->p_aioinfo isn't NULL before dereferencing it. Submitted by: saureen <sshah@apple.com>
|
#
b46f1c55 |
|
06-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
Set the ident field of the struct kevent that is registered by _aio_aqueue() to the address of the user's aiocb rather than the kernel's aiocb. (In other words, prior to this change, the ident field returned by kevent(2) on completion of an AIO was effectively garbage.) Submitted by: Romer Gil <rgil@cs.rice.edu>
|
#
20fb589d |
|
05-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o The introduction of kevent() broke lio_listio(): _aio_aqueue() thought that LIO_READ and LIO_WRITE were requests for kevent()-based notification of completion. Modify _aio_aqueue() to recognize LIO_READ and LIO_WRITE. Notes: (1) The patch provided by the PR perpetuates a second bug in this code, a direct access to user-space memory. This change fixes that bug as well. (2) This change is to code that implements a deprecated interface. It should probably be removed after an MFC. PR: kern/39556
|
#
4cc20ab1 |
|
31-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Back out my last commit of locking down a socket; it conflicts with hsu's work. Requested by: hsu
|
#
a739e09c |
|
25-May-2002 |
Alan Cox <alc@FreeBSD.org> |
o Remove some unnecessary casting from and add some necessary casting to aio_suspend() and lio_listio(). Submitted by: bde
|
#
34e3110c |
|
24-May-2002 |
Peter Wemm <peter@FreeBSD.org> |
Fix warnings. Also, removed an unused variable that I found that was just initialized and never used afterwards.
|
#
243917fe |
|
19-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each member in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred
|
#
6041fa0a |
|
03-May-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
As malloc(9) and free(9) are now Giant-free, remove the Giant lock across malloc(9) and free(9) of a pgrp or a session.
|
#
1c2451c2 |
|
19-Apr-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Push down Giant for setpgid(), setsid() and aio_daemon(). Giant protects only malloc(9) and free(9).
|
#
ba626c1d |
|
16-Apr-2002 |
John Baldwin <jhb@FreeBSD.org> |
Lock proctree_lock instead of pgrpsess_lock.
|
#
00e73160 |
|
13-Apr-2002 |
Alan Cox <alc@FreeBSD.org> |
o Use aiocblist::fd_file in the AIO threads rather than recomputing the file * from the calling process's descriptor table. o Eliminate sharing of the calling process's descriptor table with the AIO threads.
|
#
c0bf5caa |
|
07-Apr-2002 |
Alan Cox <alc@FreeBSD.org> |
Restructure aio_return() to eliminate duplicated code and facilitate Giant push down.
|
#
ae124fc4 |
|
07-Apr-2002 |
Alan Cox <alc@FreeBSD.org> |
Reduce the duplication of code for error handling in _aio_aqueue().
|
#
63a4964e |
|
06-Apr-2002 |
Alan Cox <alc@FreeBSD.org> |
Change jobref and *ijoblist from int to long in order to avoid a catastrophe after the 2^32nd AIO operation on 64-bit architectures.
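A minimal sketch of the wrap-around this change guards against. The helper names are hypothetical, and fixed-width unsigned types are used as an assumption so that the wrap is well defined in C; the kernel fields themselves were `int` and `long`.

```c
#include <stdint.h>

/* A 32-bit job reference: the increment after the 2^32nd job wraps to 0. */
static uint32_t next_ref32(uint32_t ref)
{
    return ref + 1;
}

/* A 64-bit job reference keeps counting past 2^32. */
static uint64_t next_ref64(uint64_t ref)
{
    return ref + 1;
}
```

With a 32-bit counter, a new job could be handed a reference that an old, still-outstanding job already holds; the wider counter pushes that collision out of practical reach.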
|
#
9b16adc1 |
|
03-Apr-2002 |
Alan Cox <alc@FreeBSD.org> |
o aio_process needn't fhold()/fdrop() the fp now that _aio_aqueue() and aio_free_entry() do this. o Remove two unnecessary/unused variables from aio_process() and one field from aiocblist.
|
#
a5c0b1c0 |
|
31-Mar-2002 |
Alan Cox <alc@FreeBSD.org> |
Keep the reference to the file acquired in _aio_aqueue() until the operation completes. The reference is released in aio_free_entry(). Submitted by: tegge
|
#
ee99e978 |
|
25-Mar-2002 |
Bruce Evans <bde@FreeBSD.org> |
Added used include of <sys/sx.h>. Don't depend on namespace pollution in <sys/file.h> or <sys/socketvar.h>.
|
#
c897b813 |
|
19-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.
|
#
eb8e6d52 |
|
05-Mar-2002 |
Eivind Eklund <eivind@FreeBSD.org> |
Document all functions, global and static variables, and sysctls. Includes some minor whitespace changes, and re-ordering to be able to document properly (e.g., grouping of variables and the SYSCTL macro calls for them, where the documentation has been added.) Reviewed by: phk (but all errors are mine)
|
#
f591779b |
|
23-Feb-2002 |
Seigo Tanimura <tanimura@FreeBSD.org> |
Lock struct pgrp, session and sigio. New locks are: - pgrpsess_lock which locks the whole pgrps and sessions, - pg_mtx which protects the pgrp members, and - s_mtx which protects the session members. Please refer to sys/proc.h for the coverage of these locks. Changes on the pgrp/session interface: - pgfind() needs the pgrpsess_lock held. - The caller of enterpgrp() is responsible to allocate a new pgrp and session. - Call enterthispgrp() in order to enter an existing pgrp. - pgsignal() requires a pgrp lock held. Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)
|
#
9fbd7ccf |
|
12-Feb-2002 |
Alan Cox <alc@FreeBSD.org> |
o Clearing p/td_retval[0] after aio_newproc() is unnecessary. (We stopped calling rfork() to create aio threads in revision 1.46.) o Don't recompute the FILE * when it's already stored in the kernel's AIOCB.
|
#
079b7bad |
|
07-Feb-2002 |
Julian Elischer <julian@FreeBSD.org> |
Pre-KSE/M3 commit. This is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embedded there but remove all the code that assumes that in preparation for the next commit which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
|
#
c3869e4b |
|
20-Jan-2002 |
Alan Cox <alc@FreeBSD.org> |
o Remove the unused vestiges of JOBST_JOBQPROC and the per-thread jobtorun queue. o Use TAILQ_EMPTY() instead of TAILQ_FIRST(...) == NULL.
|
#
12f63f17 |
|
19-Jan-2002 |
Alan Cox <alc@FreeBSD.org> |
o Revision 1.99 ("KSE Milestone 2") left the aio daemons sleeping on a process object but changed the corresponding wakeup()s to the thread object. The result was that non-raw aio ops waited for an aio daemon to timeout before action was taken. Now, we sleep on the thread object. PR: kern/34016
|
#
825ce531 |
|
17-Jan-2002 |
Alan Cox <alc@FreeBSD.org> |
o Eliminate an unused parameter from aio_fphysio().
|
#
c6c191b2 |
|
14-Jan-2002 |
Alan Cox <alc@FreeBSD.org> |
o Correct the initialization of aiolio_zone: Each entry was 16 times larger than necessary. o Move a rarely-used goto label inside a critical section so that we don't perform an splnet() for which there is no corresponding splx(). o Remove unnecessary splnet()/splx() around accesses to kaioinfo::kaio_jobdone in aio_return(). o Use TAILQ_FOREACH for simple cases of iteration over kaioinfo::kaio_jobdone.
|
#
7d17bbd0 |
|
08-Jan-2002 |
Alan Cox <alc@FreeBSD.org> |
o Correct a 32/64-bit error in the initialization of aiol_zone, specifically, sizeof(int) is not the size of a pointer.
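The class of error fixed here can be sketched as follows. The function names are hypothetical and the actual zone initialization is not reproduced; the sketch only contrasts sizing a list of pointers by `sizeof(int)` with sizing it by the pointer size.

```c
#include <stddef.h>

/* Buggy sizing: assumes a pointer occupies the same space as an int. */
static size_t list_bytes_wrong(size_t nentries)
{
    return nentries * sizeof(int);
}

/* Correct sizing: each entry holds a pointer. */
static size_t list_bytes_right(size_t nentries)
{
    return nentries * sizeof(void *);
}
```

On a 32-bit platform the two results typically coincide, which is why the mistake goes unnoticed until the code runs where pointers are 64 bits wide and the buffer is only half as large as needed.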
|
#
48dac059 |
|
06-Jan-2002 |
Alan Cox <alc@FreeBSD.org> |
o Add missing synchronization (splnet()/splx()) in aio_free_entry(). o Move the definition of struct aiocblist from sys/aio.h to kern/vfs_aio.c. o Make aio_swake_cb() static.
|
#
23f13943 |
|
02-Jan-2002 |
Alan Cox <alc@FreeBSD.org> |
o Properly check the file descriptor passed to aio_cancel(2). (Previously, no out-of-bounds check was performed on the file descriptor.) o Eliminate some excessive white space from aio_cancel(2).
|
#
eae43d0e |
|
31-Dec-2001 |
Alan Cox <alc@FreeBSD.org> |
o Some style(9)-motivated changes to white space.
|
#
5ca50a4b |
|
30-Dec-2001 |
Alan Cox <alc@FreeBSD.org> |
o Correct an off-by-one error in aio_suspend(2). PR: 18350
|
#
516d2564 |
|
30-Dec-2001 |
Alan Cox <alc@FreeBSD.org> |
o Use "td->td_proc" instead of "curproc" where possible. o Eliminate the unnecessary initialization of several static variables to zero.
|
#
21d56e9c |
|
29-Dec-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Make AIO a loadable module. Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO will use at_exit(9). Add functions at_exec(9), rm_at_exec(9) which function nearly the same as at_exit(9) and rm_at_exit(9), these functions are called on behalf of modules at the time of execve(2) after the image activator has run. Use a modified version of tegge's suggestion via at_exec(9) to close an exploitable race in AIO. Fix SYSCALL_MODULE_HELPER such that it's architecturally neutral, the problem was that one had to pass it a parameter indicating the number of arguments which were actually the number of "int". Fix it by using an inline version of the AS macro against the syscall arguments. (AS should be available globally but we'll get to that later.) Add a primitive system for dynamically adding kqueue ops, it's really not as sophisticated as it should be, but I'll discuss with jlemon when he's around.
|
#
604035c5 |
|
09-Dec-2001 |
Alan Cox <alc@FreeBSD.org> |
o Eliminate compilation warnings on 64-bit architectures.
|
#
91369fc7 |
|
09-Dec-2001 |
Alan Cox <alc@FreeBSD.org> |
o Eliminate unnecessary synchronization from filt_aiodetach(). o The manual page for kevent says that EVFILT_AIO returns under the same conditions as aio_error(). With that in mind, set the data field of the returned struct kevent to the value that would be returned by aio_error(). o Fix two compilation warnings.
|
#
43150722 |
|
05-Oct-2001 |
John Baldwin <jhb@FreeBSD.org> |
The aio kthreads start off with a root credential just like all other kthreads, so don't malloc a ucred just so we can create a duplicate of the one we already have.
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
|
#
2f3cf918 |
|
18-Apr-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Check validity of signal callback requested via aio routines. Also move the insertion of the request to after the request is validated; it still looks like there may be some problems if an invalid address is passed to the aio routines, basically a possible leak or having a not completely initialized structure on the queue may still be possible. A new sig macro was made _SIG_VALID to check the validity of a signal, it would be advisable to use it from now on (in kern/kern_sig.c) rather than rolling your own. PR: kern/17152
|
#
13644654 |
|
10-Mar-2001 |
Alan Cox <alc@FreeBSD.org> |
When aio_read/write() is used on a raw device, physical buffers are used for up to "vfs.aio.max_buf_aio" of the requests. If a request size is MAXPHYS, but the request base isn't page aligned, vmapbuf() will map the end of the user space buffer into the start of the kva allocated for the next physical buffer. Don't use a physical buffer in this case. (This change addresses problem report 25617.) When an aio_read/write() on a raw device has completed, timeout() is used to schedule a signal to the process. Thus, the reporting is delayed up to 10 ms (assuming hz is 100). The process might have terminated in the meantime, causing a trap 12 when attempting to deliver the signal. Thus, the timeout must be cancelled when removing the job. aio jobs in state JOBST_JOBQGLOBAL should be removed from the kaio_jobqueue list during process rundown. During process rundown, some aio jobs might move from one list to a different list that has already been "emptied", causing the rundown to be incomplete. Retry the rundown. A call to BUF_KERNPROC() is needed after obtaining a physical buffer to disassociate the lock from the running process since it can return to userland without releasing that lock. PR: 25617 Submitted by: tegge
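The first problem above comes down to page arithmetic: a maximum-sized request that does not start on a page boundary touches one more page than the physical buffer's mapping window provides. A minimal sketch, assuming a 4 KiB page and a 128 KiB maximum transfer purely for illustration (the `_SKETCH` constants and both helpers are hypothetical, not the kernel's definitions):

```c
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u             /* assumed page size */
#define MAXPHYS_SKETCH   (128u * 1024u)    /* assumed maximum transfer size */

/* Number of pages touched by the byte range [base, base + len), len > 0. */
static unsigned pages_spanned(uintptr_t base, unsigned len)
{
    uintptr_t first = base / PAGE_SIZE_SKETCH;
    uintptr_t last = (base + len - 1) / PAGE_SIZE_SKETCH;

    return (unsigned)(last - first + 1);
}

/* A request may use a physical buffer only if it fits the mapping window. */
static int fits_physbuf(uintptr_t base, unsigned len)
{
    return len > 0 &&
        pages_spanned(base, len) <= MAXPHYS_SKETCH / PAGE_SIZE_SKETCH;
}
```

Under these assumptions an aligned 128 KiB request spans 32 pages and fits, while the same length starting 512 bytes into a page spans 33 and must take the slower path instead.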
|
#
c9a970a7 |
|
08-Mar-2001 |
Alan Cox <alc@FreeBSD.org> |
Use the kthread API to create and destroy AIO daemons. Submitted by: jhb
|
#
19eb87d2 |
|
06-Mar-2001 |
John Baldwin <jhb@FreeBSD.org> |
Grab the process lock while calling psignal and before calling psignal.
|
#
9c8a2647 |
|
06-Mar-2001 |
Alan Cox <alc@FreeBSD.org> |
Add a missing splx() to aio_fphysio(). (This change is a no-op in -5.0, but potentially significant in -4.x.) Eliminate a pointless parameter to aio_fphysio(). Remove unnecessary casts from aio_fphysio() and aio_physwakeup().
|
#
88ed460e |
|
04-Mar-2001 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the aio_freejobs list. Its purpose was to store free aiocb's allocated by zalloc(). In other words, zfree() was never called. Now, we call zfree(). Why eliminate this micro-optimization? At some later point, when we multithread the AIO system, we would need a mutex to synchronize access to aio_freejobs, making its use nearly indistinguishable in cost from zalloc() and zfree(). Remove unnecessary fhold() and fdrop() calls from aio_qphysio(), undoing a part of revision 1.86. The reference count on the file structure is already incremented by _aio_aqueue() before it calls aio_qphysio(). (Update the comments to document this fact.) Remove unnecessary casts from _aio_aqueue(), aio_read(), aio_write() and aio_waitcomplete(). Remove an unnecessary "return;" from aio_process(). Add "static" in various places.
|
#
fb579e9a |
|
03-Mar-2001 |
Alan Cox <alc@FreeBSD.org> |
Remove the field privatemodes from struct __aiocb_private and the related code from aio_read() and aio_write(). This field was intended, but never used, to allow a mythical user-level library to make an aio_read() or aio_write() behave like an ordinary read() or write(), i.e., a blocking I/O operation.
|
#
9ed346ba |
|
08-Feb-2001 |
Bosko Milekic <bmilekic@FreeBSD.org> |
Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarly, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
|
#
f09deb69 |
|
06-Feb-2001 |
Jeroen Ruigrok van der Werven <asmodai@FreeBSD.org> |
Fix typo: wierd -> weird. There is no such thing as wierd in the english language.
|
#
37d40066 |
|
04-Feb-2001 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Another round of the <sys/queue.h> FOREACH transmogriffer. Created with: sed(1) Reviewed by: md5(1)
|
#
86360fee |
|
01-Dec-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup from struct proc, which are now unused (p_nthread already was). Remove process flag P_KTHREADP which was untested and only set in vfs_aio.c (it should use kthread_create). Move the yield system call to kern_synch.c as kern_threads.c has been removed completely. moral support from: alfred, jhb
|
#
c6fa9f78 |
|
21-Nov-2000 |
Alan Cox <alc@FreeBSD.org> |
Provide a new interface for the user of aio_read() and aio_write() to request a kevent upon completion of the I/O. Specifically, introduce a new type of sigevent notification, SIGEV_EVENT. If sigev_notify is SIGEV_EVENT, then sigev_notify_kqueue names the kqueue that should receive the event and sigev_value contains the "void *" that is copied into the kevent's udata field. In contrast to the existing interface, this one: 1) works on the Alpha 2) avoids the extra copyin() call for the kevent because all of the information needed is in the sigevent and 3) could be applied to request a single kevent upon completion of an entire lio_listio(). Reviewed by: jlemon
|
#
279d7226 |
|
18-Nov-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>
|
#
39b2b25f |
|
29-Oct-2000 |
Alan Cox <alc@FreeBSD.org> |
_aio_aqueue(): Change kevent registration to use its own struct file pointer. Otherwise, aio_read() and aio_write() on sockets are broken if a kevent is registered. (The code after kevent registration for handling sockets assumes that the struct file pointer "fp" still refers to the socket, not the kqueue.)
|
#
35e0e5b3 |
|
20-Oct-2000 |
John Baldwin <jhb@FreeBSD.org> |
Catch up to moving headers: - machine/ipl.h -> sys/ipl.h - machine/mutex.h -> sys/mutex.h
|
#
b92bb032 |
|
26-Sep-2000 |
Alan Cox <alc@FreeBSD.org> |
aio_qphysio: Eliminate one instance of an out-of-range check that is performed twice. Eliminate initialization that is already performed by _aio_aqueue. aio_physwakeup: Eliminate redundant synchronization that is already performed by bufdone.
|
#
621dbe43 |
|
16-Sep-2000 |
Bruce Evans <bde@FreeBSD.org> |
Added used include of <sys/mutex.h> (don't depend on pollution in <sys/signalvar.h>).
|
#
a93a7807 |
|
10-Sep-2000 |
John Baldwin <jhb@FreeBSD.org> |
aio processes need to have the Giant mutex before doing work. Submitted by: tegge
|
#
f535380c |
|
05-Sep-2000 |
Don Lewis <truckman@FreeBSD.org> |
Remove uidinfo hash table lookup and maintenance out of chgproccnt() and chgsbsize(), which are called rather frequently and may be called from an interrupt context in the case of chgsbsize(). Instead, do the hash table lookup and maintenance when credentials are changed, which is a lot less frequent. Add pointers to the uidinfo structures to the ucred and pcred structures for fast access. Pass a pointer to the credential to chgproccnt() and chgsbsize() instead of passing the uid. Add a reference count to the uidinfo structure and use it to decide when to free the structure rather than freeing the structure when the resource consumption drops to zero. Move the resource tracking code from kern_proc.c to kern_resource.c. Move some duplicate code sequences in kern_prot.c to separate helper functions. Change KASSERTs in this code to unconditional tests and calls to panic().
|
#
b70158ba |
|
04-Sep-2000 |
Alan Cox <alc@FreeBSD.org> |
Make filt_aio() check the jobstate for JOBST_JOBBFINISHED (in addition to JOBST_JOBFINISHED) in case the aio_read() or aio_write() was performed via the high-performance physio method, i.e., aio_qphysio().
|
#
5dec52ba |
|
28-Jul-2000 |
Peter Wemm <peter@FreeBSD.org> |
Fix the #ifdef VFS_AIO to not compile a whole bunch of unused stuff in the !VFS_AIO case. Lots of things have hooks into here (kqueue, exit(), sockets, etc), I elected to keep the external interfaces the same rather than spread more #ifdefs around the kernel.
|
#
e3975643 |
|
25-May-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others
|
#
740a1973 |
|
23-May-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Change the way that the queue(3) structures are declared; don't assume that the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
|
#
9626b608 |
|
05-May-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes. Disk drivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
|
#
cb679c38 |
|
16-Apr-2000 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Introduce kqueue() and kevent(), a kernel event notification facility.
|
#
c244d2de |
|
02-Apr-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.
|
#
b99c307a |
|
20-Mar-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Rename the existing BUF_STRATEGY() to DEV_STRATEGY() substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.
|
#
21144e3b |
|
20-Mar-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch was machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.
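Why a zero-valued flag invites mistakes, and what an "exactly one bit set" check looks like, can be sketched as below. The enumerator values and helper names are illustrative assumptions, not the kernel's definitions.

```c
/* Old style (illustrative values): "write" is the absence of a bit. */
enum { OLD_B_WRITE_SKETCH = 0x0, OLD_B_READ_SKETCH = 0x1 };

/* New style (illustrative values): every command has its own bit. */
enum { CMD_READ_SKETCH = 0x1, CMD_WRITE_SKETCH = 0x2, CMD_DELETE_SKETCH = 0x4 };

/* The natural-looking test against a zero-valued flag can never be true. */
static int old_is_write(unsigned flags)
{
    return (flags & OLD_B_WRITE_SKETCH) != 0;
}

/* Non-zero power of two: exactly one command bit is set. */
static int exactly_one_bit(unsigned cmd)
{
    return cmd != 0 && (cmd & (cmd - 1)) == 0;
}
```

With distinct bits, both an uninitialized command (zero) and a contradictory one (two bits) fail the same cheap check, which is the kind of enforcement the message describes.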
|
#
dd85920a |
|
23-Feb-2000 |
Jason Evans <jasone@FreeBSD.org> |
Add the VFS_AIO config option and leave it off by default. Unless the VFS_AIO option is specified, all aio-related syscalls return ENOSYS. The aio code is very fragile right now, and is unsuitable for default inclusion in a production shell box. Approved by: jkh
|
#
b7592c7b |
|
20-Jan-2000 |
Jason Evans <jasone@FreeBSD.org> |
Back out the previous spl change, since it opens a race window. Reviewed by: alfred, dillon, peter
|
#
60ffb019 |
|
19-Jan-2000 |
Jason Evans <jasone@FreeBSD.org> |
Don't tsleep() while at splbio(). Correctly return EINPROGRESS from aio_error() even when an aio request is still in the socket queue. Submitted by: Adrian Chadd <adrian@bofh.co.uk>
|
#
f582ac06 |
|
17-Jan-2000 |
Brian Feldman <green@FreeBSD.org> |
Fix vn_isdisk() usage to make AIO work on non-disk-files again, rather than just return ENOTBLK. PR: 16163 Submitted by: Adrian Chadd <adrian@FreeBSD.org>
|
#
bfbbc4aa |
|
13-Jan-2000 |
Jason Evans <jasone@FreeBSD.org> |
Add aio_waitcomplete(). Make aio work correctly for socket descriptors. Make gratuitous style(9) fixes (me, not the submitter) to make the aio code more readable. PR: kern/12053 Submitted by: Chris Sedore <cmsedore@maxwell.syr.edu>
|
#
ba4ad1fc |
|
09-Jan-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Give vn_isdisk() a second argument where it can return a suitable errno. Suggested by: bde
|
#
38224dcd |
|
22-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Convert various pieces of code to use vn_isdisk() rather than checking for vp->v_type == VBLK. In ccd: we don't need to call VOP_GETATTR to find the type of a vnode. Reviewed by: sos
|
#
008626c3 |
|
07-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Simplify and de-bogotify check for raw disk.
|
#
02c58685 |
|
30-Oct-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the "rw" argument, rather than hijacking B_{READ|WRITE}. Fix two bugs (physio & cam) resulting from the confusion caused by this. Submitted by: Tor.Egge@fast.no Reviewed by: alc, ken (partly)
|
#
d1f088da |
|
11-Oct-1999 |
Peter Wemm <peter@FreeBSD.org> |
Trim unused options (or #ifdef for undoc options). Submitted by: phk
|
#
13ccadd4 |
|
19-Sep-1999 |
Brian Feldman <green@FreeBSD.org> |
This is what was "fdfix2.patch," a fix for fd sharing. It's pretty far-reaching in fd-land, so you'll want to consult the code for changes. The biggest change is that now, you don't use fp->f_ops->fo_foo(fp, bar) but instead fo_foo(fp, bar), which increments and decrements the fp refcount upon entry and exit. Two new calls, fhold() and fdrop(), are provided. Each does what it seems like it should, and if fdrop() brings the refcount to zero, the fd is freed as well. Thanks to peter ("to hell with it, it looks ok to me.") for his review. Thanks to msmith for keeping me from putting locks everywhere :) Reviewed by: peter
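The hold-around-the-operation pattern described above can be modeled as a small sketch. The types and functions (`file_sketch`, `hold_sketch`, `drop_sketch`, `call_op`) are hypothetical userland stand-ins, not the kernel's fhold()/fdrop() or fo_*() implementations.

```c
/* Hypothetical file object with a reference count. */
struct file_sketch {
    int refs;
    int freed;
};

typedef int (*file_op_sketch)(struct file_sketch *);

static void hold_sketch(struct file_sketch *fp)
{
    fp->refs++;
}

/* Dropping the last reference frees the object (modeled by a flag). */
static void drop_sketch(struct file_sketch *fp)
{
    if (--fp->refs == 0)
        fp->freed = 1;
}

/* Wrapper in the spirit of fo_foo(fp, bar): hold, call, drop. */
static int call_op(struct file_sketch *fp, file_op_sketch op)
{
    int result;

    hold_sketch(fp);
    result = op(fp);
    drop_sketch(fp);
    return result;
}

/*
 * An operation during which the descriptor's own reference goes away,
 * simulating a close() racing with the call. It reports 0 if the file
 * is still alive at that point and -1 if it has already been freed.
 */
static int op_with_racing_close(struct file_sketch *fp)
{
    drop_sketch(fp);
    return fp->freed ? -1 : 0;
}
```

Because the wrapper holds its own reference, the racing close only brings the count from 2 to 1; the object is released when the wrapper drops its reference after the operation returns.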
|
#
c3aac50f |
|
27-Aug-1999 |
Peter Wemm <peter@FreeBSD.org> |
$Id$ -> $FreeBSD$
|
#
49ff4deb |
|
14-Aug-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Spring cleaning around strategy and disklabels/slices: Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout. please see comment in sys/conf.h about the flag argument. Remove strategy argument from all the diskslice/label/bad144 implementations, it should be found from the dev_t. Remove bogus and unused strategy1 routines. Remove open/close arguments from dssize(). Pick them up from dev_t. Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
|
#
4d4f9323 |
|
13-Aug-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
s/v_specinfo/v_rdev/
|
#
0ef1c826 |
|
08-Aug-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>. Add a few fields to struct specinfo, paving the way for the fun part.
|
#
9c8b8baa |
|
01-Jul-1999 |
Peter Wemm <peter@FreeBSD.org> |
Slight reorganization of kernel thread/process creation. Instead of using SYSINIT_KT() etc (which is a static, compile-time procedure), use a NetBSD-style kthread_create() interface. kproc_start is still available as a SYSINIT() hook. This allowed simplification of chunks of the sysinit code in the process. This kthread_create() is our old kproc_start internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work the same as the NetBSD one. One thing I'd like to do shortly is get rid of nfsiod as a user initiated process. It makes sense for the nfs client code to create them on the fly as needed up to a user settable limit. This means that nfsiod doesn't need to be in /sbin and is always "available". This is a fair bit easier to do outside of the SYSINIT_KT() framework.
|
#
df8abd0b |
|
30-Jun-1999 |
Peter Wemm <peter@FreeBSD.org> |
Slight tweak to fork1() calling conventions. Add a third argument so the caller can easily find the child proc struct. fork(), rfork() etc syscalls set p->p_retval[] themselves. Simplify the SYSINIT_KT() code and other kernel thread creators to not need to use pfind() to find the child based on the pid. While here, partly tidy up some of the fork1() code for RF_SIGSHARE etc.
|
#
67812eac |
|
25-Jun-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
|
#
6fcd8a7c |
|
01-Jun-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Introduce the makebdev() function, it does the same as the makedev() function for now, but that will change.
|
#
0a346dab |
|
09-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
major(something) can never become NODEV.
|
#
4be2eb8c |
|
08-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
I got tired of seeing all the cdevsw[major(foo)] all over the place. Made a new (inline) function devsw(dev_t dev) and substituted it. Changed the BDEV variant to this format as well: bdevsw(dev_t dev) DEVFS will eventually benefit from this change too.
|
#
b0eeea20 |
|
06-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
remove b_proc from struct buf, it's (now) unused. Reviewed by: dillon, bde
|
#
d5558c00 |
|
06-May-1999 |
Peter Wemm <peter@FreeBSD.org> |
Fix up a few easy 'assignment used as truth value' and 'suggest parens around && within ||' type warnings. I'm pretty sure I have not masked any problems here, I've committed real problem fixes separately.
|
#
5206bca1 |
|
27-Apr-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Enable vmspace sharing on SMP. Major changes are, - %fs register is added to trapframe and saved/restored upon kernel entry/exit. - Per-cpu pages are no longer mapped at the same virtual address. - Each cpu now has a separate gdt selector table. A new segment selector is added to point to per-cpu pages, per-cpu global variables are now accessed through this new selector (%fs). The selectors in gdt table are rearranged for cache line optimization. - fast_vfork is now on as default for both UP and SMP. - Some aio code cleanup. Reviewed by: Alan Cox <alc@cs.rice.edu> John Dyson <dyson@iquest.net> Julian Elischer <julian@whistel.com> Bruce Evans <bde@zeta.org.au> David Greenman <dg@root.com>
|
#
8fe387ab |
|
04-Apr-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Add standard padding argument to pread and pwrite syscall. That should make them NetBSD compatible. Add parameter to fo_read and fo_write. (The only flag, FOF_OFFSET, means that the offset is set in the struct uio). Factor out some common code from read/pread/write/pwrite syscalls.
|
#
a5c9bce7 |
|
25-Feb-1999 |
Bruce Evans <bde@FreeBSD.org> |
Added a used #include (don't depend on "vnode_if.h" including <sys/buf.h>).
|
#
b1028ad1 |
|
19-Feb-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper. Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillon <dillon@apollo.backplane.com>
|
#
bc814931 |
|
29-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
More const fixes for -Wall, -Wcast-qual
|
#
9e26dd2a |
|
29-Jan-1999 |
Bruce Evans <bde@FreeBSD.org> |
Removed bogus casts to c_caddr_t. This is part of terminating c_caddr_t with extreme prejudice. Here the original casts to caddr_t were to support K&R compilers (or missing prototypes), but the relevant source files require an ANSI compiler.
|
#
697457a1 |
|
28-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings related to -Wall -Wcast-qual
|
#
8aef1712 |
|
27-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
d254af07 |
|
27-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
1c7c3c6a |
|
21-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
e3b3ba2d |
|
15-Dec-1998 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Wrap two macros into do { ... } while (0), and fix the way they're used in the kernel. Reviewed by: bde
|
#
18830dba |
|
26-Nov-1998 |
Tor Egge <tegge@FreeBSD.org> |
Don't forget to update the pmap associated with aio daemons when adding new page directory entries for a growing kernel virtual address space.
|
#
f5ef029e |
|
25-Oct-1998 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Nitpicking and dusting performed on a train. Removes trivial warnings about unused variables, labels and other lint.
|
#
2d2f8ae7 |
|
17-Aug-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed nonsense overflow checking (checking that a long variable is less than INT_MAX after it has possibly overflowed). Removed an unused variable and its associated 2 style bugs. Removed unused includes.
|
#
30166fab |
|
15-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Cast between longs and pointers via intptr_t. There shouldn't be nearly so many casts here. Casting a pointer that was an integer back to an integer just to compare it with -1 is bad, and casting it back just to compare it with NULL is just wrong.
|
#
596f8506 |
|
05-Jul-1998 |
Julian Elischer <julian@FreeBSD.org> |
Fix braino from yesterday's megacommit. Not sure of the result of it (may or may not affect anything), but it's fixed now. (Found by: comparing what cvsup sent back to me with what I tested.)
|
#
f7ea2f55 |
|
04-Jul-1998 |
Julian Elischer <julian@FreeBSD.org> |
There is no such thing any more as "struct bdevsw". There is only cdevsw (which should be renamed in a later edit to deventry or something). cdevsw contains the union of what were in both bdevsw and cdevsw entries. The bdevsw[] table still exists and is a second pointer to the cdevsw entry of the device. Its major is in d_bmaj rather than d_maj. Some cleanup still to happen (e.g. dsopen now gets two pointers to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw). rawread()/rawwrite() went away as part of this, though it's not strictly the same patch, just that it involves all the same lines in the drivers. cdroms no longer have write() entries (they did have rawwrite (?)). Tapes no longer have support for bdev operations. Reviewed by: Eivind Eklund and Mike Smith Changes suggested by eivind.
|
#
8c12612c |
|
10-Jun-1998 |
Doug Rabson <dfr@FreeBSD.org> |
64bit fixes: don't cast pointers to int.
|
#
dc733423 |
|
17-Apr-1998 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.
|
#
227ee8a1 |
|
30-Mar-1998 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Eradicate the variable "time" from the kernel, using various measures. "time" wasn't an atomic variable, so splfoo() protection was needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now use the new variable time_second instead. gettime() changed to getmicrotime(). Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeared time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passed &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde
|
#
8a6472b7 |
|
28-Mar-1998 |
Peter Dufault <dufault@FreeBSD.org> |
Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and _KPOSIX_PRIORITY_SCHEDULING options to work. Changes: Change all "posix4" to "p1003_1b". Misnamed files are left as "posix4" until I'm told if I can simply delete them and add new ones; Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux; Add man pages for _POSIX_PRIORITY_SCHEDULING system calls; Add options to LINT; Minor fixes to P1003_1B code during testing.
|
#
08637435 |
|
28-Mar-1998 |
Bruce Evans <bde@FreeBSD.org> |
Moved some #includes from <sys/param.h> nearer to where they are actually used.
|
#
57518a4e |
|
24-Feb-1998 |
Bruce Evans <bde@FreeBSD.org> |
Removed a stale comment and staler code.
|
#
303b270b |
|
08-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Staticize.
|
#
0b08f5f7 |
|
05-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Back out DIAGNOSTIC changes.
|
#
47cfdb16 |
|
04-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Turn DIAGNOSTIC into a new-style option.
|
#
64889941 |
|
09-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Quiet some lint.
|
#
78922e41 |
|
07-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Correct prototypes to match POSIX. Correct return code for aio_cancel. Submitted by: Alex Nash <nash@mcs.com>
|
#
e499ed6f |
|
01-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix a problem when creating a new kernel thread. In some cases, aio_read or aio_write can return the pid of the new thread. This is due to the way that return values from system calls are now passed by side effect in the proc structure. This commit fixes the problem with aio_read and aio_write.
|
#
11783b14 |
|
01-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix error handling for VCHR type I/O. Also, fix another spl problem, and remove a lot of overly verbose debugging statements.
|
#
f4f0ecef |
|
30-Nov-1997 |
John Dyson <dyson@FreeBSD.org> |
Correct a last minute code change. Would have been an infinite loop under certain error conditions. Submitted by: pst@shockwave.com
|
#
c5efdcbd |
|
30-Nov-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix an spl nit.
|
#
84af4da6 |
|
29-Nov-1997 |
John Dyson <dyson@FreeBSD.org> |
Finish up the vast majority of the AIO/LIO functionality. Proper signal support was missing in the previous version of the AIO code. More tunables added, and very efficient support for VCHR files has been added. Kernel threads are not used for VCHR files, all work for such files is done for the requesting process directly. Some attempt has been made to charge the requesting process for resource utilization, but more work is needed. aio_fsync is still missing (but the original fsync system call can be used for now.) aio_cancel is essentially a noop, but that is okay per POSIX. More aio_cancel functionality can be added later, if it is found to be needed. The functions implemented include: aio_read, aio_write, lio_listio, aio_error, aio_return, aio_cancel, aio_suspend. The code has been implemented to support the POSIX spec 1003.1b (formerly known as POSIX 1003.4 spec) features of the above. The async I/O features are truly async, with the VCHR mode of operation being essentially the same as physio (for appropriate files) for maximum efficiency. This code also supports the signal capability, is highly tunable, allowing management of resource usage, and has been written to allow a per process usage quota. Both the O'Reilly POSIX.4 book and the actual POSIX 1003.1b document were the reference specs used. Any filedescriptor can be used with these new system calls. I know of no exceptions where these system calls will not work. (TTY's will also probably work.)
|
#
f4feb04e |
|
28-Nov-1997 |
John Dyson <dyson@FreeBSD.org> |
Disable the VCHR optimization for AIO until I have implemented it. Just in case anyone wants to play with the POSIX AIO/LIO stuff. (As it is, it should work with ANY vnode, on UP systems only, for now.)
|
#
fd3bf775 |
|
28-Nov-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix and complete the AIO syscalls. There are some performance enhancements coming up soon, but the code is functional. Docs will be forthcoming.
|
#
fdebd4f0 |
|
18-Nov-1997 |
Bruce Evans <bde@FreeBSD.org> |
Get locking stuff by #including <sys/lock.h> instead of <sys/vnode.h>.
|
#
4a11ca4e |
|
07-Nov-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused
|
#
cb226aaa |
|
06-Nov-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources. I have not removed the /*ARGSUSED*/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.
|
#
a1c995b6 |
|
12-Oct-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Last major round (unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde
|
#
55166637 |
|
11-Oct-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Distribute and staticize a lot of the malloc M_* types. Substantial input from: bde
|
#
c4860686 |
|
10-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Make the target for the number of AIO daemons work.
|
#
a624e84f |
|
08-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Major cleanup and debugging of the new AIO/LIO code. The LIO code is now corrected. New tunables/instrumentation added. The code is now likely "good enough to use." I will add the userland support soon. The "high performance" mode for raw devices is still missing, and will be added next. POSIX system calls that now appear to work: aio_cancel, aio_error, aio_read, aio_return, aio_suspend, aio_write, lio_listio. Missing, but to be added soon: aio_fsync.
|
#
e4ba6a82 |
|
02-Sep-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused #includes.
|
#
5aaef07c |
|
16-Jul-1997 |
John Dyson <dyson@FreeBSD.org> |
Clean up some lint associated with the AIO code.
|
#
2244ea07 |
|
05-Jul-1997 |
John Dyson <dyson@FreeBSD.org> |
This is an upgrade so that the kernel supports the AIO calls from POSIX.4. Additionally, there is some initial code that supports LIO. This code supports AIO/LIO for all types of file descriptors, with few if any restrictions. There will be a followup very soon that will support significantly more efficient operation for VCHR type files (raw.) This code is also dependent on some kernel features that don't work under SMP yet. After I commit the changes to the kernel to support proper address space sharing on SMP, this code will also work under SMP.
|
#
ee877a35 |
|
15-Jun-1997 |
John Dyson <dyson@FreeBSD.org> |
Add initial AIO/LIO kernel thread support files. This is preliminary, and further features will be added.
|