#
b068bb09 |
|
07-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
Add vnode_pager_clean_{a,}sync(9) Bump __FreeBSD_version for ZFS use. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43356
|
#
ed1a88a3 |
|
09-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
vnode_pager_generic_putpages(): rename maxblksz local to max_offset Requested by: markj Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43358
|
#
bdb46c21 |
|
08-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
vnode_pager_generic_putpages(): correctly handle clean block at EOF The loop 'skip clean blocks' checking for the clean blocks in the dirty pages might end up setting the in_hole to true when exactly at EOF at the middle of the block, without advancing the prev_offset value. Then the next block is not dirty, and next_offset is clipped back to poffset + maxsize, equal to prev_offset, failing the assertion. Instead of asserting prev_offset < next_offset, we must skip the write. Reported by: asomers PR: 276191 Reviewed by: alc, markj Tested by: asomers Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43358
|
#
29363fb4 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree, automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
28f957b8 |
|
24-Mar-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
vnode_pager_input: return runningbufspace back Both vnode_pager_input_smlfs() and vnode_pager_generic_getpages() increment runningbufspace, but also both delegate io completion handling on the pbuf to either plain bdone() or filesystem-specific strategy routine. Accidentally, for e.g. UFS it is g_vfs_strategy()/g_vfs_done(). The latter calls bufdone() which handles runningbufspace reclamation. For plain bdone() io done handler, nothing would return accounted b_runningbufspace back. Do it in the new helper vnode_pager_input_bdone(), as well as in vnode_pager_generic_getpages_done() explicitly. Note that potential multiple calls to runningbufwakeup() for the same pbuf or buf completion are safe. runningbufwakeup() clears accounting for the buffer, so the second and later calls are no-ops. The problem was found due to tarfs using small vnode pager input but not g_vfs_strategy(). Reported by: des Reviewed by: markj, sjg Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39263
|
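The idempotence argument in the commit above can be illustrated with a small userland model. This is only a sketch: `struct fake_buf`, `fake_runningbufspace` and the `fake_running_*` functions are hypothetical stand-ins, not the kernel's `struct buf` or `runningbufwakeup()`, and the real code uses atomic updates and wakes up waiters.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-buffer accounting field of struct buf. */
struct fake_buf {
	long b_runningbufspace;
};

/* Hypothetical stand-in for the global count of in-flight I/O bytes. */
static long fake_runningbufspace;

/* Charge `size` bytes of in-flight I/O to the buffer and the global total. */
void
fake_running_start(struct fake_buf *bp, long size)
{
	assert(size >= 0 && bp->b_runningbufspace == 0);
	bp->b_runningbufspace = size;
	fake_runningbufspace += size;
}

/*
 * Return the buffer's charge to the global total.  The per-buffer field is
 * cleared, so a second call for the same buffer sees 0 and does nothing.
 */
void
fake_running_wakeup(struct fake_buf *bp)
{
	long space = bp->b_runningbufspace;

	if (space == 0)
		return;
	bp->b_runningbufspace = 0;
	fake_runningbufspace -= space;
}

/* Read the modeled global total. */
long
fake_running_total(void)
{
	return (fake_runningbufspace);
}
```

In this model, calling `fake_running_wakeup()` from both a completion helper and a later generic path leaves the total balanced, which is the property the commit message relies on.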
#
f45feecf |
|
22-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: add vn_getsize getattr is very expensive and in important cases only gets called to get the size. This can be optimized with a dedicated routine which obtains that statistic. As a step towards that goal make size-only consumers use a dedicated routine. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D37885
|
#
b8ebd99a |
|
13-Apr-2022 |
John Baldwin <jhb@FreeBSD.org> |
vm: Use __diagused for variables only used in KASSERT().
|
#
de2e1529 |
|
04-Aug-2021 |
Ka Ho Ng <khng@FreeBSD.org> |
Add vnode_pager_purge_range(9) KPI This KPI is created in addition to the existing vnode_pager_setsize(9) KPI. The KPI is intended for file systems that are able to turn a range of file into sparse range, also known as hole-punching. Sponsored by: The FreeBSD Foundation Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D27194
|
#
00a3fe96 |
|
07-May-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
vm_object_kvme_type(): reimplement by embedding kvme_type into pagerops Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30168
|
#
d474440a |
|
03-May-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Constify vm_pager-related virtual tables. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30070
|
#
192112b7 |
|
30-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add pgo_getvp method This eliminates the staircase of conditions in vm_map_entry_set_vnode_text(). Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30070
|
#
c23c555b |
|
30-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
Add pgo_mightbedirty method Used to implement vm_object_mightbedirty() Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30070
|
#
180bcaa4 |
|
30-Apr-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
vm_pager: add pgo_set_writeable_dirty method specialized for swap and vnode pagers, and used to implement vm_object_set_writeable_dirty(). Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30070
|
#
a771bf74 |
|
17-Mar-2021 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Remove unused obj variable missed in r354870. Sponsored by: Dell EMC
|
#
cd853791 |
|
27-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Make MAXPHYS tunable. Bump MAXPHYS to 1M. Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save a significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to the desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get private MAXPHYS-like constant; their conversion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, were either submitted by, or based on changes by mav. Suggested by: mav (*) Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225
|
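The "+1" in the pbuf sizing above follows from simple page arithmetic: a buffer of maxphys bytes that does not start on a page boundary touches one more page than an aligned one. A minimal sketch, assuming a 4096-byte page (`SKETCH_PAGE_SIZE` and `pages_spanned()` are illustrative names, not kernel symbols):

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for illustration */

/*
 * Number of pages touched by `len` bytes starting at byte address `start`.
 * The offset of `start` within its page is added before rounding up.
 */
size_t
pages_spanned(size_t start, size_t len)
{
	size_t first_off = start % SKETCH_PAGE_SIZE;

	assert(len > 0);
	return ((first_off + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
}
```

With a 1 MiB transfer, an aligned start spans 256 pages while a start 512 bytes into a page spans 257, which matches the `atop(maxphys) + 1` sizing described in the commit message.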
#
f9cc8410 |
|
18-Sep-2020 |
Eric van Gyzen <vangyzen@FreeBSD.org> |
vm_ooffset_t is now unsigned vm_ooffset_t is now unsigned. Remove some tests for negative values, or make other adjustments accordingly. Reported by: Coverity Reviewed by: kib markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D26214
|
#
c3aa3bf9 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vm: clean up empty lines in .c and .h files
|
#
7ad2a82d |
|
18-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error Most consumers pass NULL.
|
#
419e5698 |
|
16-Aug-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Atomically update vm_object vnp_size, where atomic is available. This will be used later, where it matters on 32bit arches. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25968
|
#
efec381d |
|
04-Aug-2020 |
Mark Johnston <markj@FreeBSD.org> |
Remove most lingering references to the page lock in comments. Finish updating comments to reflect new locking protocols introduced over the past year. In particular, vm_page_lock is now effectively unused. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25868
|
#
1bd12a3b |
|
17-Jul-2020 |
Chuck Silvers <chs@FreeBSD.org> |
Fix vnode_pager handling of read ahead/behind pages when a disk read fails. Rather than marking the read ahead/behind pages valid even though they were not initialized, free them using the new function vm_page_free_invalid(). Reviewed by: markj, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25430
|
#
c3dbadc1 |
|
17-Jul-2020 |
Chuck Silvers <chs@FreeBSD.org> |
Revert my change from r361855 in favor of a better fix. Reviewed by: markj, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25430
|
#
bd7d64f5 |
|
05-Jun-2020 |
Chuck Silvers <chs@FreeBSD.org> |
Don't mark pages as valid if reading the contents from disk fails. Instead, just skip marking pages valid if the read fails. Future attempts to access such pages will notice that they are not marked valid and try to read them from disk again. Reviewed by: kib, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25138
|
#
abfdf767 |
|
30-Mar-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
VOP_GETPAGES_ASYNC(): consistently call iodone() callback in case of error. Reviewed by: glebius, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24038
|
#
cafbf0c6 |
|
19-Feb-2020 |
Warner Losh <imp@FreeBSD.org> |
Don't convert all lower-layer errors to EIO. Don't convert all lower layer errors to EIO. Instead, pass the actual error up the stack. This will allow the upper layers that look for ENXIO to react properly to that signal from the lower layers and, for UFS, unmount the filesystem. Reviewed by: kib@ Differential Revision: https://reviews.freebsd.org/D23755
|
#
65252dc9 |
|
19-Feb-2020 |
Warner Losh <imp@FreeBSD.org> |
Don't spam the console with an additional, and useless, error message. There's no need to spam the console with this error message. If there's an I/O error, the disk/cam driver will report it at the lower levels. If that's an actual problem, the upper layers will report that. Reviewed by: kib@ Differential Revision: https://reviews.freebsd.org/D23756
|
#
f1fa1ba3 |
|
03-Feb-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
Fix up various vnode-related asserts which did not dump the used vnode
|
#
d6e13f3b |
|
19-Jan-2020 |
Jeff Roberson <jeff@FreeBSD.org> |
Don't hold the object lock while calling getpages. The vnode pager does not want the object lock held. Moving this out allows further object lock scope reduction in callers. While here add some missing paging in progress calls and an assert. The object handle is now protected explicitly with pip. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D23033
|
#
9c83ff2d |
|
19-Jan-2020 |
Jeff Roberson <jeff@FreeBSD.org> |
It has not been possible to recursively terminate a vnode object for some time now. Eliminate the dead code that supports it. Approved by: kib, markj Differential Revision: https://reviews.freebsd.org/D22908
|
#
a314aba8 |
|
11-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vm: add missing CTLFLAG_MPSAFE annotations This covers all vm/* files.
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
abd80ddb |
|
08-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715
|
#
a67d5408 |
|
26-Nov-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Use atomics in more cases for object references. We now can completely omit the object lock if we are above a certain threshold. Hold only a single vnode reference when the vnode object has any ref > 0. This allows us to only lock the object and vnode on 0-1 and 1-0 transitions. Differential Revision: https://reviews.freebsd.org/D22452
|
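The idea of acting only on the 0-1 and 1-0 transitions can be sketched with C11 atomics. This is a simplified userland model under stated assumptions: `struct fake_object` and both functions are hypothetical, and the locking the commit message says is still taken on those transitions is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the reference count of a vm object. */
struct fake_object {
	atomic_uint ref_count;
};

/*
 * Take a reference.  Returns true only on the 0 -> 1 transition, which is
 * where a caller would acquire the single vnode reference.
 */
bool
fake_object_ref(struct fake_object *obj)
{
	return (atomic_fetch_add(&obj->ref_count, 1) == 0);
}

/*
 * Drop a reference.  Returns true only on the 1 -> 0 transition, which is
 * where a caller would release the single vnode reference.
 */
bool
fake_object_unref(struct fake_object *obj)
{
	unsigned int old = atomic_fetch_sub(&obj->ref_count, 1);

	assert(old > 0);
	return (old == 1);
}
```

All intermediate increments and decrements return false, so in this model they need no work beyond the atomic operation itself.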
#
7f935055 |
|
19-Nov-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove unnecessary object locking from the vnode pager. Recent changes to busy/valid/dirty locking make these acquires redundant. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D22186
|
#
51df5321 |
|
29-Oct-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Use atomics and a shared object lock to protect the object reference count. Certain consumers still need to guarantee a stable reference so we can not switch entirely to atomics yet. Exclusive lock holders can still modify and examine the refcount without using the ref api. Reviewed by: kib Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21598
|
#
2f81c92e |
|
23-Oct-2019 |
Mark Johnston <markj@FreeBSD.org> |
Check for bogus_page in vnode_pager_generic_getpages_done(). We now assert that a page is busy when updating its validity-tracking state, but bogus_page is not busied during a getpages operation. Reported by: syzkaller Reviewed by: alc, kib Discussed with: jeff MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22124
|
#
5b87ecc6 |
|
22-Oct-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Assert that vnode_pager_setsize() is called with the vnode exclusively locked except for filesystems that set the MNTK_VMSETSIZE_BUG flag. Set the flag for ZFS. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D21883
|
#
208b81bb |
|
22-Oct-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Add VV_VMSIZEVNLOCK flag. The flag specifies that the vm_fault() handler should check the vnode's vm_object size under the vnode lock. It is converted into the object's OBJ_SIZEVNLOCK flag in vnode_pager_alloc(). Tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D21883
|
#
0012f373 |
|
14-Oct-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
(4/6) Protect page valid with the busy lock. Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594
|
#
fe7bcbaf |
|
03-Sep-2019 |
Kyle Evans <kevans@FreeBSD.org> |
vm pager: writemapping accounting for OBJT_SWAP Currently writemapping accounting is only done for vnode_pager which does some accounting on the underlying vnode. Extend this to allow accounting to be possible for any of the pager types. New pageops are added to update/release writecount that need to be implemented for any pager wishing to do said accounting, and we implement these methods now for both vnode_pager (unchanged) and swap_pager. The primary motivation for this is to allow other systems with OBJT_SWAP objects to check if their objects have any write mappings and reject operations with EBUSY if so. posixshm will be the first to do so in order to reject adding write seals to the shmfd if any writable mappings exist. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21456
|
#
6470c8d3 |
|
29-Aug-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Rework v_object lifecycle for vnodes. Current implementation of vnode_create_vobject() and vnode_destroy_vobject() is written so that it is prepared to handle the vm object destruction for a live vnode. Practically, no filesystems use this, except for some remnants that were present in UFS till today. One of the consequences of that model is that each filesystem must call vnode_destroy_vobject() in VOP_RECLAIM() or earlier; as a result, all of them get rid of the v_object in reclaim. Move the call to vnode_destroy_vobject() to vgonel() before VOP_RECLAIM(). This makes v_object stable: either the object is NULL, or it is a valid vm object till the vnode reclamation. Remove code from vnode_create_vobject() to handle races with the parallel destruction. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21412
|
#
783a68aa |
|
25-Aug-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Move OBJT_VNODE specific code from vm_object_terminate() to vnode_destroy_vobject(). Reviewed by: alc, jeff (previous version), markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21357
|
#
4153054a |
|
19-Aug-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Permit vm_pager_has_page() to run with a shared lock. Introduce VM_OBJECT_DROP/VM_OBJECT_PICKUP to handle functions that are called with uncertain lock state. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21310
|
#
6cb46f39 |
|
17-Jul-2019 |
Alan Somers <asomers@FreeBSD.org> |
Revert r346608 That change was intended to be cosmetic, but it inadvertently caused vnode_pager_setsize to discard cached indirect blocks and extended attributes on UFS during truncation. The reason is that those blocks have negative LBNs, which get sign-cast to positive VM indexes. Reported by: kib Sponsored by: The FreeBSD Foundation
|
#
0cab71bc |
|
06-Jul-2019 |
Doug Moore <dougm@FreeBSD.org> |
Fix style(9) violations involving division by PAGE_SIZE. Reviewed by: alc Approved by: markj (mentor) Differential Revision: https://reviews.freebsd.org/D20847
|
#
daec9284 |
|
21-May-2019 |
Conrad Meyer <cem@FreeBSD.org> |
Include ktr.h in more compilation units Similar to r348026, exhaustive search for uses of CTRn() and cross reference ktr.h includes. Where it was obvious that an OS compat header of some kind included ktr.h indirectly, .c files were left alone. Some of these files clearly got ktr.h via header pollution in some scenarios, or tinderbox would not be passing prior to this revision, but go ahead and explicitly include it in files using it anyway. Like r348026, these CUs did not show up in tinderbox as missing the include. Reported by: peterj (arm64/mp_machdep.c) X-MFC-With: r347984 Sponsored by: Dell EMC Isilon
|
#
78022527 |
|
05-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Switch to use shared vnode locks for text files during image activation. kern_execve() locks text vnode exclusive to be able to set and clear VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0 condition. The change removes VV_TEXT, replacing it with the condition v_writecount <= -1, and puts v_writecount under the vnode interlock. Each text reference decrements v_writecount. To clear the text reference when the segment is unmapped, it is recorded in the vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and v_writecount is incremented on the map entry removal. The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that v_writecount does not contradict the desired change. vn_writecheck() is now racy and its use was eliminated everywhere except access. Atomic check for writeability and increment of v_writecount is performed by the VOP. vn_truncate() now increments v_writecount around VOP_SETATTR() call, lack of which is arguably a bug on its own. nullfs bypasses v_writecount to the lower vnode always, so nullfs vnode has its own v_writecount correct, and lower vnode gets all references, since object->handle is always lower vnode. On the text vnode's vm object dealloc, the v_writecount value is reset to zero, and deadfs vop_unset_text short-circuits the operation. Reclamation of lowervp always reclaims all nullfs vnodes referencing lowervp first, so no stray references are left. Reviewed by: markj, trasz Tested by: mjg, pho Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D19923
|
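The sign convention described in the commit above (positive counts for writers, negative counts for text references, never both at once) can be modeled in a few lines. This is a sketch only: `struct fake_vnode` and the `fake_*` functions are hypothetical, and the real VOPs operate under the vnode interlock, which is not modeled here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the v_writecount field of a vnode. */
struct fake_vnode {
	int v_writecount;	/* > 0: writers, < 0: text references */
};

/* Add or remove writers; refused while the vnode is in use as text. */
int
fake_add_writecount(struct fake_vnode *vp, int inc)
{
	if (vp->v_writecount < 0)
		return (ETXTBSY);
	assert(vp->v_writecount + inc >= 0);
	vp->v_writecount += inc;
	return (0);
}

/* Take a text reference; refused while the vnode has writers. */
int
fake_set_text(struct fake_vnode *vp)
{
	if (vp->v_writecount > 0)
		return (ETXTBSY);
	vp->v_writecount--;
	return (0);
}

/* Drop a text reference, e.g. when the text mapping is removed. */
void
fake_unset_text(struct fake_vnode *vp)
{
	assert(vp->v_writecount < 0);
	vp->v_writecount++;
}
```

Because one integer carries both meanings, the check and the update are a single step on one field, which is what lets the separate VV_TEXT flag go away in this model.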
#
3746a90a |
|
23-Apr-2019 |
Alan Somers <asomers@FreeBSD.org> |
Slightly simplify vnode_pager_setsize No functional change intended. Sponsored by: The FreeBSD Foundation
|
#
40a51684 |
|
25-Feb-2019 |
Jason A. Harmening <jah@FreeBSD.org> |
Fix incorrect assertion in vnode_pager_generic_getpages() Reviewed by: kib, glebius MFC after: 1 week
|
#
66fb0b1a |
|
15-Feb-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
For 32-bit machines roll back the default number of vnode pager pbufs to the level before r343030. For 64-bit machines reduce it slightly, too. Together with r343030 I bumped the limit up to the value we use at Netflix to serve 100 Gbit/s of sendfile traffic, and it probably isn't a good default. Provide a loader tunable to change vnode pager pbufs count. Document it.
|
#
756a5412 |
|
14-Jan-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Allocate pager bufs from UMA instead of 80-ish mutex protected linked list. o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many pbufs are we going to have set. In various subsystems that are going to utilize pbufs create private zones via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(), and sets a limit on created zone. After startup preallocate pbufs according to requirements of all pbuf zones. Subsystems that used to have a private limit with old allocator now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS, swap, vnode pager. The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9), aio(4). They should have their private limits, but changing that is out of scope of this commit. o Fetch tunable value of kern.nswbuf from init_param2() and while here move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only this option. Default values aren't touched by this commit, but they probably should be reviewed wrt to modern hardware. This change removes a tight bottleneck from sendfile(2) operation, that uses pbufs in vnode pager. Other pagers also would benefit from faster allocation. Together with: gallatin Tested by: pho
|
#
e5818a53 |
|
28-Mar-2018 |
Jeff Roberson <jeff@FreeBSD.org> |
Implement several enhancements to NUMA policies. Add a new "interleave" allocation policy which stripes pages across domains with a stride or width keeping contiguity within a multi-page region. Move the kernel to the dedicated numbered cpuset #2 making it possible to assign kernel threads and memory policy separately from user. This also eliminates the need for the complicated interrupt binding code. Add a sysctl API for viewing and manipulating domainsets. Refactor some of the cpuset_t manipulation code using the generic bitset type so that it can be used for both. This probably belongs in a dedicated subr file. Attempt to improve the include situation. Reviewed by: kib Discussed with: jhb (cpuset parts) Tested by: pho (before review feedback) Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D14839
|
#
e2068d0b |
|
06-Feb-2018 |
Jeff Roberson <jeff@FreeBSD.org> |
Use per-domain locks for vm page queue free. Move paging control from global to per-domain state. Protect reservations with the free lock from the domain that they belong to. Refactor to make vm domains more of a first class object. Reviewed by: markj, kib, gallatin Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D14000
|
#
938cdc42 |
|
02-Feb-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
On pageout, in vnode generic pager, for partially dirty page, only clear dirty bits for completely invalid blocks. Otherwise we might not write out the last chunk that is shorter than 512 bytes, if the file end is not aligned on disk block boundary. This became important after r324794. PR: 225586 Reported by: tris_vern@hotmail.com Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
#
df57947f |
|
18-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
spdx: initial adoption of licensing ID tags. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point. Initially, only tag files that use BSD 4-Clause "Original" license. RelNotes: yes Differential Revision: https://reviews.freebsd.org/D13133
|
#
b3d4ab66 |
|
20-Oct-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Take the vm object lock in read mode in vnode_generic_putpages(). Only upgrade it to write mode if we need to clear dirty bits of the partially valid page after EOF. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
#
05877a85 |
|
20-Oct-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not overwrite clean blocks on pageout. If filesystem block size is less than the page size, it is possible that the page-out run contains partially clean pages. E.g., the chunk of the page might be bdwrite()-ed, or some thread performed bwrite() on a buffer which references a chunk of the paged out page. As a result, the assertion added in r319975, which checked that all pages in the run are dirty, does not hold on such filesystems. One solution is to remove the assert, but it is undesirable, because we do overwrite the valid on-disk content. I cannot provide a scenario where such write would corrupt the file data, but I do not like it on principle. Another, in my opinion proper, solution is to only write parts of the pages still marked dirty. The patch implements this, it skips clean blocks and only writes the dirty block runs. Note that due to clustering, writing one page might clean other pages in the run, so the next write range must be calculated only after the current range is written out. Moreover, due to a possible invalidation, and the fact that the object lock is dropped and reacquired before the checks, it is possible that the whole page-out pages run appears to consist of only clean pages. For this reason, it is impossible to assert that there is some work for the pageout method to do (i.e. assert that there is at least one dirty page in the run). But such clearing can only occur due to invalidation, and not due to a parallel write, because we own the vnode lock exclusive. Reported by: fsu In collaboration with: pho Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 3 weeks Differential revision: https://reviews.freebsd.org/D12668
|
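The "skip clean blocks, write only dirty runs" approach can be sketched over a plain array of per-block dirty flags. This is an illustration under assumptions, not the code of vnode_pager_generic_putpages(): `struct byte_range` and `dirty_runs()` are hypothetical, the dirty state is a static snapshot (the real code recomputes the next range after each write because clustering can clean further pages), and empty ranges produced by clipping to EOF are skipped rather than asserted against, in the spirit of the later fix bdb46c21 listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open byte range [start, end) to be written. */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Scan `nblocks` blocks of `blksz` bytes beginning at byte offset `base`.
 * Store each maximal run of dirty blocks, clipped to `eof`, into `out`
 * (at most `maxout` entries).  Returns the number of ranges stored.
 */
size_t
dirty_runs(const bool *dirty, size_t nblocks, uint64_t base, uint64_t blksz,
    uint64_t eof, struct byte_range *out, size_t maxout)
{
	size_t i = 0, n = 0;

	assert(blksz > 0);
	while (i < nblocks && n < maxout) {
		/* Skip clean blocks. */
		while (i < nblocks && !dirty[i])
			i++;
		if (i == nblocks)
			break;
		uint64_t start = base + i * blksz;

		/* Extend over the run of dirty blocks. */
		while (i < nblocks && dirty[i])
			i++;
		uint64_t end = base + i * blksz;

		/* Never write past end of file. */
		if (end > eof)
			end = eof;
		/* Clipping can leave nothing to write: skip, do not assert. */
		if (start >= end)
			continue;
		out[n].start = start;
		out[n].end = end;
		n++;
	}
	return (n);
}
```

For four 512-byte blocks with flags dirty, dirty, clean, dirty and EOF at 2048, the sketch yields the ranges [0, 1024) and [1536, 2048); with EOF at 1536 the second range is empty after clipping and is dropped.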
#
555b7bb4 |
|
26-Jul-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Mark pages after EOF as clean after pageout. Suppose that a file on NFS has partially filled last page, and this page is dirty. NFS VOP_PAGEOUT() method only marks the page clean up to the block of the last written byte, leaving other blocks dirty. Also any page which erroneously exists in the vnode vm_object past EOF is also left marked as dirty. With the introduction of the buf-cache coherent pager, each pass of syncer over the object with such page results in creation of B_DELWRI buffer due to VOP_WRITE() call. This buffer is noted on next syncer pass, which results in e.g. a visible manifestation of shutdown never finishing vnode sync. Note that before the buf-cache coherency commit, a dirty page might be left never synced to the server if partial writes occur. Fix this by clearing dirty bits after EOF. Only blocks of the partial page which are completely after EOF are marked clean, to avoid possible user data loss. Reported by: mav Reviewed by: alc, markj Tested by: mav, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D11697
|
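"Only blocks completely after EOF are marked clean" amounts to rounding the EOF offset within the last page up to the next block boundary. A minimal sketch, assuming 4096-byte pages and 512-byte blocks (the constants and `first_clean_offset()` are illustrative names, not kernel symbols):

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed page size */
#define SKETCH_BLOCK_SIZE	512u	/* assumed dirty-tracking block size */

/*
 * Given the offset of EOF within the last page, return the offset of the
 * first block that lies wholly after EOF.  Dirty bits from that offset to
 * the end of the page may be cleared; the block containing the last file
 * byte stays dirty.  A result of SKETCH_PAGE_SIZE means nothing to clear.
 */
unsigned int
first_clean_offset(unsigned int eof_in_page)
{
	assert(eof_in_page <= SKETCH_PAGE_SIZE);
	return ((eof_in_page + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE *
	    SKETCH_BLOCK_SIZE);
}
```

If the file ends 100 bytes into the page, clearing starts at offset 512, so the first block, which still holds user data, keeps its dirty bit.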
#
e6c44f65 |
|
15-Jun-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Some minor improvements to vnode_pager_generic_putpages(). - Add asserts that the pages to write are dirty. The last page, if partially written, is only required to be dirty, while completely written pages should have all dirty bit set. - Use uintmax_t to print vm_page pindexes. - Use NULL instead of casted zero. - Remove if () test which duplicated the loop ending condition. - Miscellaneous style fixes. Reviewed by: alc, markj (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
83c9dea1 |
|
17-Apr-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter in place. To do per-cpu stats, convert all fields that previously were maintained in the vmmeters that sit in pcpus to counter(9). - Since some vmmeter stats may be touched at very early stages of boot, before we have set up UMA and we can do counter_u64_alloc(), provide an early counter mechanism: o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter. o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter, so that at early stages of boot, before counters are allocated we already point to a counter that can be safely written to. o For sparc64 that required a whole dummy pcpu[MAXCPU] array. Further related changes: - Don't include vmmeter.h into pcpu.h. - vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit, to match kernel representation. - struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion. This is based on benno@'s 4-year old patch: https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html Reviewed by: kib, gallatin, marius, lidl Differential Revision: https://reviews.freebsd.org/D10156
|
#
65b9599a |
|
05-Apr-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Extract calculation of ioflags from the vm_pager_putpages flags into a helper. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D10241
|
#
3dbb0ca6 |
|
05-Apr-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Some style fixes for vnode_pager_generic_putpages(), in the local declaration block. Reviewed by: markj (as part of the larger patch) Tested by: pho (as part of the larger patch) Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D10241
|
#
4f56243a |
|
12-Jan-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix the contiguity once more.
|
#
1e0c121f |
|
04-Jan-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix assertion that checks that pages are consecutive to properly handle bogus_page insertion(s).
|
#
6ff51a36 |
|
31-Dec-2016 |
Mateusz Guzik <mjg@FreeBSD.org> |
Use vrefact in vnode_pager_alloc.
|
#
99e6e193 |
|
23-Nov-2016 |
Mark Johnston <markj@FreeBSD.org> |
Release laundered vnode pages to the head of the inactive queue. The swap pager enqueues laundered pages near the head of the inactive queue to avoid another trip through LRU before reclamation. This change adds support for this behaviour to the vnode pager and makes use of it in UFS and ext2fs. Some ioflag handling is consolidated into a common subroutine so that this support can be easily extended to other filesystems which make use of the buffer cache. No changes are needed for ZFS since its putpages routine always undirties the pages before returning, and the laundry thread requeues the pages appropriately in this case. Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D8589
|
#
bba39b9a |
|
22-Nov-2016 |
Alan Cox <alc@FreeBSD.org> |
Remove PG_CACHED-related fields from struct vmmeter, because they are no longer used. More precisely, they are always zero because the code that decremented and incremented them no longer exists. Bump __FreeBSD_version to mark this change. Reviewed by: kib, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8583
|
#
e48b82bd |
|
17-Nov-2016 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- If caller specifies readbehind and readahead that together with count doesn't fit into a buf, then trim readbehind and readahead evenly. If rbehind was limited by the previous BMAP, then roundup its trim to block size. - Add KASSERT to check that b_blkno has proper offset from original blkno returned by BMAP. [1] - Add KASSERT to check that pages in buf are consecutive. Reviewed by: kib Submitted by: kib [1]
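
The trimming rule in the first item can be sketched in userspace C. This is a minimal illustration, not the code from the commit: the names ra_window, trim_window and max_pages are hypothetical, "evenly" is interpreted here as a split proportional to the sizes of the two windows, and the rounding of the read-behind trim to the block size is left out.

```c
#include <assert.h>

/* Hypothetical container for the two speculative windows, in pages. */
struct ra_window {
	int rbehind;	/* pages to read before the requested run */
	int rahead;	/* pages to read after the requested run */
};

/*
 * Shrink rbehind and rahead so that rbehind + count + rahead fits into
 * max_pages (the page capacity of one buf).  The excess is split between
 * the two windows in proportion to their sizes.  The requested pages
 * (count) are never trimmed.
 */
static struct ra_window
trim_window(int rbehind, int rahead, int count, int max_pages)
{
	struct ra_window w = { rbehind, rahead };
	int sum, excess, trim_behind;

	assert(rbehind >= 0 && rahead >= 0);
	assert(count > 0 && count <= max_pages);

	sum = rbehind + rahead;
	excess = sum + count - max_pages;
	if (excess <= 0)
		return (w);	/* everything already fits */

	/* excess <= sum here, because count <= max_pages. */
	trim_behind = excess * rbehind / sum;	/* rounded down */
	w.rbehind -= trim_behind;
	w.rahead -= excess - trim_behind;	/* remainder of the excess */
	return (w);
}
```

Giving the rounding remainder to the read-ahead side keeps the total exactly at max_pages; that is a choice made for this sketch only.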
|
#
7667839a |
|
15-Nov-2016 |
Alan Cox <alc@FreeBSD.org> |
Remove most of the code for implementing PG_CACHED pages. (This change does not remove user-space visible fields from vm_cnt or all of the references to cached pages from comments. Those changes will come later.) Reviewed by: kib, markj Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8497
|
#
dcc0ff5a |
|
19-Oct-2016 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix incorrect assertion that could miss overflows. Reviewed by: kib
|
#
90880a1b |
|
05-Jul-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Clarify the vnode_destroy_vobject() logic handling for already terminated objects. Assert that there are no new waiters for the already terminated objects. Old waiters should have been notified by the termination calling vnode_pager_dealloc() (old/new are with regard to the lock acquisition interval). Only clear the vp->v_object for the case of an already terminated object, since other branches call vnode_pager_dealloc(), which should clear the pointer. Assert this. Tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)
|
#
2a339d9e |
|
17-May-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Add implementation of robust mutexes, hopefully close enough to the intention of the POSIX IEEE Std 1003.1-2008/Cor 1-2013. A robust mutex is guaranteed to be cleared by the system upon either thread or process owner termination while the mutex is held. The next mutex locker is then notified about inconsistent mutex state and can execute (or abandon) corrective actions. The patch mostly consists of small changes here and there, adding necessary checks for the inconsistent and abandoned conditions into existing paths. Additionally, the thread exit handler was extended to iterate over the userspace-maintained list of owned robust mutexes, unlocking and marking as terminated each of them. The list of owned robust mutexes cannot be maintained atomically synchronous with the mutex lock state (it is possible in kernel, but is too expensive). Instead, for the duration of lock or unlock operation, the current mutex is remembered in a special slot that is also checked by the kernel at thread termination. The kernel must be aware of the per-thread location of the heads of robust mutex lists and the current active mutex slot. When a thread touches a robust mutex for the first time, a new umtx op syscall is issued which informs the kernel about the location of the list heads. The umtx sleep queues for PP and PI mutexes are split between non-robust and robust. Somewhat unrelated changes in the patch: 1. Style. 2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared pi mutexes. 3. Removal of the userspace struct pthread_mutex m_owner field. 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls the lifetime of the shared mutex associated with a vnode's page. Reviewed by: jilles (previous version, supposedly the objection was fixed) Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects) Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
763df3ec |
|
02-May-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/vm: minor spelling fixes in comments. No functional change.
|
#
ff64a90e |
|
27-Dec-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Add missed relpbuf() for a smallfs page-in. Reported by: Shawn Webb Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
b0cd2017 |
|
16-Dec-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strengthening the promises on the busy state of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
#
9af50b01 |
|
22-Nov-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Record proper commit message for r291157. The r289895 revision did not account for the block containing the requested page, when calculating the run of pages. Include the pages before/after the requested page, that fit into the reqblock, into the calculation. Noted by: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
4586820a |
|
22-Nov-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Noted by: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
09c837b8 |
|
20-Nov-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Remove remnants of the old NFS from vnode pager. Reviewed by: kib Sponsored by: Netflix
|
#
eac91e32 |
|
24-Oct-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Reduce the amount of calls to VOP_BMAP() made from the local vnode pager. It is enough to execute VOP_BMAP() once to obtain both the disk block address for the requested page, and the before/after limits for the contiguous run. The clipping of the vm_page_t array passed to the vnode_pager_generic_getpages() and the disk address for the first page in the clipped array can be deduced from the call results. While there, remove some noise (like if (1) {...}) and adjust nearby code. Reviewed by: alc Discussed with: glebius Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
#
fade8dd7 |
|
23-Jul-2015 |
Jeff Roberson <jeff@FreeBSD.org> |
Refactor unmapped buffer address handling. - Use pointer assignment rather than a combination of pointers and flags to switch buffers between unmapped and mapped. This eliminates multiple flags and generally simplifies the logic. - Eliminate b_saveaddr since it is only used with pager bufs which have their b_data re-initialized on each allocation. - Gather up some convenience routines in the buffer cache for manipulating buf space and buf malloc space. - Add an inline, buf_mapped(), to standardize checks around unmapped buffers. In collaboration with: mlaier Reviewed by: kib Tested by: pho (many small revisions ago) Sponsored by: EMC / Isilon Storage Division
|
#
9cddade7 |
|
10-May-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Satisfy vm_object uma zone destructor requirements after r282660 when vnode object creation raced. Reported by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation
|
#
d2596d17 |
|
06-May-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix the KASSERT and improve wording in r282426. Submitted by: alc
|
#
84d31376 |
|
04-May-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix arithmetical bug in vnode_pager_haspage(). The check against object size should be done not with the number of pages in the first block, but with the overall number of pages. While here, add KASSERT that makes sure that BMAP doesn't return completely irrelevant blocks. Reviewed by: kib Tested by: pho Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
f6d6b5e2 |
|
30-Mar-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Catch up on r271387 and remove unused parameter from VOP_GETPAGES_ASYNC().
|
#
3d653db0 |
|
21-Mar-2015 |
Alan Cox <alc@FreeBSD.org> |
Introduce vm_object_color() and use it in mmap(2) to set the color of named objects to zero before the virtual address is selected. Previously, the color setting was delayed until after the virtual address was selected. In rtld, this delay effectively prevented the mapping of a shared library's code section using superpages. Now, for example, we see the first 1 MB of libc's code on armv6 mapped by a superpage after we've gotten through the initial cold misses that bring the first 1 MB of code into memory. (With the page clustering that we perform on read faults, this happens quickly.) Differential Revision: https://reviews.freebsd.org/D2013 Reviewed by: jhb, kib Tested by: Svatopluk Kraus (armv6) MFC after: 6 weeks
|
#
4d6481a4 |
|
17-Mar-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
o Enhance vm_pager_free_nonreq() function: - Allow to call the function with vm object lock held. - Allow to specify reqpage that doesn't match any page in the region, meaning freeing all pages. o Utilize the new function in couple more places in vnode pager. Reviewed by: alc, kib Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
41c895a8 |
|
16-Mar-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Provide a comment explaining r279688. Suggested by: alc
|
#
2c0cb026 |
|
10-Mar-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix function name in comment.
|
#
73e9030e |
|
06-Mar-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- In vnode_pager_generic_getpages() use different free counters for synchronous and asynchronous requests. The latter can saturate the I/O and we do not want them to affect regular paging. - Allocate the pbuf at the very beginning of the function, so that if we are low on certain kind of pbufs don't even proceed to BMAP, but sleep. Reviewed by: kib Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
#
1bb5ad63 |
|
24-Nov-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
We already have "int i" in this scope. Submitted by: alc
|
#
90effb23 |
|
22-Nov-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Merge from projects/sendfile: o Provide a new VOP_GETPAGES_ASYNC(), which works like VOP_GETPAGES(), but doesn't sleep. It returns immediately, and will execute the I/O done handler function that must be supplied as argument. o Provide VOP_GETPAGES_ASYNC() for the FFS, which uses vnode_pager. o Extend pagertab to support pgo_getpages_async method, and implement this method for vnode_pager. Reviewed by: kib Tested by: pho Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
79f0deb9 |
|
19-Nov-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Use __func__ in KASSERTs, since the code is about to be moved to other place. Sponsored by: Nginx, Inc.
|
#
2a5eef69 |
|
19-Nov-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
In vnode_pager_generic_getpages() vp->v_mount is dereferenced in the beginning, thus can't be NULL. Sponsored by: Nginx, Inc.
|
#
e122dfc1 |
|
18-Nov-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Collapse three contiguous comment blocks into one. Remove historical note about wrong assumptions 20 years ago. Use proper casing. Sponsored by: Nginx, Inc.
|
#
a7fecb4d |
|
15-Sep-2014 |
Alan Cox <alc@FreeBSD.org> |
Three improvements to vnode_pager_generic_getpages(): Eliminate an exclusive object lock acquisition and release on the expected execution path. Do page zeroing before the object lock is acquired rather than during the time that the object lock is held. Use vm_pager_free_nonreq() to eliminate duplicated code. Reviewed by: kib MFC after: 6 weeks Sponsored by: EMC / Isilon Storage Division
|
#
d15b55c5 |
|
14-Sep-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Provide the unique implementation for the VOP_GETPAGES() method used by ffs and ext2fs. Remove duplicated call to vm_page_zero_invalid(), done by VOP and by vm_pager_getpages(). Use vm_pager_free_nonreq(). Reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation MFC after: 6 weeks (after r271596)
|
#
396b3e34 |
|
14-Sep-2014 |
Alan Cox <alc@FreeBSD.org> |
Avoid an exclusive acquisition of the object lock on the expected execution path through the NFS clients' getpages functions. Introduce vm_pager_free_nonreq(). This function can be used to eliminate code that is duplicated in many getpages functions. Also, in contrast to the code that currently appears in those getpages functions, vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one case. Reviewed by: kib MFC after: 6 weeks Sponsored by: EMC / Isilon Storage Division
|
#
33cad9e9 |
|
14-Sep-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix mis-spelling of bits and types names in the vnode_pager_putpages(). The changes should not modify the generated code. The pager->pgo_putpages() method takes int flags as its fourth argument, while vnode_pager_putpages() used boolean_t (which is typedef'ed to int). The flags are from VM_PAGER_* namespace, while vnode_pager_putpages() passed TRUE and OBJPC_SYNC to VOP_PUTPAGES(), which both are numerically equal to VM_PAGER_PUT_SYNC. Noted and reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
27ad26d8 |
|
09-Sep-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Remove unused arguments for VOP_GETPAGES(), VOP_PUTPAGES().
|
#
44f1c916 |
|
22-Mar-2014 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Rename global cnt to vm_cnt to avoid shadowing. To reduce the diff struct pcpu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI. Bump __FreeBSD_version in case some out-of-tree module/code relies on the global cnt variable. Exp-run revealed no ports using it directly. No objection from: arch@ Sponsored by: EMC / Isilon Storage Division
|
#
7ebba1f8 |
|
20-Jan-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
ANSIfy declarations. Ok'ed by: alc
|
#
c7aebda8 |
|
09-Aug-2013 |
Attilio Rao <attilio@FreeBSD.org> |
The soft and hard busy mechanisms rely on the vm object lock to work. Unify the 2 concepts into a real, minimal, sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavily changed, which is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
|
#
c93dcf22 |
|
23-Jul-2013 |
Jeff Roberson <jeff@FreeBSD.org> |
- Correct a stale comment. We don't have vclean() anymore. The work is done by vgonel() and destroy_vobject() should only be called once from VOP_INACTIVE(). Sponsored by: EMC / Isilon Storage Division
|
#
9b8851fa |
|
28-Apr-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Assert that the object type for the vnode's non-NULL v_object, passed to vnode_pager_setsize(), is either OBJT_VNODE, or, if vnode was already reclaimed, OBJT_DEAD. Note that the latter is only possible because some filesystems, in particular, nfsiods from nfs clients, call vnode_pager_setsize() with unlocked vnode. Moreover, if the object is terminated, do not perform the resizing operation. Reviewed by: alc Tested by: pho, bf MFC after: 1 week
|
#
6ded8427 |
|
28-Apr-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Convert panic() into KASSERT(). Reviewed by: alc MFC after: 1 week
|
#
6991ee13 |
|
20-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix the logic inversion in the r248512. Noted by: mckay
|
#
6ce697dc |
|
19-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Pass unmapped buffers for page in requests if the filesystem indicated support for the unmapped i/o. Sponsored by: The FreeBSD Foundation Tested by: pho
|
#
70e198dd |
|
14-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Some style fixes. Sponsored by: The FreeBSD Foundation
|
#
89f6b863 |
|
08-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but a few notes are reported: * The KPI changes as follows: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
64a3476f |
|
26-Feb-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Remove white spaces. Sponsored by: EMC / Isilon storage division
|
#
0dde287b |
|
26-Feb-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Wrap the sleeps synchronized by the vm_object lock into the specific macro VM_OBJECT_SLEEP(). This hides some implementation details like the usage of the msleep() primitive and the necessity to access to the lock address directly. For this reason VM_OBJECT_MTX() macro is now retired. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho
|
#
140dedb8 |
|
02-Nov-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks
|
#
5050aa86 |
|
22-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
|
#
877d24ac |
|
28-Sep-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix the mis-handling of the VV_TEXT on the nullfs vnodes. If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks
|
#
b6c00483 |
|
14-Aug-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not leave invalid pages in the object after the short read for network file systems (not only NFS proper). Short reads cause pages other than the requested one, which were not filled by read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As a result, not requested invalid pages are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week
|
#
1c771f92 |
|
05-Aug-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. The other big dependency of vm_page.h on vm_param.h is the PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks
|
#
0055cbd3 |
|
04-Aug-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Reduce code duplication and exposure of direct access to struct vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other than the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week
|
#
db9ba578 |
|
16-Jun-2012 |
Attilio Rao <attilio@FreeBSD.org> |
Do a more targeted check on the page cache and avoid to check the cache pointer directly in vnode_pager_setsize() by using newly introduced vm_page_is_cached() function. Reviewed by: alc MFC after: 2 weeks X-MFC: r234039,234064
|
#
6031c68d |
|
16-Jun-2012 |
Alan Cox <alc@FreeBSD.org> |
The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer. Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer. As an added bonus, tidy up some nearby comments concerning page flags. Reviewed by: kib MFC after: 6 weeks
|
#
1faacf5d |
|
28-Mar-2012 |
Kirk McKusick <mckusick@FreeBSD.org> |
Keep track of the mount point associated with a special device to enable the collection of counts of synchronous and asynchronous reads and writes for its associated filesystem. The counts are displayed using `mount -v'. Ensure that buffers used for paging indicate the vnode from which they are operating so that counts of paging I/O operations from the filesystem are collected. This checkin only adds the setting of the mount point for the UFS/FFS filesystem, but it would be trivial to add the setting and clearing of the mount point at filesystem mount/unmount time for other filesystems too. Reviewed by: kib
|
#
b47f6241 |
|
08-Mar-2012 |
John Baldwin <jhb@FreeBSD.org> |
Add KTR_VFS traces to track modifications to a vnode's writecount.
|
#
84110e7e |
|
23-Feb-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Account the writeable shared mappings backed by file in the vnode v_writecount. Keep the amount of the virtual address space used by the mappings in the new vm_object un_pager.vnp.writemappings counter. The vnode v_writecount is incremented when writemappings gets non-zero value, and decremented when writemappings is returned to zero. Writeable shared vnode-backed mappings are accounted for in vm_mmap(), and vm_map_insert() is instructed to set MAP_ENTRY_VN_WRITECNT flag on the created map entry. During deferred map entry deallocation, vm_map_process_deferred() checks for MAP_ENTRY_VN_WRITECNT and decrements writemappings for the vm object. Now, the writeable mount cannot be demoted to read-only while writeable shared mappings of the vnodes from the mount point exist. Also, execve(2) fails for such files with ETXTBSY, as it should. Noted by: tegge Reviewed by: tegge (long time ago, early version), alc Tested by: pho MFC after: 3 weeks
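
The counting rule described above (v_writecount follows the zero/non-zero transitions of writemappings) can be sketched in userspace C. This is only an illustration under stated assumptions: toy_vnode, toy_object, toy_map_writeable and toy_unmap_writeable are hypothetical stand-ins, not kernel structures or functions, and all locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: only the writer count is modeled. */
struct toy_vnode {
	int v_writecount;
};

/* Hypothetical stand-in for a vnode-backed vm object. */
struct toy_object {
	struct toy_vnode *vp;
	size_t writemappings;	/* bytes covered by writeable shared mappings */
};

/* Account a new writeable shared mapping of len bytes. */
static void
toy_map_writeable(struct toy_object *obj, size_t len)
{
	if (len == 0)
		return;
	if (obj->writemappings == 0)
		obj->vp->v_writecount++;	/* zero -> non-zero transition */
	obj->writemappings += len;
}

/* Drop the accounting for len bytes of a writeable shared mapping. */
static void
toy_unmap_writeable(struct toy_object *obj, size_t len)
{
	assert(len <= obj->writemappings);
	if (len == 0)
		return;
	obj->writemappings -= len;
	if (obj->writemappings == 0)
		obj->vp->v_writecount--;	/* non-zero -> zero transition */
}
```

With this rule the vnode sees one write reference for the object no matter how many writeable mappings exist, which is enough for checks that only ask whether v_writecount is non-zero.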
|
#
dc874f98 |
|
30-Nov-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Rename vm_page_set_valid() to vm_page_set_valid_range(). The vm_page_set_valid() is the most reasonable name for the m->valid accessor. Reviewed by: attilio, alc
|
#
561cc9fc |
|
05-Nov-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Provide typedefs for the type of bit mask for the page bits. Use the defined types instead of int when manipulating masks. Supposedly, it could fix support for 32KB page size in the machine-independent VM layer. Reviewed by: alc MFC after: 2 weeks
|
#
98601346 |
|
14-Oct-2011 |
John Baldwin <jhb@FreeBSD.org> |
Fix a typo in a comment.
|
#
3407fefe |
|
06-Sep-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags. Document the changes to flags field to only require the page lock. Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
|
#
6bbee8e2 |
|
29-Jun-2011 |
Alan Cox <alc@FreeBSD.org> |
Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib
|
#
9d17da3b |
|
11-Jun-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix a bug in r222586. Lock the page owner object around the modification of the m->dirty. Reported and tested by: nwhitehorn Reviewed by: alc
|
#
031ec8c1 |
|
01-Jun-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
In the VOP_PUTPAGES() implementations, change the default error from VM_PAGER_AGAIN to VM_PAGER_ERROR for the unwritten pages. Return VM_PAGER_AGAIN for the partially written page. Always forward at least one page in the loop of vm_object_page_clean(). VM_PAGER_ERROR causes the page reactivation and does not clear the page dirty state, so the write is not lost. The change fixes an infinite loop in vm_object_page_clean() when the filesystem returns permanent errors for some page writes. Reported and tested by: gavin Reviewed by: alc, rmacklem MFC after: 1 week
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
c8fa8709 |
|
02-Jun-2010 |
Alan Cox <alc@FreeBSD.org> |
Minimize the use of the page queues lock for synchronizing access to the page's dirty field. With the exception of one case, access to this field is now synchronized by the object lock.
|
#
c46b90e9 |
|
26-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib
|
#
03679e23 |
|
07-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Push down the page queues lock into vm_page_activate().
|
#
eb00b276 |
|
06-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Eliminate page queues locking around most calls to vm_page_free().
|
#
2965a453 |
|
29-Apr-2010 |
Kip Macy <kmacy@FreeBSD.org> |
On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib
|
#
f407eeb3 |
|
25-Feb-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r204205: Remove write-only variable.
|
#
d7de6e2c |
|
22-Feb-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove write-only variable. MFC after: 3 days
|
#
02d65815 |
|
07-Feb-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r202529: vunref() the vnode in vm object deallocation code for OBJT_VNODE appropriate number of times to prevent possible vnode reference leak.
|
#
b9f180d1 |
|
17-Jan-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
When a vnode-backed vm object is referenced, it increments the vnode reference count, and decrements it on dereference. If referenced object is deallocated, object type is reset to OBJT_DEAD. Consequently, all vnode references that are owned by object references are never released. vunref() the vnode in vm object deallocation code for OBJT_VNODE appropriate number of times to prevent leak. Add an assertion to the vm_pageout() to make sure that we never get reference on the vnode but then do not execute code to release it. In collaboration with: pho Reviewed by: alc MFC after: 3 weeks
|
#
9f80ce04 |
|
25-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Change the type of uio_resid member of struct uio from int to ssize_t. Note that this does not actually enable full-range i/o requests for 64-bit architectures, and is done now to update KBI only. Tested by: pho Reviewed by: jhb, bde (as part of the review of the bigger patch)
|
#
3364c323 |
|
23-Jun-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation is performed by kernel, e.g. md(4). Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
|
#
3c33df62 |
|
02-Jun-2009 |
Alan Cox <alc@FreeBSD.org> |
Correct a boundary case error in the management of a page's dirty bits by shm_dotruncate() and vnode_pager_setsize(). Specifically, if the length of a shared memory object or a file is truncated such that the length modulo the page size is between 1 and 511, then all of the page's dirty bits were cleared. Now, a dirty bit is cleared only if the corresponding block is truncated in its entirety.
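The boundary case can be illustrated with a small Python model. This is only a sketch of the rule stated in the message, not the kernel code: the 4096-byte page size, the 512-byte block per dirty bit, and the helper name `dirty_after_truncate` are assumptions made for illustration.

```python
PAGE_SIZE = 4096                 # assumed page size
DEV_BSIZE = 512                  # assumed block size covered by one dirty bit
BLOCKS_PER_PAGE = PAGE_SIZE // DEV_BSIZE


def dirty_after_truncate(dirty, base):
    """Model the dirty mask of the page that holds the new end of file.

    dirty: bitmask, bit i covers bytes [i * 512, (i + 1) * 512) of the page.
    base:  new length modulo PAGE_SIZE (0 < base < PAGE_SIZE).
    Only blocks that lie entirely beyond the new EOF lose their dirty bit.
    """
    if not 0 < base < PAGE_SIZE:
        raise ValueError("base must fall strictly inside the page")
    first_gone = -(-base // DEV_BSIZE)   # ceil(base / DEV_BSIZE)
    keep = (1 << first_gone) - 1         # bits of blocks still (partly) in the file
    return dirty & keep


if __name__ == "__main__":
    # Block 0 still holds 100 valid bytes, so its dirty bit must survive.
    print(hex(dirty_after_truncate(0xFF, 100)))
```

In this model a truncation that leaves 100 bytes in the page keeps bit 0 set, whereas the behavior described as incorrect would have cleared the whole mask.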
|
#
42eb4108 |
|
14-May-2009 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary clearing of the page's dirty mask from various getpages functions. Eliminate a stale comment.
|
#
12aa4fdc |
|
11-May-2009 |
Alan Cox <alc@FreeBSD.org> |
Eliminate gratuitous clearing of the page's dirty mask.
|
#
0d53a17b |
|
09-May-2009 |
Alan Cox <alc@FreeBSD.org> |
Fix a race involving vnode_pager_input_smlfs(). Specifically, in the case that vnode_pager_input_smlfs() zeroes the page, it should not mark the page as valid until after the page is zeroed. Otherwise, the page could be mapped for read access (e.g., by vm_map_pmap_enter()) before the page is zeroed. Reviewed by: tegge Eliminate gratuitous clearing of the page's dirty mask by vnode_pager_input_smlfs(). Instead, assert that the page is clean. Reviewed by: tegge Eliminate some blank lines. Eliminate pointless calls to pmap_clear_modify() and vm_page_undirty() from vnode_pager_input_old(). The page is not mapped. Therefore, it cannot have any page table entries that are modified. Eliminate an incorrect comment from vnode_pager_generic_getpages().
|
#
3a2cdcb0 |
|
04-May-2009 |
Alan Cox <alc@FreeBSD.org> |
Eliminate vnode_pager_input_smlfs()'s pointless call to pmap_clear_modify(). The page can't possibly have any modified page table entries because it isn't even mapped.
|
#
016a3c93 |
|
24-Apr-2009 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary calls to pmap_clear_modify(). Specifically, calling pmap_clear_modify() on a page is pointless if that page is not mapped or it is only mapped for read access. Instead, assert that the page is not mapped or not mapped for write access as appropriate. Eliminate unnecessary clearing of a page's dirty mask. Instead, assert that the page's dirty mask is clear.
|
#
5bd65606 |
|
09-Mar-2009 |
John Baldwin <jhb@FreeBSD.org> |
Adjust some variables (mostly related to the buffer cache) that hold address space sizes to be longs instead of ints. Specifically, the following values are now longs: runningbufspace, bufspace, maxbufspace, bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace, hirunningspace, maxswzone, maxbcache, and maxpipekva. Previously, a relatively small number (~ 44000) of buffers set in kern.nbuf would result in integer overflows resulting either in hangs or bogus values of hidirtybuffers and lodirtybuffers. Now one has to overflow a long to see such problems. There was a check for a nbuf setting that would cause overflows in the auto-tuning of nbuf. I've changed it to always check and cap nbuf but warn if a user-supplied tunable would cause overflow. Note that this changes the ABI of several sysctls that are used by things like top(1), etc., so any MFC would probably require some gross shims to allow for that. MFC after: 1 month
|
#
cb61d698 |
|
09-Feb-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Comment out the assertion from r188321. It is not valid for nfs. Reported by: alc
|
#
7b54b1a9 |
|
08-Feb-2009 |
Alan Cox <alc@FreeBSD.org> |
Eliminate OBJ_NEEDGIANT. After r188331, OBJ_NEEDGIANT's only use is by a redundant assertion in vm_fault(). Reviewed by: kib
|
#
d2bf64c3 |
|
08-Feb-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not sleep for vnode lock while holding map lock in vm_fault. Try to acquire vnode lock for OBJT_VNODE object after map lock is dropped. Because we have the busy page(s) in the object, sleeping there would result in deadlock with vnode resize. Try to get lock without sleeping, and, if the attempt failed, drop the state, lock the vnode, and restart the fault handler from the start with already locked vnode. Because the vnode_pager_lock() function is inlined in vm_fault(), axe it. Based on suggestion by: alc Reviewed by: tegge, alc Tested by: pho
|
#
705f0a82 |
|
08-Feb-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
Assert that vnode is exclusively locked when its vm object is resized. Reviewed by: tegge
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
0359a12e |
|
28-Aug-2008 |
Attilio Rao <attilio@FreeBSD.org> |
Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and thus not useful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
3cca4b6f |
|
30-Jul-2008 |
John Baldwin <jhb@FreeBSD.org> |
A few more whitespace fixes.
|
#
24bbc85b |
|
30-Jul-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
The behaviour of the lockmgr going back at least to the 4.4BSD-Lite2 was to downgrade the exclusive lock to shared one when exclusive lock owner requested shared lock. New lockmgr panics instead. The vnode_pager_lock function requests shared lock on the vnode backing the OBJT_VNODE, and can be called when the current thread already holds an exclusive lock on the vnode. For instance, it happens when handling page fault from the VOP_WRITE() uiomove that writes to the file, with the faulted in page fetched from the vm object backed by the same file. We then get the situation described above. Verify whether the vnode is already exclusively locked by the curthread and request recursed exclusive vnode lock instead of shared, if true. Reported by: gallatin Discussed with: attilio
|
#
11be8415 |
|
12-Jun-2008 |
Stephan Uphoff <ups@FreeBSD.org> |
Fix vm object creation locking to allow SHARED vnode locking for vnode_create_vobject. (Not currently used) Noticed by: kib@
|
#
2ac78f0e |
|
20-May-2008 |
Stephan Uphoff <ups@FreeBSD.org> |
Allow VM object creation in ufs_lookup. (If vfs.vmiodirenable is set) Directory IO without a VM object will store data in 'malloced' buffers severely limiting caching of the data. Without this change VM objects for directories are only created on an open() of the directory. TODO: Inline test if VM object already exists to avoid locking/function call overhead. Tested by: kris@ Reviewed by: jeff@ Reported by: David Filo
|
#
22db15c0 |
|
13-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the useless extra argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
cb05b60a |
|
09-Jan-2008 |
Attilio Rao <attilio@FreeBSD.org> |
vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This change makes the code cleaner and, in particular, removes an annoying dependency, helping the next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
82cfdd5a |
|
22-Nov-2007 |
Alan Cox <alc@FreeBSD.org> |
Remove an unnecessary call to pmap_remove_all() and the associated "XXX" comments from vnode_pager_setsize(). This call was introduced in revision 1.140 to address a problem that no longer exists. Specifically, pmap_zero_page_area() has replaced a (possibly) problematic implementation of page zeroing that was based on vm_pager_map(), bzero(), and vm_pager_unmap().
|
#
0ab3c7a5 |
|
22-Oct-2007 |
Alan Cox <alc@FreeBSD.org> |
Correct an error of omission in the reimplementation of the page cache: vnode_pager_setsize() must handle the case where a file is truncated to a non-page-size-aligned boundary and there is a cached page underlying the new end of file. Reported by: kris, tegge Tested by: kris MFC after: 3 days
|
#
57fd3d55 |
|
26-Jul-2007 |
Pawel Jakub Dawidek <pjd@FreeBSD.org> |
When we do open, we should lock the vnode exclusively. This fixes a few races: - fifo race, where two threads assign v_fifoinfo, - v_writecount modifications, - v_object modifications, - and probably more... Discussed with: kib, ups Approved by: re (rwatson)
|
#
b4b70819 |
|
04-Jun-2007 |
Attilio Rao <attilio@FreeBSD.org> |
Do proper "locking" for missing vmmeters part. Now, we assume no more sched_lock protection for some of them and use the distributed loads method for vmmeter (distributed through CPUs). Reviewed by: alc, bde Approved by: jeff (mentor)
|
#
2feb50bf |
|
31-May-2007 |
Attilio Rao <attilio@FreeBSD.org> |
Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)
|
#
222d0195 |
|
18-May-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
7e2393ff |
|
14-Oct-2006 |
Alan Cox <alc@FreeBSD.org> |
Long ago, revision 1.22 of vm/vm_pager.h introduced a bug. Specifically, it introduced a check after the call to file system's get pages method that assumes that the get pages method does not change the array of pages that is passed to it. In the case of vnode_pager_generic_getpages(), this assumption has been incorrect. The contents of the array of pages may be shifted by vnode_pager_generic_getpages(). Likely, the problem has been hidden by vnode_pager_haspage() limiting the set of pages that are passed to vnode_pager_generic_getpages() such that a shift never occurs. The fix implemented herein is to adjust the pointer to the array of pages rather than shifting the pages within the array. MFC after: 3 weeks Fix suggested by: tegge
|
#
bff76343 |
|
14-Oct-2006 |
Alan Cox <alc@FreeBSD.org> |
Change vnode_pager_addr() such that on returning it distinguishes between an error returned by VOP_BMAP() and a hole in the file. Change the callers to vnode_pager_addr() such that they return VM_PAGER_ERROR when VOP_BMAP fails instead of a zero-filled page. Reviewed by: tegge MFC after: 3 weeks
|
#
1de11f1a |
|
10-Oct-2006 |
Alan Cox <alc@FreeBSD.org> |
Distinguish between two distinct kinds of errors from VOP_BMAP() in vnode_pager_generic_getpages(): (1) that VOP_BMAP() is unsupported by the underlying file system and (2) an error in performing the VOP_BMAP(). Previously, vnode_pager_generic_getpages() assumed that all errors were of the first type. If, in fact, the error was of the second type, the likely outcome was for the process to become permanently blocked on a busy page. MFC after: 3 weeks Reviewed by: tegge
|
#
f4f83da0 |
|
08-Oct-2006 |
Alan Cox <alc@FreeBSD.org> |
Change vnode_pager_generic_getpages() so that it does not panic if the given file is sparse. Instead, it zeroes the requested page. Reviewed by: tegge PR: kern/98116 MFC after: 3 days
|
#
5786be7c |
|
09-Aug-2006 |
Alan Cox <alc@FreeBSD.org> |
Introduce a field to struct vm_page for storing flags that are synchronized by the lock on the object containing the page. Transition PG_WANTED and PG_SWAPINPROG to use the new field, eliminating the need for holding the page queues lock when setting or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to VPO_WANTED and VPO_SWAPINPROG, respectively. Eliminate the assertion that the page queues lock is held in vm_page_io_finish(). Eliminate the acquisition and release of the page queues lock around calls to vm_page_io_finish() in kern_sendfile() and vfs_unbusy_pages().
|
#
3b582b4e |
|
02-Mar-2006 |
Tor Egge <tegge@FreeBSD.org> |
Eliminate a deadlock when creating snapshots. Blocking vn_start_write() must be called without any vnode locks held. Remove calls to vn_start_write() and vn_finished_write() in vnode_pager_putpages() and add these calls before the vnode lock is obtained to most of the callers that don't already have them.
|
#
b73f64c4 |
|
06-Feb-2006 |
Jeff Roberson <jeff@FreeBSD.org> |
- Fix silly VI locking that is used to check a single flag. The vnode lock also protects this flag so it is not necessary. - Don't rely on v_mount to detect whether or not we've been recycled, use the more appropriate VI_DOOMED instead. Sponsored by: Isilon Systems, Inc. MFC After: 1 week
|
#
731959b1 |
|
31-Jan-2006 |
Yaroslav Tykhiy <ytykhiy@gmail.com> |
Use off_t for file size passed to vnode_create_vobject(). The former type, size_t, was causing truncation to 32 bits on i386, which immediately led to undersizing of VM objects backed by files >4GB. In particular, sendfile(2) was broken for such files. PR: kern/92243 MFC after: 5 days
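The truncation described above is plain modular arithmetic. The following Python sketch models a 32-bit size_t with a mask; the 5 GiB file, the 4096-byte page size, and the helper name `as_size_t_32` are assumptions chosen for illustration and do not appear in the commit.

```python
PAGE_SIZE = 4096          # assumed page size


def as_size_t_32(n):
    """Model storing a byte count in a 32-bit size_t (i386): keep the low 32 bits."""
    return n & 0xFFFFFFFF


file_size = 5 * 2**30                  # hypothetical 5 GiB file
truncated = as_size_t_32(file_size)    # what a 32-bit size_t would hold

pages_needed = file_size // PAGE_SIZE
pages_sized = truncated // PAGE_SIZE

if __name__ == "__main__":
    print(pages_needed, pages_sized)
```

Under these assumptions the VM object would be sized for 1 GiB instead of 5 GiB, which matches the undersizing the message describes for files larger than 4GB.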
|
#
dd498bef |
|
01-Nov-2005 |
Paul Saab <ps@FreeBSD.org> |
Rate limit vnode_pager_putpages printfs to once a second.
|
#
857b66d5 |
|
13-Aug-2005 |
Alexander Kabaev <kan@FreeBSD.org> |
Do not use vm_pager_init() to initialize vnode_pbuf_freecnt variable. vm_pager_init() is run before required nswbuf variable has been set to correct value. This caused system to run with single pbuf available for vnode_pager. Handle both cluster_pbuf_freecnt and vnode_pbuf_freecnt variable in the same way. Reported by: ade Obtained from: alc MFC after: 2 days
|
#
4f12e0ac |
|
08-Aug-2005 |
Suleiman Souhlal <ssouhlal@FreeBSD.org> |
Use atomic operations on runningbufspace. PR: kern/84318 Submitted by: ade MFC after: 3 days
|
#
d2d9c9ac |
|
18-May-2005 |
Alan Cox <alc@FreeBSD.org> |
Remove a stale comment concerning spl* usage.
|
#
f3aad9a6 |
|
18-May-2005 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Correct 32 vs 64 bit signedness issues. Approved by: pjd (mentor) MFC after: 2 weeks
|
#
ed4fe4f4 |
|
03-May-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Add a new object flag "OBJ_NEEDSGIANT". We set this flag if the underlying vnode requires Giant. - In vm_fault only acquire Giant if the underlying object has NEEDSGIANT set. - In vm_object_shadow inherit the NEEDSGIANT flag from the backing object.
|
#
6e4b2820 |
|
03-Apr-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Don't NULL the vnode's v_object pointer until after the object is torn down. If we have dirty pages, the putpages routine will need to know what the vnode's object is so that it may write out dirty pages. Pointy hat: phk Found by: obrien
|
#
f247a524 |
|
30-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- LK_NOPAUSE is a nop now. Sponsored by: Isilon Systems, Inc.
|
#
7747c038 |
|
14-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Don't directly adjust v_usecount, use vref() instead. Sponsored by: Isilon Systems, Inc.
|
#
1d39df3f |
|
14-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Retire OLOCK and OWANT. All callers hold the vnode lock when creating a vnode object. There has been an assert to prove this for some time. Sponsored by: Isilon Systems, Inc.
|
#
493d78b3 |
|
12-Mar-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Don't acquire the vnode lock in destroy_vobject, assert that it has already been acquired by the caller. Sponsored by: Isilon Systems, Inc.
|
#
dfd4be14 |
|
19-Feb-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Try to unbreak the vnode locking around vop_reclaim() (based mostly on patch from kan@). Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on close. This is not yet a generally safe function, but for this very specific use it is safe. This solves the problem with buffers not being flushed by unmount or after failed mount attempts.
|
#
7146d6cb |
|
28-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move the contents of vop_stddestroyvobject() to the new vnode_pager function vnode_destroy_vobject(). Make the new function zero the vp->v_object pointer so we can tell if a call is missing.
|
#
d07a6d3f |
|
24-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move the body of vop_stdcreatevobject() over to the vnode_pager under the name Sande^H^H^H^H^Hvnode_create_vobject(). Make the new function take a size argument which removes the need for a VOP_STAT() or a very pessimistic guess for disks. Call that new function from vop_stdcreatevobject(). Make vnode_pager_alloc() private now that its only user came home.
|
#
35764be3 |
|
24-Jan-2005 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Kill the VV_OBJBUF and test the v_object for NULL instead.
|
#
ae51ff11 |
|
24-Jan-2005 |
Jeff Roberson <jeff@FreeBSD.org> |
- Remove GIANT_REQUIRED where giant is no longer required. - Use VFS_LOCK_GIANT() rather than directly acquiring giant in places where giant is only held because vfs requires it. Sponsored By: Isilon Systems, Inc.
|
#
60727d8b |
|
06-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for license, minor formatting changes
|
#
475e8cc3 |
|
25-Dec-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
fix comment
|
#
2ad036b6 |
|
07-Dec-2004 |
Alan Cox <alc@FreeBSD.org> |
Almost nine years ago, when support for 1TB files was introduced in revision 1.55, the address parameter to vnode_pager_addr() was changed from an unsigned 32-bit quantity to a signed 64-bit quantity. However, an out-of-range check on the address was not updated. Consequently, memory-mapped I/O on files greater than 2GB could cause a kernel panic. Since the address is now a signed 64-bit quantity, the problem resolution is simply to remove a cast. Reviewed by: bde@ and tegge@ PR: 73010 MFC after: 1 week
|
#
d8fed1d0 |
|
05-Dec-2004 |
Alan Cox <alc@FreeBSD.org> |
Correct a sanity check in vnode_pager_generic_putpages(). The cast used to implement the sanity check should have been changed when we converted the implementation of vm_pindex_t from 32 to 64 bits. (Thus, RELENG_4 is not affected.) The consequence of this error would be a legitimate write to an extremely large file being treated as an errant attempt to write metadata. Discussed with: tegge@
|
#
9c83534d |
|
15-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make VOP_BMAP return a struct bufobj for the underlying storage device instead of a vnode for it. The vnode_pager does not and should not have any interest in what the filesystem uses for backend. (vfs_cluster doesn't use the backing store argument.)
|
#
676f3ee2 |
|
15-Nov-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Explicitly call pbrelvp()
|
#
19187819 |
|
05-Nov-2004 |
Alan Cox <alc@FreeBSD.org> |
Move a call to wakeup() from vm_object_terminate() to vnode_pager_dealloc() because this call is only needed to wake threads that slept when they discovered a dead object connected to a vnode. To eliminate unnecessary calls to wakeup() by vnode_pager_dealloc(), introduce a new flag, OBJ_DISCONNECTWNT. Reviewed by: tegge@
|
#
6229cc50 |
|
26-Oct-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Also check that the sectormask is bigger than zero. Wrap this overly long KASSERT and remove newline.
|
#
5d9d81e7 |
|
26-Oct-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Put the I/O block size in bufobj->bo_bsize. We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably moot when filesystems sit on GEOM, so don't bother for now.
|
#
b792bebe |
|
24-Oct-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move the buffer method vector (buf->b_op) to the bufobj. Extend it with a strategy method. Add bufstrategy() which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places, add bufstrategy to them. Add inlines for bwrite() and bstrategy() which call through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
|
#
1a31a6c3 |
|
07-Sep-2004 |
Poul-Henning Kamp <phk@FreeBSD.org> |
add KASSERTS
|
#
0cb507cb |
|
18-Aug-2004 |
Alan Cox <alc@FreeBSD.org> |
Acquire and release Giant around a call to VOP_BMAP(). (This is a prerequisite to any further reduction in Giant's use by vm_fault().)
|
#
5a324893 |
|
05-May-2004 |
Alan Cox <alc@FreeBSD.org> |
Make vm_page's PG_ZERO flag immutable between the time of the page's allocation and deallocation. This flag's principal use is shortly after allocation. For such cases, clearing the flag is pointless. The only unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed page. Reviewed by: tegge@
|
#
87aefa49 |
|
23-Apr-2004 |
Alan Cox <alc@FreeBSD.org> |
Push down Giant into vm_pager_get_pages(). The only get pages methods that require Giant are in the device and vnode pagers.
|
#
9e0ddbd0 |
|
06-Apr-2004 |
Alan Cox <alc@FreeBSD.org> |
Eliminate vm_pager_map_page() and vm_pager_unmap_page() and their uses. Use sf_buf_alloc() and sf_buf_free() instead.
|
#
a6704857 |
|
03-Jan-2004 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the acquisition and release of Giant from vnode_pager_alloc(). The vm object and vnode locking should suffice. Discussed with: jeff
|
#
167a9eff |
|
15-Nov-2003 |
Tim J. Robbins <tjr@FreeBSD.org> |
In vnode_pager_input_smlfs(), call VOP_STRATEGY instead of VOP_SPECSTRATEGY on non-VCHR vnodes. This fixes a panic when reading data from files on a filesystem with a small (less than a page) block size. PR: 59271 Reviewed by: alc
|
#
52051abc |
|
24-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
- Call vnode_pager_input_old() with the vm object locked.
|
#
2e3b314d |
|
24-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
- Push down Giant from vm_pageout() to vm_pageout_scan(), freeing vm_pageout_page_stats() from Giant. - Modify vm_pager_put_pages() and vm_pager_page_unswapped() to expect the vm object to be locked on entry. (All of the pager routines now expect this.)
|
#
2bf43e43 |
|
19-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
- Hold the vm object's lock around calls to vm_page_set_validclean().
|
#
1b26eb10 |
|
18-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
- Synchronize access to a vm page's valid field using the containing vm object's lock. - Reduce the scope of the vm page queues lock in two places.
|
#
8b575f6c |
|
18-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
- Synchronize access to the page's valid field in vnode_pager_generic_getpages() using the containing object's lock.
|
#
2c18019f |
|
18-Oct-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
DuH! bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in the file)
|
#
9fbf91c0 |
|
18-Oct-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Initialize bp->b_offset before calling VOP_[SPEC]STRATEGY(). Remove stale comment about B_PHYS.
|
#
417a26a1 |
|
17-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
Add vm object locking to vnode_pager_lock(). (This triggers the movement of a VM_OBJECT_LOCK() in vm_fault().)
|
#
23562e4b |
|
28-Aug-2003 |
Marcel Moolenaar <marcel@FreeBSD.org> |
In vnode_pager_generic_putpages(), change the printf format specifier to long and explicitly cast field dirty of struct vm_page to unsigned long. When PAGE_SIZE is 32K, this field is actually unsigned long.
|
#
b7ad744d |
|
23-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Hold the page queues lock when performing vm_page_clear_dirty() and vm_page_set_invalid().
|
#
6a4b5823 |
|
18-Aug-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Replace a homegrown bdone()/bwait() implementation by the real thing
|
#
ec794849 |
|
17-Aug-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use NULL for 3rd argument of VOP_BMAP() rather than custom cast. Eliminate unused variable.
|
#
4e658600 |
|
05-Aug-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use sparse struct initializations for struct pagerops. This makes grepping for which pagers implement which methods easier.
|
#
f29ba63e |
|
22-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
Maintain a lock on the vm object of interest throughout vm_fault(), releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages() or vm_pager_has_pages()). If we sleep, we have marked the vm object with the paging-in-progress flag.
|
#
31953be9 |
|
17-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
Lock the vm object when freeing a vm page.
|
#
8630c117 |
|
12-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
Add vm object locking to various pagers' "get pages" methods, i386 stack management functions, and a u area management function.
|
#
874651b1 |
|
11-Jun-2003 |
David E. O'Brien <obrien@FreeBSD.org> |
Use __FBSDID().
|
#
2a8f9ab5 |
|
10-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
- Finish vm object and page locking in vnode_pager_setsize(). - Make some small style changes to vnode_pager_setsize(); most notably, move two comments to a more logical place.
|
#
658ad5ff |
|
05-May-2003 |
Alan Cox <alc@FreeBSD.org> |
Lock the vm_object when performing vm_pager_deallocate().
|
#
1ca58953 |
|
26-Apr-2003 |
Alan Cox <alc@FreeBSD.org> |
- Convert vm_object_pip_wait() from using tsleep() to msleep(). - Make vm_object_pip_sleep() static. - Lock the vm_object when performing vm_object_pip_wait().
|
#
49281fbf |
|
18-Apr-2003 |
Alan Cox <alc@FreeBSD.org> |
Update locking around vm_object_page_remove() to use the new macros.
|
#
b4b138c2 |
|
18-Mar-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
|
#
09c80124 |
|
05-Mar-2003 |
Alan Cox <alc@FreeBSD.org> |
Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress. Discussed on: arch@
|
#
ca94e7c4 |
|
13-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
We can get past here on a normal vnode as well, so use VOP_STRATEGY if so.
|
#
5266a767 |
|
05-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Convert VOP_STRATEGY to VOP_SPECSTRATEGY in the generic getpages and the pager input for small filesystems.
|
#
86270230 |
|
02-Jan-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
|
#
43b7990e |
|
28-Dec-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
Allow the VM object flushing code to cluster. When the filesystem syncer comes along and flushes a file which has been mmap()'d SHARED/RW, with dirty pages, it was flushing the underlying VM object asynchronously, resulting in thousands of 8K writes. With this change the VM Object flushing code will cluster dirty pages in 64K blocks. Note that until the low memory deadlock issue is reviewed, it is not safe to allow the pageout daemon to use this feature. Forced pageouts still use fs block size'd ops for the moment. MFC after: 3 days
|
#
475e8011 |
|
14-Dec-2002 |
Alan Cox <alc@FreeBSD.org> |
Perform vm_object_lock() and vm_object_unlock() around vm_object_page_remove().
|
#
85e01243 |
|
27-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Hold the page queues lock when performing pmap_clear_modify(). Approved by: re (blanket)
|
#
178949e0 |
|
23-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Hold the page queues/flags lock when calling vm_page_set_validclean(). Approved by: re
|
#
e8a27959 |
|
22-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Add page queue and flag locking in vnode_pager_setsize(). Approved by: re
|
#
4fec79be |
|
16-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Now that pmap_remove_all() is exported by our pmap implementations use it directly.
|
#
d154fb4f |
|
10-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than indirectly through vm_page_protect(). The one remaining page flag that is updated by vm_page_protect() is already being updated by our various pmap implementations. Note: A later commit will similarly change the VM_PROT_READ case and eliminate vm_page_protect().
|
#
bf1001fa |
|
07-Nov-2002 |
Maxime Henrion <mux@FreeBSD.org> |
Better printf() formats.
|
#
37c84183 |
|
28-Sep-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512
|
#
63e7e60d |
|
24-Sep-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
- Add an ASSERT_VOP_LOCKED in vnode_pager_alloc. - Lock access to v_iflags.
|
#
fff6062a |
|
24-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since pmap_zero_page() and pmap_zero_page_area() were modified to accept a struct vm_page * instead of a physical address, vm_page_zero_fill() and vm_page_zero_fill_area() have served no purpose.
|
#
e6e370a7 |
|
04-Aug-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
|
#
e43c2eab |
|
28-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Lock page queue accesses by vm_page_free(). o Apply some style fixes.
|
#
9d522888 |
|
26-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Lock page queue accesses by vm_page_activate() and vm_page_deactivate().
|
#
47e151dd |
|
01-Jul-2002 |
Robert Drehmel <robert@FreeBSD.org> |
- Use (OFF_TO_IDX(off) - pi) instead of (OFF_TO_IDX(off - IDX_TO_OFF(pi))).
- Reformat a comment.
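The two expressions in this change agree whenever off is at least IDX_TO_OFF(pi), because IDX_TO_OFF(pi) is page-aligned and subtracting it cannot change the low bits of off. A minimal user-space sketch of that equivalence follows. The macros are stand-ins that mirror the kernel names and assume 4 KiB pages, and the two function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel macros, assuming 4 KiB pages (PAGE_SHIFT = 12). */
#define PAGE_SHIFT	12
#define OFF_TO_IDX(off)	((uint64_t)(off) >> PAGE_SHIFT)
#define IDX_TO_OFF(idx)	((uint64_t)(idx) << PAGE_SHIFT)

/* Old form: subtract the byte offset of page pi, then convert to an index. */
uint64_t
pages_past_old(uint64_t off, uint64_t pi)
{
	/* The subtraction is only meaningful when off lies at or past page pi. */
	assert(off >= IDX_TO_OFF(pi));
	return (OFF_TO_IDX(off - IDX_TO_OFF(pi)));
}

/* New form: convert to an index first, then subtract the page index. */
uint64_t
pages_past_new(uint64_t off, uint64_t pi)
{
	assert(OFF_TO_IDX(off) >= pi);
	return (OFF_TO_IDX(off) - pi);
}
```

The new form saves one shift and keeps the arithmetic in page-index units.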
|
#
990ab7ad |
|
22-Jun-2002 |
Alan Cox <alc@FreeBSD.org> |
o Replace GIANT_REQUIRED in vnode_pager_alloc() by the acquisition and release of Giant. (Annotate as MPSAFE.)
o Also, in vnode_pager_alloc(), remove an unnecessary re-initialization of struct vm_object::flags and move a statement that is duplicated in both branches of an if-else.
|
#
d394511d |
|
16-May-2002 |
Tom Rhodes <trhodes@FreeBSD.org> |
More s/file system/filesystem/g
|
#
98b0c789 |
|
14-May-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make daddr_t and u_daddr_t 64bits wide. Retire daddr64_t and use daddr_t instead. Sponsored by: DARPA & NAI Labs.
|
#
c0b6bbb8 |
|
05-May-2002 |
Alan Cox <alc@FreeBSD.org> |
o Condition the compilation and use of vm_freeze_copyopts() on ENABLE_VFS_IOOPT.
|
#
44e74ba6 |
|
27-Apr-2002 |
Peter Wemm <peter@FreeBSD.org> |
We do not necessarily need to map/unmap pages to zero parts of them. On systems where physical memory is also direct mapped (alpha, sparc, ia64 etc) this is slightly harmful.
|
#
11caded3 |
|
19-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove __P.
|
#
0d2af521 |
|
15-Mar-2002 |
Kirk McKusick <mckusick@FreeBSD.org> |
Introduce the new 64-bit size disk block, daddr64_t. Change the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.
|
#
a1287949 |
|
10-Mar-2002 |
Eivind Eklund <eivind@FreeBSD.org> |
- Remove a number of extra newlines that do not belong here according to style(9)
- Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc.
- Add /* SYMBOL */ after a few #endifs.

Reviewed by: alc
|
#
a854ed98 |
|
27-Feb-2002 |
John Baldwin <jhb@FreeBSD.org> |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
#
3ebeaf59 |
|
13-Dec-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
This fixes a large number of bugs in our NFS client side code. A recent commit by Kirk also fixed a softupdates bug that could easily be triggered by server side NFS.

* An edge case with shared R+W mmap()'s and truncate whereby the system would inappropriately clear the dirty bits on still-dirty data. (applicable to all filesystems) THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING. see vm/vm_page.c line 1641
* The straddle case for VM pages and buffer cache buffers when truncating. (applicable to NFS client side)
* Possible SMP database corruption due to vm_pager_unmap_page() not clearing the TLB for the other cpu's. (applicable to NFS client side but could affect all filesystems). Note: not considered serious since the corruption occurs beyond the file EOF.
* When flushing a dirty buffer due to B_CACHE getting cleared, we were accidentally setting B_CACHE again (that is, bwrite() sets B_CACHE), when we really want it to stay clear after the write is complete. This resulted in a corrupt buffer. (applicable to all filesystems but probably only triggered by NFS)
* We have to call vtruncbuf() when ftruncate()ing to remove any buffer cache buffers. This is still tentative, I may be able to remove it due to the second bug fix. (applicable to NFS client side)
* vnode_pager_setsize() race against nfs_vinvalbuf()... we have to set n_size before calling nfs_vinvalbuf or the NFS code may recursively vnode_pager_setsize() to the original value before the truncate. This is what was causing the user mmap bus faults in the nfs tester program. (applicable to NFS client side)
* Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made by Kirk).

Testing program written by: Avadis Tevanian, Jr.
Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step')
MFC after: 1 week
|
#
33c67741 |
|
05-Nov-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Adjust vnode_pager_input_smlfs() to not attempt to BMAP blocks beyond the file EOF. This works around a bug in the ISOFS (CDRom) BMAP code which returns bogus values for requests beyond the file EOF rather than returning an error, resulting in either corrupt data being mmap()'d beyond the file EOF or resulting in a seg-fault on the last page of a mmap()'d file (mmap()s of CDRom files). Reported by: peter / Yahoo MFC after: 3 days
|
#
00a6f47f |
|
12-Oct-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Finally fix the VM bug where a file whose EOF occurs in the middle of a page would sometimes prevent a dirty page from being cleaned, even when synced, resulting in the dirty page being re-flushed to disk every 30-60 seconds or so, forever.

The problem is that when the filesystem flushes a page to its backing file it typically does not clear dirty bits representing areas of the page that are beyond the file EOF. If the file is also mmap()'d and a fault is taken, vm_fault (properly, is required to) set the vm_page_t->dirty bits to VM_PAGE_BITS_ALL. This combination could leave us with an uncleanable, unfreeable page.

The solution is to have the vnode_pager detect the edge case and manually clear the dirty bits representing areas beyond the file EOF. The filesystem does the rest and the page comes up clean after the write completes.

MFC after: 3 days
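The edge case described above can be modelled in user space. This is a sketch under stated assumptions, not the kernel code: it assumes a 4 KiB page tracked by one dirty bit per 512-byte chunk, and clear_dirty_past_eof() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE		4096u
#define DEV_BSIZE		512u
#define VM_PAGE_BITS_ALL	0xffu	/* 8 chunks of 512 bytes per 4 KiB page */

/*
 * Hypothetical helper: return the dirty mask with the bits for chunks lying
 * entirely beyond EOF cleared.  eof_in_page is the number of valid file
 * bytes in this page (0 < eof_in_page <= PAGE_SIZE).
 */
uint8_t
clear_dirty_past_eof(uint8_t dirty, unsigned eof_in_page)
{
	unsigned first_clean_chunk;
	uint8_t keep;

	assert(eof_in_page > 0 && eof_in_page <= PAGE_SIZE);
	/* Round EOF up to a chunk boundary: a partly valid chunk stays dirty. */
	first_clean_chunk = (eof_in_page + DEV_BSIZE - 1) / DEV_BSIZE;
	keep = (uint8_t)((1u << first_clean_chunk) - 1);
	return (dirty & keep);
}
```

With a fully dirty page and EOF 1000 bytes into it, only the first two chunk bits survive, so the filesystem's write of those chunks leaves the page clean.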
|
#
bd78cece |
|
11-Oct-2001 |
John Baldwin <jhb@FreeBSD.org> |
Change the kernel's ucred API as follows:
- crhold() returns a reference to the ucred whose refcount it bumps.
- crcopy() now simply copies the credentials from one credential to another and has no return value.
- a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
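The three semantics listed above can be illustrated with a toy model. Everything here is hypothetical: struct model_cred and the model_* functions are stand-ins written for this sketch and do not reproduce the kernel's struct ucred or its locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy credential: a reference count plus one payload field. */
struct model_cred {
	unsigned	ref;	/* number of holders */
	unsigned	uid;	/* stand-in for the credential contents */
};

/* Like crhold(): bump the refcount and hand the same pointer back. */
struct model_cred *
model_crhold(struct model_cred *cr)
{
	assert(cr->ref > 0);
	cr->ref++;
	return (cr);
}

/* Like crcopy(): copy the contents only; no return value, refcounts untouched. */
void
model_crcopy(struct model_cred *dst, const struct model_cred *src)
{
	dst->uid = src->uid;
}

/* Like crshared(): true when more than one holder exists. */
bool
model_crshared(const struct model_cred *cr)
{
	return (cr->ref > 1);
}
```

Returning the pointer from the hold operation lets a caller write an assignment such as `newref = model_crhold(cr)` in one step.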
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2

Note: ALL MODULES MUST BE RECOMPILED

Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.

Sorry john! (your next MFC will be a doozy!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
bd8e0d58 |
|
04-Aug-2001 |
John Baldwin <jhb@FreeBSD.org> |
Whitespace fixes.
|
#
54d92145 |
|
04-Jul-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
whitespace / register cleanup
|
#
0cddd8f0 |
|
04-Jul-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
b62b9b64 |
|
03-Jul-2001 |
John Baldwin <jhb@FreeBSD.org> |
Fix a XXX comment by moving the initialization of the number of pbuf's for the vnode pager to a new vnode pager init method instead of making it a hack in getpages().
|
#
342a1480 |
|
29-May-2001 |
John Baldwin <jhb@FreeBSD.org> |
Don't hold the VM lock across VOP's and other things that can sleep.
|
#
e6b961ff |
|
23-May-2001 |
John Baldwin <jhb@FreeBSD.org> |
- Assert Giant is held in the vnode pager methods. - Lock the VM while walking down a vm_object's backing_object list in vnode_pager_lock().
|
#
23955314 |
|
18-May-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
|
#
60fb0ce3 |
|
28-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Revert consequences of changes to mount.h, part 2. Requested by: bde
|
#
d98dc34f |
|
23-Apr-2001 |
Greg Lehey <grog@FreeBSD.org> |
Correct #includes to work with fixed sys/mount.h.
|
#
d8d5fa88 |
|
19-Apr-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
vnode_pager_freepage() is really vm_page_free() in disguise, nuke vnode_pager_freepage() and replace all calls to it with vm_page_free()
|
#
2b6b0df7 |
|
26-Dec-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
This implements a better launder limiting solution. There was a solution in 4.2-REL which I ripped out in -stable and -current when implementing the low-memory handling solution. However, maxlaunder turns out to be the saving grace in certain very heavily loaded systems (e.g. newsreader box). The new algorithm limits the number of pages laundered in the first pageout daemon pass. If that is not sufficient then successive passes will be run without any limit.

Write I/O is now pipelined using two sysctls, vfs.lorunningspace and vfs.hirunningspace. This prevents excessive buffered writes in the disk queues which cause long (multi-second) delays for reads. It leads to more stable (less jerky) and generally faster I/O streaming to disk by allowing required read ops (e.g. for indirect blocks and such) to occur without interrupting the write stream, among other things.

NOTE: eventually, filesystem write I/O pipelining needs to be done on a per-device basis. At the moment it is globalized.
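A two-sysctl pipeline of this kind is commonly built as a high/low watermark pair. The sketch below is an assumption about the general technique, not a copy of the committed code: the message does not spell out the exact rule, so the stall-above-high and resume-at-low behaviour, struct write_throttle, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a two-watermark write throttle. */
struct write_throttle {
	long	lo;		/* resume writers at or below this many bytes */
	long	hi;		/* stall writers above this many bytes */
	long	running;	/* bytes of write I/O currently in flight */
	bool	stalled;	/* true while new writers must wait */
};

/* Account for a write being issued; stall once the high mark is exceeded. */
void
throttle_issue(struct write_throttle *t, long bytes)
{
	assert(bytes > 0);
	t->running += bytes;
	if (t->running > t->hi)
		t->stalled = true;
}

/* Account for a write completing; resume only after draining to the low mark. */
void
throttle_complete(struct write_throttle *t, long bytes)
{
	assert(bytes > 0 && bytes <= t->running);
	t->running -= bytes;
	if (t->stalled && t->running <= t->lo)
		t->stalled = false;
}
```

The gap between the two marks is what keeps the queue from oscillating: writers stay stalled until a meaningful amount of I/O has drained, which leaves room for reads to be serviced.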
|
#
f2a2857b |
|
11-Jul-2000 |
Kirk McKusick <mckusick@FreeBSD.org> |
Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again.

Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed.

Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
|
#
0385347c |
|
20-May-2000 |
Peter Wemm <peter@FreeBSD.org> |
Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or its shadow pv_table entry.

Inspired by: John Dyson's 1998 patches.

Also: Eliminate pv_table as a separate thing and build it into a machine dependent part of vm_page_t. This eliminates having a separate set of structures that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha).

Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits).

Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files.

Reviewed by: dillon
|
#
9626b608 |
|
05-May-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
|
#
c244d2de |
|
02-Apr-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move B_ERROR flag to b_ioflags and call it BIO_ERROR. (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.
|
#
5929bcfa |
|
27-Mar-2000 |
Philippe Charnier <charnier@FreeBSD.org> |
Revert spelling mistake I made in the previous commit Requested by: Alan and Bruce
|
#
956f3135 |
|
26-Mar-2000 |
Philippe Charnier <charnier@FreeBSD.org> |
Spelling
|
#
b99c307a |
|
20-Mar-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)

This patch is machine generated except for the ccd.c and buf.h parts.
|
#
21144e3b |
|
20-Mar-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch was machine generated (Thanks to style(9) compliance!)

Vinum users: Greg has not had time to test this yet, be careful.
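Both points in this message, the zero-valued write flag and the exactly-one-bit rule, are easy to demonstrate in isolation. The sketch below is illustrative only: the OLD_* and CMD_* constants and all three function names are hypothetical, and the flag values are chosen for the example rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative old-style flags: the write "flag" is zero, as the message says. */
#define OLD_B_READ	0x00100000u
#define OLD_B_WRITE	0x00000000u

/* Hypothetical command values for the new field; each is a distinct single bit. */
#define CMD_READ	0x1u
#define CMD_WRITE	0x2u
#define CMD_FREEBUF	0x4u

/* The tempting but broken test: masking with zero is never true. */
bool
old_is_write_broken(unsigned flags)
{
	return ((flags & OLD_B_WRITE) != 0);
}

/* A command is valid only if exactly one bit is set. */
bool
iocmd_is_valid(unsigned cmd)
{
	return (cmd != 0 && (cmd & (cmd - 1)) == 0);
}

/* Store a command, enforcing the one-bit rule at the point of assignment. */
void
set_iocmd(unsigned *iocmd, unsigned cmd)
{
	assert(iocmd_is_valid(cmd));
	*iocmd = cmd;
}
```

Giving every command a nonzero bit means a forgotten initialization (zero) is caught by the validity check instead of silently meaning "write".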
|
#
923502ff |
|
29-Oct-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
24579ca1 |
|
16-Sep-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
The vnode pager (used when you do file-backed mmaps) must use the underlying physical sector size when aligning I/O transfer sizes. It cannot assume 512 bytes. We assume the underlying sector size is a power of 2. If it isn't, mmap() will break badly anyway (in the same way mmap broke with NFS when NFS tried to cache piecemeal write ranges in buffers, before we enforced read-buffer-before-write-piecemeal for NFS). Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
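The power-of-2 assumption is what makes sector alignment a single mask operation. A minimal sketch follows; round_to_sector() is a hypothetical helper written for this example, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a transfer size up to a multiple of the sector size.
 * secsize must be a nonzero power of 2, so (secsize - 1) is a valid mask.
 */
uint64_t
round_to_sector(uint64_t size, uint32_t secsize)
{
	uint64_t mask;

	assert(secsize != 0 && (secsize & (secsize - 1)) == 0);
	mask = (uint64_t)secsize - 1;
	return ((size + mask) & ~mask);
}
```

A 1000-byte tail rounds to 1024 bytes on a 512-byte-sector device but to 2048 bytes on a 2048-byte-sector device, which is why hard-coding 512 yields misaligned transfers on the latter.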
|
#
c3aac50f |
|
27-Aug-1999 |
Peter Wemm <peter@FreeBSD.org> |
$Id$ -> $FreeBSD$
|
#
2c28a105 |
|
16-Aug-1999 |
Alan Cox <alc@FreeBSD.org> |
Add the (inline) function vm_page_undirty for clearing the dirty bitmask of a vm_page. Use it. Submitted by: dillon
|
#
3efc015b |
|
01-Jul-1999 |
Peter Wemm <peter@FreeBSD.org> |
Fix some int/long printf problems for the Alpha
|
#
67812eac |
|
25-Jun-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
|
#
54746b67 |
|
15-May-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Fix confusion of size of transfer with size of the pager. PR: 11658 Broken in: 1.89 (1998/03/07)
|
#
b0eeea20 |
|
06-May-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
remove b_proc from struct buf, it's (now) unused. Reviewed by: dillon, bde
|
#
4221e284 |
|
02-May-1999 |
Alan Cox <alc@FreeBSD.org> |
The VFS/BIO subsystem contained a number of hacks in order to optimize piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas.

The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most notably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistency issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE.

getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vice-versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly.

There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE.

Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes.

Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
|
#
897a45ef |
|
10-Apr-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Convert usage of vm_page_bits() to the new convention ("Inputs are required to range within a page").
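Under the quoted convention, a byte range inside one page maps to a mask with one bit per 512-byte chunk it touches. The sketch below models that mapping in user space. It is an approximation written for illustration: model_page_bits() is a hypothetical name, and 4 KiB pages with 512-byte chunks are assumed.

```c
#include <assert.h>

#define PAGE_SIZE	4096
#define DEV_BSHIFT	9	/* log2 of the 512-byte chunk size */

/*
 * Return a mask with one bit set for every 512-byte chunk touched by the
 * byte range [base, base + size).  The range must lie within a single page.
 */
int
model_page_bits(int base, int size)
{
	int first_bit, last_bit;

	assert(base >= 0 && size >= 0 && base + size <= PAGE_SIZE);
	if (size == 0)
		return (0);
	first_bit = base >> DEV_BSHIFT;
	last_bit = (base + size - 1) >> DEV_BSHIFT;
	/* Bits first_bit..last_bit inclusive. */
	return ((2 << last_bit) - (1 << first_bit));
}
```

Requiring callers to clip their ranges to the page first keeps the helper free of silent truncation: an out-of-range request is a caller bug rather than something to be quietly masked.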
|
#
8d17e694 |
|
05-Apr-1999 |
Julian Elischer <julian@FreeBSD.org> |
Catch a case spotted by Tor where files mmapped could leave garbage in the unallocated parts of the last page when the file ended on a frag but not a page boundary. Delimited by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF, in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c ufs/ufs/ufs_readwrite.c kern/vfs_bio.c Submitted by: Matt Dillon <dillon@freebsd.org> Reviewed by: Alan Cox <alc@freebsd.org>
|
#
4491ea91 |
|
26-Mar-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Correct a comment.
|
#
0e3cdf2c |
|
27-Feb-1999 |
Alan Cox <alc@FreeBSD.org> |
Reviewed by: "John S. Dyson" <dyson@iquest.net> Submitted by: Matthew Dillon <dillon@apollo.backplane.com> To prevent a deadlock, if we are extremely low on memory, force synchronous operation by the VOP_PUTPAGES in vnode_pager_putpages.
|
#
e4542174 |
|
23-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
vm_pager_put_pages() is passed an rcval array to hold per-page return values. The 'int' return value for the procedure was never used and not well defined in any case when there are mixed errors on pages, so it has been removed. vm_pager_put_pages() and associated vm_pager functions now return void.
|
#
1c7c3c6a |
|
21-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
af1f63c7 |
|
04-Dec-1998 |
Robert V. Baron <rvb@FreeBSD.org> |
In vnode_pager_input_old, set auio.uio_procp = curproc vs auio.uio_procp = (struct proc *) 0
|
#
6cde7a16 |
|
13-Oct-1998 |
David Greenman <dg@FreeBSD.org> |
Fixed two potentially serious classes of bugs:

1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly.

2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.
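The second class of bug can be reproduced in a few lines. The sketch below assumes the offending cast was to a 32-bit type, which the message implies but does not show; both function names are hypothetical and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE	4096
#define PAGE_MASK	(PAGE_SIZE - 1)

/* Illustrative broken form: the 32-bit cast discards the upper half of x. */
uint64_t
round_page_cast32(uint64_t x)
{
	return (((uint32_t)x + PAGE_MASK) & ~(uint32_t)PAGE_MASK);
}

/* Cast-free form: the arithmetic stays in the width of the argument. */
uint64_t
round_page_64(uint64_t x)
{
	/* Guard against wrap-around when adding the mask. */
	assert(x <= UINT64_MAX - PAGE_MASK);
	return ((x + PAGE_MASK) & ~(uint64_t)PAGE_MASK);
}
```

For an offset just past 4 GiB the truncating version returns a small value near zero while the 64-bit version returns the next page boundary above 4 GiB, which is the kind of scrambling the message describes.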
|
#
6e3a3f38 |
|
28-Sep-1998 |
Robert V. Baron <rvb@FreeBSD.org> |
John Dyson approved of this solution; make vnode_pager_input_old set m->valid
|
#
500b04a2 |
|
05-Sep-1998 |
Bruce Evans <bde@FreeBSD.org> |
Instantiate `nfs_mount_type' in a standard file so that it is present when nfs is an LKM. Declare it in a header file. Don't forget to use it in non-Lite2 code. Initialize it to -1 instead of to 0, since 0 will soon be the mount type number for the first vfs loaded. NetBSD uses strcmp() to avoid this ugly global.
|
#
e69763a3 |
|
04-Sep-1998 |
Doug Rabson <dfr@FreeBSD.org> |
Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.
|
#
c576d121 |
|
25-Aug-1998 |
Luoqi Chen <luoqi@FreeBSD.org> |
Fix a rounding problem that causes vnode pager to fail to remove the last partially filled page during a truncation. PR: kern/7422
|
#
069e9bc1 |
|
24-Aug-1998 |
Doug Rabson <dfr@FreeBSD.org> |
Change various syscalls to use size_t arguments instead of u_int. Add some overflow checks to read/write (from bde). Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptible. Reviewed by: Bruce Evans <bde@zeta.org.au>
|
#
fc62ef1f |
|
11-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed printf format errors.
|
#
ac1e407b |
|
11-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed printf format errors.
|
#
fd5d1124 |
|
04-Jul-1998 |
Julian Elischer <julian@FreeBSD.org> |
VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>
|
#
bef608bd |
|
15-Mar-1998 |
John Dyson <dyson@FreeBSD.org> |
Some VM improvements, including elimination of a lot of Sig-11 problems. Tor Egge and others have helped with various VM bugs lately, but don't blame him -- blame me!!!

pmap.c:
1) Create an object for kernel page table allocations. This fixes a bogus allocation method previously used for such, by grabbing pages from the kernel object, using bogus pindexes. (This was a code cleanup, and perhaps a minor system stability issue.)

pmap.c:
2) Pre-set the modify and accessed bits when prudent. This will decrease bus traffic under certain circumstances.

vfs_bio.c, vfs_cluster.c:
3) Rather than calculating the beginning virtual byte offset multiple times, stick the offset into the buffer header, so that the calculated offset can be reused. (Long long multiplies are often expensive, and this is a probably unmeasurable performance improvement, and code cleanup.)

vfs_bio.c:
4) Handle write recursion more intelligently (but not perfectly) so that it is less likely to cause a system panic, and is also much more robust.

vfs_bio.c:
5) getblk incorrectly wrote out blocks that are incorrectly sized. The problem is fixed, and writes blocks out ONLY when B_DELWRI is true.

vfs_bio.c:
6) Check that already constituted buffers have fully valid pages. If not, then make sure that the B_CACHE bit is not set. (This was a major source of Sig-11 type problems.)

vfs_bio.c:
7) Fix a potential system deadlock due to an incorrectly specified sleep priority while waiting for a buffer write operation. The change that I made opens the system up to serious problems, and we need to examine the issue of process sleep priorities.

vfs_cluster.c, vfs_bio.c:
8) Make clustered reads work more correctly (and more completely) when buffers are already constituted, but not fully valid. (This was another system reliability issue.)

vfs_subr.c, ffs_inode.c:
9) Create a vtruncbuf function, which is used by filesystems that can truncate files. The vinvalbuf forced a file sync type operation, while vtruncbuf only invalidates the buffers past the new end of file, and also invalidates the appropriate pages. (This was a system reliability and performance issue.)

10) Modify FFS to use vtruncbuf.

vm_object.c:
11) Make the object rundown mechanism for OBJT_VNODE type objects work more correctly. Included in that fix, create pager entries for the OBJT_DEAD pager type, so that paging requests that might slip in during race conditions are properly handled. (This was a system reliability issue.)

vm_page.c:
12) Make some of the page validation routines be a little less picky about arguments passed to them. Also, support page invalidation change the object generation count so that we handle generation counts a little more robustly.

vm_pageout.c:
13) Further reduce pageout daemon activity when the system doesn't need help from it. There should be no additional performance decrease even when the pageout daemon is running. (This was a significant performance issue.)

vnode_pager.c:
14) Teach the vnode pager to handle race conditions during vnode deallocations.
|
#
86ffbd76 |
|
09-Mar-1998 |
Mike Smith <msmith@FreeBSD.org> |
Complement diagnostic messages about missing per-FS VOP page operations, but don't make their absence fatal. Submitted by: terry
|
#
8f9110f6 |
|
07-Mar-1998 |
John Dyson <dyson@FreeBSD.org> |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated.

1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid.
2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them.
3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state.
4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances.
5) Remove a gratuitous buffer wakeup in vfs_vmio_release.
6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version.
7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation.
8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed.
9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals.
10) Correct and clean-up spec_getpages.
11) Implement a fully functional nfs_getpages, nfs_putpages.
12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.)
13) Properly support MS_INVALIDATE on NFS.
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean.
15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads.
16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.)
17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.)
18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
#
ffc82b0a |
|
28-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
1) Use a more consistent page wait methodology.
2) Do not unnecessarily force page blocking when paging pages out.
3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.)
4) Enable vfs_ioopt=1 as a default.
5) Fix and enable the page prezeroing in SMP mode.

All in all, SMP systems especially should show a significant improvement in "snappiness."
|
#
ce75f2c3 |
|
25-Feb-1998 |
Mike Smith <msmith@FreeBSD.org> |
In the author's words: These diffs implement the first stage of a VOP_{GET|PUT}PAGES pushdown for local media FS's. See ffs_putpages in /sys/ufs/ufs/ufs_readwrite.c for implementation details for generic *_{get|put}pages for local media FS's. Support is trivial to add for any FS that formerly relied on the default behaviour of the vnode_pager in EOPNOTSUPP cases (just copy the ffs_getpages() code for the FS in question's *_{get|put}pages). Obviously, it would be better if each local media FS implemented a more optimal method, instead of calling an exported interface from the /sys/vm/vnode_pager.c, but this is a necessary first step in getting the FS's to a point where they can be supplied with better implementations on a case-by-case basis. Obviously, the cd9660_putpages() can be rather trivial (since it is a read-only FS type 8-)). A slight (temporary) modification is made to print a diagnostic message in the case where the underlying filesystem attempts to engage in the previous behaviour. Failure is likely to be ungraceful. Submitted by: terry@freebsd.org (Terry Lambert)
|
#
66095752 |
|
24-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
Fix page prezeroing for SMP, and fix some potential paging-in-progress hangs. The paging-in-progress diagnosis was a result of Tor Egge's excellent detective work. Submitted by: Partially from Tor Egge.
|
#
e47ed70b |
|
23-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
Significantly improve the efficiency of the swap pager, which appears to have declined due to code-rot over time. The swap pager rundown code has been cleaned up, and unneeded wakeups removed. Lots of splbio's are changed to splvm's. Also, set the dynamic tunables for the pageout daemon to be more sane for larger systems (thereby decreasing the daemon overhead.)
|
#
0b08f5f7 |
|
05-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Back out DIAGNOSTIC changes.
|
#
95461b45 |
|
04-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
1) Start using a cleaner and more consistent page allocator instead of the various ad-hoc schemes. 2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup. 3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some processor errata, and to minimize redundant processor updating of page tables. 4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed. 5) Streamline read-only to read-write page mappings. 6) For pmap_copy_page, don't enable write mapping for source page. 7) Correct and clean-up pmap_incore. 8) Cluster initial kern_exec pagein. 9) Removal of some minor lint from kern_malloc. 10) Correct some ioopt code. 11) Remove some dead code from the MI swapout routine. 12) Correct vm_object_deallocate (to remove backing_object ref.) 13) Fix dead object handling, that had problems under heavy memory load. 14) Add minor vm_page_lookup improvements. 15) Some pages are not in objects, and make sure that the vm_page.c can properly support such pages. 16) Add some more page deficit handling. 17) Some minor code readability improvements.
|
#
47cfdb16 |
|
04-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Turn DIAGNOSTIC into a new-style option.
|
#
eaf13dd7 |
|
31-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.
|
#
47221757 |
|
17-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management. The system should be much more stable, but all subtle bugs aren't fixed yet.
|
#
95e5e988 |
|
05-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. Vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon as reasonably possible.
|
#
2be70f79 |
|
28-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Lots of improvements, including restructuring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.
|
#
1efb74fb |
|
19-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Some performance improvements, and code cleanups (including changing our expensive OFF_TO_IDX to btoc whenever possible.)
|
#
ab3f7469 |
|
02-Dec-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
In all such uses of struct buf: 's/b_un.b_addr/b_data/g'
|
#
e7b0208f |
|
05-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Relax the vnode locking for read only operations.
|
#
79624e21 |
|
31-Aug-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused #includes.
|
#
b9dcd593 |
|
25-Aug-1997 |
Bruce Evans <bde@FreeBSD.org> |
Fixed type mismatches for functions with args of type vm_prot_t and/or vm_inherit_t. These types are smaller than ints, so the prototypes should have used the promoted type (int) to match the old-style function definitions. They use just vm_prot_t and/or vm_inherit_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change the small types to u_int, to optimize for time instead of space.
|
#
89721f6f |
|
21-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
This is a trial improvement for the vnode reference count while on the vnode free list problem. Also, the vnode age flag is no longer used by the vnode pager. (It is actually incorrect to use it then.) Constructive feedback welcome -- just be kind.
|
#
32ad9cb5 |
|
19-May-1997 |
Doug Rabson <dfr@FreeBSD.org> |
Fix a few bugs with NFS and mmap caused by NFS' use of b_validoff and b_validend. The changes to vfs_bio.c are a bit ugly but hopefully can be tidied up later by a slight redesign. PR: kern/2573, kern/2754, kern/3046 (possibly) Reviewed by: dyson
|
#
eb2c768e |
|
07-Mar-1997 |
John Dyson <dyson@FreeBSD.org> |
When removing IN_RECURSE support during the Lite/2 merge, read/write to/from mmaped regions was broken. This commit fixes the breakage, and uses the new Lite/2 locking mechanisms.
|
#
6875d254 |
|
22-Feb-1997 |
Peter Wemm <peter@FreeBSD.org> |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
996c772f |
|
09-Feb-1997 |
John Dyson <dyson@FreeBSD.org> |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
09841510 |
|
24-Jan-1997 |
David Greenman <dg@FreeBSD.org> |
Added a check/panic for v_usecount being 0 (no vnode reference) in vnode_pager_alloc().
|
#
1130b656 |
|
14-Jan-1997 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
ad980522 |
|
16-Oct-1996 |
John Dyson <dyson@FreeBSD.org> |
Clean up the rundown of the object backing a vnode. This should fix NFS problems associated with forcible dismounts.
|
#
9fea9a6f |
|
09-Sep-1996 |
John Dyson <dyson@FreeBSD.org> |
The whole issue of not supporting VOP_LOCK for VBLK devices should be rethought. This fixes YET another problem with unmounting filesystems. The root cause is not fixed here, but at least the problem has gone away.
|
#
6476c0d2 |
|
21-Aug-1996 |
John Dyson <dyson@FreeBSD.org> |
Even though this looks like it, this is not a complex code change. The interface into the "VMIO" system has changed to be more consistent and robust. Essentially, it is now no longer necessary to call vn_open to get merged VM/Buffer cache operation, and exceptional conditions such as merged operation of VBLK devices is simpler and more correct. This code corrects a potentially large set of problems including the problems with ktrace output and loaded systems, file create/deletes, etc. Most of the changes to NFS are cosmetic and name changes, eliminating a layer of subroutine calls. The direct calls to vput/vrele have been re-instituted for better cross platform compatibility. Reviewed by: davidg
|
#
67bf6868 |
|
29-Jul-1996 |
John Dyson <dyson@FreeBSD.org> |
Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
|
#
4f4d35ed |
|
26-Jul-1996 |
John Dyson <dyson@FreeBSD.org> |
This commit is meant to solve a couple of VM system problems or performance issues. 1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines. 2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations. 3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)). 4) Changed pv queue manipulations/structures to be TAILQ's. 5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one. 6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation. 7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
|
#
aa8de40a |
|
03-May-1996 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
|
#
ad5dd234 |
|
18-Mar-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix the problem that unmounting filesystems that are backed by a VMIO device have reference count problems. We mark the underlying object non-persistent, and account for the reference count that the VM system maintains for the special device close. This should fix the removable device problem.
|
#
8169788f |
|
11-Mar-1996 |
Peter Wemm <peter@FreeBSD.org> |
Import 4.4BSD-Lite2 onto the vendor branch, note that in the kernel, all files are off the vendor branch, so this should not change anything. A "U" marker generally means that the file was not changed in between the 4.4Lite and Lite-2 releases, and does not need a merge. "C" generally means that there was a change.
|
#
bd7e5f99 |
|
18-Jan-1996 |
John Dyson <dyson@FreeBSD.org> |
Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
|
#
d63596ce |
|
17-Dec-1995 |
John Dyson <dyson@FreeBSD.org> |
Fix paging from ext2fs (and other fs w/block size < PAGE_SIZE). This should fix kern/900.
|
#
f708ef1b |
|
14-Dec-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Another mega commit to staticize things.
|
#
a316d390 |
|
10-Dec-1995 |
John Dyson <dyson@FreeBSD.org> |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
|
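As a rough illustration of the naming change described in commit a316d390 above, a page can be identified by its page index within the object rather than by a byte offset, so a fixed-width index reaches much further into a large file. The sketch below is not the kernel's code: the `sketch_` names, the 4096-byte page size, and the 32-bit index type are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch: 4096 bytes (1 << 12). */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical page index type; the width is an assumption. */
typedef uint32_t sketch_pindex_t;

/* Convert a byte offset within an object to the index of its page. */
static sketch_pindex_t
sketch_off_to_idx(uint64_t off)
{
	/* Guard against silently truncating an index that does not fit. */
	assert((off >> SKETCH_PAGE_SHIFT) <= UINT32_MAX);
	return ((sketch_pindex_t)(off >> SKETCH_PAGE_SHIFT));
}

/* Convert a page index back to the byte offset of the page start. */
static uint64_t
sketch_idx_to_off(sketch_pindex_t idx)
{
	return ((uint64_t)idx << SKETCH_PAGE_SHIFT);
}
```

Under these assumptions, a byte offset of 1 TiB (which cannot be stored in 32 bits) maps to page index 1 << 28, which fits comfortably.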
#
efeaf95a |
|
06-Dec-1995 |
David Greenman <dg@FreeBSD.org> |
Untangled the vm.h include file spaghetti.
|
#
3af76890 |
|
19-Nov-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove unused vars & funcs, make things static, protoize a little bit.
|
#
0b8253a7 |
|
30-Oct-1995 |
Bruce Evans <bde@FreeBSD.org> |
Don't pass an extra trailing arg to some functions. Added the prototypes that found this bug.
|
#
2c4488fc |
|
22-Oct-1995 |
John Dyson <dyson@FreeBSD.org> |
Finalize GETPAGES layering scheme. Move the device GETPAGES interface into specfs code. No need at this point to modify the PUTPAGES stuff except in the layered-type (NULL/UNION) filesystems.
|
#
eed2d59b |
|
19-Oct-1995 |
David Greenman <dg@FreeBSD.org> |
Fix initialization of "bsize" in vnode_pager_haspage(). It must happen after the check for the mount point still existing or else the system will panic if someone forcibly unmounted the filesystem.
|
#
6eab77f2 |
|
12-Sep-1995 |
John Dyson <dyson@FreeBSD.org> |
Fix really bogus casting of a block number to a long. Also change the comparison from a "< 0" to "== -1" like it should be.
|
#
b1fc01b7 |
|
10-Sep-1995 |
John Dyson <dyson@FreeBSD.org> |
Fix an error that can cause attempted reading beyond the end of file.
|
#
ced399ee |
|
05-Sep-1995 |
John Dyson <dyson@FreeBSD.org> |
Minor performance improvements, additional prototype for additional exported symbol.
|
#
170db9c6 |
|
03-Sep-1995 |
John Dyson <dyson@FreeBSD.org> |
Allow the fault code to use additional clustering info from both bmap and the swap pager. Improved fault clustering performance.
|
#
c83ebe77 |
|
03-Sep-1995 |
John Dyson <dyson@FreeBSD.org> |
Added VOP_GETPAGES/VOP_PUTPAGES and also the "backwards" block count for VOP_BMAP. Updated affected filesystems...
|
#
24a1cce3 |
|
13-Jul-1995 |
David Greenman <dg@FreeBSD.org> |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!! Much needed overhaul of the VM system. Included in this first round of changes: 1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers". 2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items. 3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed. 4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug. 5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong.
We now use tsleep and wakeup directly in the lock routines, for instance. 6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain. 7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance. 8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed. 9) Some almost useless debugging code removed. 10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology. 11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended. 12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course). 13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes) TODO: 1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size. 2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness. 3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind. 4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems. 5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
06cb7259 |
|
09-Jul-1995 |
David Greenman <dg@FreeBSD.org> |
Moved call to VOP_GETATTR() out of vnode_pager_alloc() and into the places that call vnode_pager_alloc() so that a failure return can be dealt with. This fixes a panic seen on NFS clients when a file being opened is deleted on the server before the open completes.
|
#
39d38f93 |
|
06-Jul-1995 |
David Greenman <dg@FreeBSD.org> |
Fixed an object allocation race condition that was causing a "object deallocated too many times" panic when using NFS. Reviewed by: John Dyson
|
#
aa2cabb9 |
|
27-Jun-1995 |
David Greenman <dg@FreeBSD.org> |
1) Converted v_vmdata to v_object. 2) Removed unnecessary vm_object_lookup()/pager_cache(object, TRUE) pairs after vnode_pager_alloc() calls - the object is already guaranteed to be persistent. 3) Removed some gratuitous casts.
|
#
9b2e5354 |
|
30-May-1995 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Remove trailing whitespace.
|
#
5f55e841 |
|
17-May-1995 |
David Greenman <dg@FreeBSD.org> |
Accessing pages beyond the end of a mapped file results in internal inconsistencies in the VM system that eventually lead to a panic. These changes fix the behavior to conform to the behavior in SunOS, which is to deny faults to pages beyond the EOF (returning SIGBUS). Internally, this is implemented by requiring faults to be within the object size boundaries. These changes exposed another bug, namely that passing in an offset to mmap when trying to map an unnamed anonymous region also results in internal inconsistencies. In this case, the offset is forced to zero. Reviewed by: John Dyson and others
|
#
ee3a64c9 |
|
10-May-1995 |
David Greenman <dg@FreeBSD.org> |
Changed "handle" from type caddr_t to void *; "handle" is several different types of pointers, and "char *" is a bad choice for the type.
|
#
f6b04d2b |
|
09-Apr-1995 |
David Greenman <dg@FreeBSD.org> |
Changes from John Dyson and myself: Fixed remaining known bugs in the buffer IO and VM system. vfs_bio.c: Fixed some race conditions and locking bugs. Improved performance by removing some (now) unnecessary code and fixing some broken logic. Fixed process accounting of # of FS outputs. Properly handle NFS interrupts (B_EINTR). (various) Replaced calls to clrbuf() with calls to an optimized routine called vfs_bio_clrbuf(). (various FS sync) Sync out modified vnode_pager backed pages. ffs_vnops.c: Do two passes: Sync out file data first, then indirect blocks. vm_fault.c: Fixed deadly embrace caused by acquiring locks in the wrong order. vnode_pager.c: Changed to use buffer I/O system for writing out modified pages. This should fix the problem with the modification date previously not getting updated. Also dramatically simplifies the code. Note that this is going to change in the future and be implemented via VOP_PUTPAGES(). vm_object.c: Fixed a pile of bugs related to cleaning (vnode) objects. The performance of vm_object_page_clean() is terrible when dealing with huge objects, but this will change when we implement a binary tree to keep the object pages sorted. vm_pageout.c: Fixed broken clustering of pageouts. Fixed race conditions and other lockup style bugs in the scanning of pages. Improved performance.
|
#
1b369d98 |
|
21-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Removed unused variable declaration missed in previous commit.
|
#
71263bf8 |
|
21-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Removed do-nothing VOP_UPDATE() call.
|
#
7c1f6ced |
|
20-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Added a new boolean argument to vm_object_page_clean that causes it to only toss out clean pages if TRUE.
|
#
0426122f |
|
20-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Don't gain/lose an object reference in vnode_pager_setsize(). It will cause vnode locking problems in vm_object_terminate(). Implement proper vnode locking in vm_object_terminate().
|
#
0bdb7528 |
|
19-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Do proper vnode locking when doing paging I/O. Removed the asynchronous paging capability to facilitate this (we saw little or no measurable improvement with this anyway). Submitted by: John Dyson
|
#
c01a9b8c |
|
18-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Incorporated 4.4-lite vnode_pager_uncache() and vnode_pager_umount() routines (and merged local changes). The changed vnode_pager_uncache gets rid of the bogosity that you can call the routine without having the vnode locked. The changed vnode_pager_umount properly locks the vnode before calling vnode_pager_uncache.
|
#
b5e8ce9f |
|
16-Mar-1995 |
Bruce Evans <bde@FreeBSD.org> |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
4bb62461 |
|
12-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Explicitly set object->flags = OBJ_CANPERSIST.
|
#
00072442 |
|
07-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Set VAGE flag when pager is destroyed. This usually happens when an object has fallen off the end of the cached list - this is likely the last reference to the vnode and it should be reused before non file vnodes that are already on the free list (VDIR mostly).
|
#
f919ebde |
|
01-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Various changes from John and myself that do the following: New functions created - vm_object_pip_wakeup and pagedaemon_wakeup that are used to reduce the actual number of wakeups. New function vm_page_protect which is used in conjunction with some new page flags to reduce the number of calls to pmap_page_protect. Minor changes to reduce unnecessary spl nesting. Rewrote vm_page_alloc() to improve readability. Various other mostly cosmetic changes.
|
#
b106f3b2 |
|
23-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Removed redundant HOLDRELE()'s.
|
#
187f0071 |
|
22-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Changed return value from vnode_pager_addr to be in DEV_BSIZE units so that 9 bits aren't lost in the conversion. Changed all callers to expect this. This allows paging on large (>2GB) filesystems. Submitted by: John Dyson
|
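As a rough illustration of the unit change described in commit 187f0071 above: if an address is returned in a 32-bit signed value, expressing it in 512-byte sectors instead of bytes keeps 9 extra bits of range. The sketch below is not the actual vnode_pager_addr() code; the `sketch_` names, the 32-bit return type, and the 512-byte sector size are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sector size for this sketch: 512 bytes, i.e. 1 << 9. */
#define SKETCH_DEV_BSIZE 512

/* Report whether a value can be stored in a 32-bit signed integer. */
static int
sketch_fits_int32(int64_t v)
{
	return (v >= INT32_MIN && v <= INT32_MAX);
}

/* Hypothetical helper: express a sector-aligned byte address in sectors. */
static int32_t
sketch_byte_to_blkno(int64_t byte_addr)
{
	assert(byte_addr >= 0 && byte_addr % SKETCH_DEV_BSIZE == 0);
	assert(sketch_fits_int32(byte_addr / SKETCH_DEV_BSIZE));
	return ((int32_t)(byte_addr / SKETCH_DEV_BSIZE));
}

/* Hypothetical helper: widen a sector number back to a byte address. */
static int64_t
sketch_blkno_to_byte(int32_t blkno)
{
	return ((int64_t)blkno * SKETCH_DEV_BSIZE);
}
```

Under these assumptions, a byte address of 3 GB does not fit in 32 signed bits, but the same address in sector units (6291456) does, and the caller can widen it back to bytes without loss.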
#
c0503609 |
|
22-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Only do object paging_in_progress wakeups if someone is waiting on this condition. Submitted by: John Dyson
|
#
7fb0c17e |
|
20-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Deprecated remaining use of vm_deallocate. Deprecated vm_allocate_with_pager(). Almost completely rewrote vm_mmap(); when John gets done with the bottom half, it will be a complete rewrite. Deprecated most use of vm_object_setpager(). Removed side effect of setting object persist in vm_object_enter and moved this into the pager(s). A few other cosmetic changes.
|
#
efc68ce1 |
|
02-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Fixed bmap run-length brokenness. Use bmap run-length extension when doing clustered paging. Submitted by: John Dyson
|
#
6d40c3d3 |
|
24-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Added ability to detect sequential faults and DTRT. (swap_pager.c) Added hook for pmap_prefault() and use symbolic constant for new third argument to vm_page_alloc() (vm_fault.c, various) Changed the way that upages and page tables are held. (vm_glue.c) Fixed architectural flaw in allocating pages at interrupt time that was introduced with the merged cache changes. (vm_page.c, various) Adjusted some algorithms to achieve better paging performance and to accommodate the fix for the architectural flaw mentioned above. (vm_pageout.c) Fixed pbuf handling problem, changed policy on handling read-behind page. (vnode_pager.c) Submitted by: John Dyson
|
#
a7489784 |
|
11-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Fixed a panic that Garrett reported to me...the OBJ_INTERNAL flag wasn't being cleared in some cases for vnode backed objects; we now do this in vnode_pager_alloc proper to guarantee it. Also be more careful in the rcollapse code about messing with busy/bmapped pages.
|
#
0d94caff |
|
09-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D. The majority of the merged VM/cache work is by John Dyson. The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme. vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering. vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff. vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption. vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up. vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme. pmap.c vm_map.c Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs. vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping. proc.h Fixed the problem that the p_lock flag was not being cleared on a fork. swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore. machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme. machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers. Submitted by: John Dyson and David Greenman
|
#
4abc71c0 |
|
24-Nov-1994 |
David Greenman <dg@FreeBSD.org> |
Don't try to page to a vnode that had its filesystem unmounted.
|
#
bf556a16 |
|
16-Nov-1994 |
Justin T. Gibbs <gibbs@FreeBSD.org> |
Remove a piece of commented out code that was left over from the early stages of debugging LFS: * if we can't bmap, use old VOP code */ ! if (/* (vp->v_mount && vp->v_mount->mnt_stat.f_type == MOUNT_LFS) || */ ! VOP_BMAP(vp, foff, &dp, 0, 0)) { for (i = 0; i < count; i++) { if (i != reqpage) { vnode_pager_freepage(m[i]); --- 804,810 ---- /* * if we can't bmap, use old VOP code */ ! if (VOP_BMAP(vp, foff, &dp, 0, 0)) { Reviewed by: gibbs Submitted by: John Dyson
|
#
317205ca |
|
13-Nov-1994 |
David Greenman <dg@FreeBSD.org> |
Fixed bug where a read-behind to a negative offset would occur if the fault was at offset 0 in the object. This resulted in more overhead but was otherwise benign. Added incore() check in vnode_pager_has_page() to work around a problem with LFS...other than slightly higher overhead, this change has no effect on UFS.
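
The read-behind problem described in this entry can be illustrated with a minimal sketch. This is not the code from the commit: the helper name `sketch_readbehind_start`, the `SKETCH_PAGE_SIZE` constant, and the signed-offset representation are all assumptions made for illustration. It only shows the general idea of clamping a read-behind window so that it never starts before offset 0 in the object.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096L	/* assumed page size, illustration only */

/*
 * Hypothetical helper (not from the FreeBSD tree): given the offset of the
 * faulting page and the number of pages of read-behind requested, return
 * the offset at which the read should start.  Without the clamp, a fault at
 * offset 0 would yield a negative starting offset.
 */
static long
sketch_readbehind_start(long fault_off, int behind)
{
	long start;

	assert(fault_off >= 0 && behind >= 0);
	start = fault_off - (long)behind * SKETCH_PAGE_SIZE;
	/* Clamp: never read behind the beginning of the object. */
	return (start < 0 ? 0 : start);
}
```

Under these assumptions, a fault at offset 0 with four pages of read-behind starts the read at offset 0 rather than at a negative offset, while a fault far enough into the object gets the full read-behind window.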
|
#
a83c285c |
|
06-Nov-1994 |
David Greenman <dg@FreeBSD.org> |
Fixed return status from pagers. Ahem...the previous method would manufacture data when it couldn't get it legitimately. :-( Submitted by: John Dyson
|
#
976e77fc |
|
15-Oct-1994 |
David Greenman <dg@FreeBSD.org> |
1) Some of the counters in the vmmeter struct don't fit well into the Mach VM scheme of things, so I've changed them to be more appropriate. Page in/outs are now associated with the pager that did them. Nuked v_fault as the only fault of interest that wouldn't be already counted in v_trap is a VM fault, and this is counted separately.
2) Implemented most of the remaining counters and corrected the counting of some that were done wrong. They are all almost correct now...just a few minor ones left to fix.
|
#
b73f3b1d |
|
13-Oct-1994 |
David Greenman <dg@FreeBSD.org> |
Got rid of redundant declaration warnings.
|
#
0a99546c |
|
14-Oct-1994 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Add missing )'s to previous midnight changes. :-)
|
#
defb6744 |
|
13-Oct-1994 |
David Greenman <dg@FreeBSD.org> |
Changed I/O error messages to be somewhat less cryptic. Removed a piece of unused code.
|
#
05f0fdd2 |
|
08-Oct-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Cosmetics: unused vars, ()'s, #include's &c &c to silence gcc. Reviewed by: davidg
|
#
8e58bf68 |
|
05-Oct-1994 |
David Greenman <dg@FreeBSD.org> |
Stuff object into v_vmdata rather than pager. Not important which at the moment, but will be in the future. Other changes mostly cosmetic, but are made for future VMIO considerations. Submitted by: John Dyson
|
#
db141545 |
|
06-Sep-1994 |
David Greenman <dg@FreeBSD.org> |
Disabled a debugging printf.
|
#
fff93ab6 |
|
29-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Patches from John Dyson to improve swap code efficiency. Religiously add back pmap_clear_modify() in vnode_pager_input until we figure out why system performance isn't what we expect. Submitted by: John Dyson (swap_pager) & David Greenman (vnode_pager)
|
#
a481f200 |
|
07-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Provide support for upcoming merged VM/buffer cache, and fixed a few bugs that haven't appeared to manifest themselves (yet). Submitted by: John Dyson
|
#
c87801fe |
|
06-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Fixed various prototype problems with the pmap functions and the subsequent problems that fixing them caused.
|
#
16f62314 |
|
06-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Incorporated post 1.1.5 work from John Dyson. This includes performance improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/pmap_kremove. These routines allow fast mapping of pages for those architectures that have "normal" MMUs. Also included is a fix to the pageout daemon to properly check a queue end condition. Submitted by: John Dyson
|
#
bbc0ec52 |
|
03-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Integrated VM system improvements/fixes from FreeBSD-1.1.5.
|
#
26f9a767 |
|
25-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
df8bae1d |
|
24-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
BSD 4.4 Lite Kernel Sources
|