#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
253257 |
|
12-Jul-2013 |
kib |
MFC r253095: Fix typo in comment.
Approved by: re (hrs)
|
#
251897 |
|
18-Jun-2013 |
scottl |
Merge the second part of the unmapped I/O changes. This enables the infrastructure in the block layer and UFS filesystem as well as a few drivers. The list of MFC revisions is long, so I won't quote changelogs.
r248508,248510,248511,248512,248514,248515,248516,248517,248518, 248519,248520,248521,248550,248568,248789,248790,249032,250936
Submitted by: kib Approved by: kib Obtained from: Netflix
|
#
244547 |
|
21-Dec-2012 |
jh |
MFC r243333:
- Don't pass geom and provider names as format strings.
- Add __printflike() attributes.
- Remove an extra argument for the g_new_geomf() call in swapongeom_ev().
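As a rough userland illustration of the first two items (not code from the FreeBSD tree): a name that may contain '%' characters is passed as an argument to a fixed "%s" format, and the variadic wrapper carries a printf-style format attribute so the compiler can check callers. The helper names name_printf() and format_provider_name() are hypothetical, and the GCC/Clang attribute spelling stands in for the kernel's __printflike() macro.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical printf-like wrapper. The attribute asks GCC/Clang to
 * check the format string (argument 3) against the variadic arguments
 * (starting at argument 4), which is the kind of checking a
 * __printflike() annotation enables.
 */
static int name_printf(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

static int
name_printf(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return (n);
}

/*
 * Copy a provider name into buf. The name is data, so it goes through
 * a constant "%s" format and is never used as the format string.
 */
static int
format_provider_name(char *buf, size_t len, const char *name)
{
	return (name_printf(buf, len, "%s", name));
}
```

With this shape, conversion characters embedded in a name are copied literally instead of being interpreted.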
|
#
240760 |
|
20-Sep-2012 |
alc |
MFC r237168:
The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer.
|
#
239789 |
|
28-Aug-2012 |
pluknet |
MFC r239723: Typo in previous change: print half the theoretical maximum as maximum recommended amount.
|
#
239645 |
|
24-Aug-2012 |
des |
MFH (r239327): warn when too much swap is configured, and avoid flooding the console when running out of space for metadata.
|
#
232405 |
|
02-Mar-2012 |
ed |
MFC r231378:
Remove direct access to si_name.
Code should just use the devtoname() function to obtain the name of a character device. Also add const keywords to pieces of code that need it to build properly.
|
#
231188 |
|
08-Feb-2012 |
mav |
MFC 230877: Fix NULL dereference panic on attempt to turn off (on system shutdown) disconnected swap device.
This is a quick and imperfect solution, as the swap device will still be open and GEOM will not be able to destroy it. The proper solution would be to automatically turn off and close a disconnected swap device, but with the existing code that would cause a panic if there is at least one page on the device, even if it is an unimportant page of a user-level process. It needs some work.
|
#
229012 |
|
30-Dec-2011 |
kib |
MFC r228432: Fix printf.
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
225617 |
|
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space, this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (i.e. those not prefixed with linux_, freebsd32_, etc.) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal to kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
#
225418 |
|
06-Sep-2011 |
kib |
Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into an atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.
Document the changes to flags field to only require the page lock.
Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
|
#
225089 |
|
22-Aug-2011 |
kib |
Update some comments in swap_pager.c.
Reviewed and most wording by: alc MFC after: 1 week Approved by: re (bz)
|
#
225076 |
|
22-Aug-2011 |
kib |
Apply the limit that avoids overflows in the radix tree (subr_blist.c) after the conversion of the swap device size to page size units, not before. That lifts the limit on the usable swap partition size from 32GB to 256GB, which is less depressing for modern systems.
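The factor of eight between the two figures follows from the unit the limit is applied in. A minimal sketch of the arithmetic, in which every constant is an assumption chosen to reproduce the 32GB and 256GB figures quoted above (a block-count ceiling of 0x40000000 / 16, 512-byte device blocks, and 4096-byte pages), not a value taken from the commit text:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define BLOCK_COUNT_LIMIT	(UINT64_C(0x40000000) / 16)	/* assumed max blocks the tree may index */
#define DEV_BLOCK_BYTES		UINT64_C(512)			/* assumed device block size */
#define PAGE_BYTES		UINT64_C(4096)			/* assumed page size */

/*
 * Largest usable swap size, in bytes, when the block-count ceiling is
 * applied to a size expressed in units of unit_bytes.
 */
static uint64_t
max_swap_bytes(uint64_t unit_bytes)
{
	return (BLOCK_COUNT_LIMIT * unit_bytes);
}
```

Under these assumptions, max_swap_bytes(DEV_BLOCK_BYTES) is 32 GiB and max_swap_bytes(PAGE_BYTES) is 256 GiB: the same ceiling on the block count covers eight times as many bytes once the size is counted in pages.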
Submitted by: Alexander V. Chernikov <melifaro ipfw ru> Reviewed by: alc Approved by: re (bz) MFC after: 2 weeks
|
#
224582 |
|
01-Aug-2011 |
kib |
Implement the linprocfs swaps file, providing information about the configured swap devices in the Linux-compatible format.
Based on the submission by: Robert Millan <rmh debian org> PR: kern/159281 Reviewed by: bde Approved by: re (kensmith) MFC after: 2 weeks
|
#
223825 |
|
06-Jul-2011 |
trasz |
All the racct_*() calls need to happen with the proc locked. Fixing this won't happen before 9.0. This commit adds "#ifdef RACCT" around all the "PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order to avoid useless locking/unlocking in kernels built without "options RACCT".
|
#
221096 |
|
26-Apr-2011 |
obrien |
Reap old SPL comments.
Reviewed by: alc
|
#
220373 |
|
05-Apr-2011 |
trasz |
Add accounting for most of the memory-related resources.
Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)
|
#
219124 |
|
01-Mar-2011 |
brucec |
Change the return type of vmspace_swap_count to a long to match the other vmspace_*_count functions.
MFC after: 3 days
|
#
218966 |
|
23-Feb-2011 |
brucec |
Calculate and return the count in vmspace_swap_count as a vm_offset_t instead of an int to avoid overflow.
While here, clean up some style(9) issues.
PR: kern/152200 Reviewed by: kib MFC after: 2 weeks
|
#
217529 |
|
18-Jan-2011 |
alc |
Move the definition of M_VMPGDATA to the swap pager, where the only remaining uses are.
|
#
216873 |
|
01-Jan-2011 |
brucec |
There can be more than 0x20000000 swap meta blocks allocated if a swap-backed md(4) device is used. Don't panic when deallocating such a device if swap has been used.
PR: kern/133170 Discussed with: kib MFC after: 3 days
|
#
216128 |
|
02-Dec-2010 |
trasz |
Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage.
Reviewed by: kib@ Tested by: pho@ Sponsored by: FreeBSD Foundation
|
#
214095 |
|
20-Oct-2010 |
avg |
PG_BUSY -> VPO_BUSY, PG_WANTED -> VPO_WANTED in manual pages and comments
Reviewed by: alc MFC after: 4 days
|
#
207822 |
|
09-May-2010 |
alc |
Call vm_page_deactivate() rather than vm_page_dontneed() in swp_pager_force_pagein(). By dirtying the page, swp_pager_force_pagein() forces vm_page_dontneed() to insert the page at the head of the inactive queue, just like vm_page_deactivate() does. Moreover, because the page was invalid, it can't have been mapped, and thus the other effect of vm_page_dontneed(), clearing the page's reference bits has no effect. In summary, there is no reason to call vm_page_dontneed() since its effect will be identical to calling the simpler vm_page_deactivate().
|
#
207806 |
|
08-May-2010 |
alc |
Remove the page queues lock around a call to vm_page_activate(). Make the page dirty before adding it to the active queue.
|
#
207796 |
|
08-May-2010 |
alc |
Push down the page queues lock into vm_page_cache(), vm_page_try_to_cache(), and vm_page_try_to_free(). Consequently, push down the page queues lock into pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and pmap_remove_write().
Push down the page queues lock into Xen's pmap_page_is_mapped(). (I overlooked the Xen pmap in r207702.)
Switch to a per-processor counter for the total number of pages cached.
|
#
207747 |
|
07-May-2010 |
alc |
Eliminate unnecessary page queues locking.
|
#
207410 |
|
29-Apr-2010 |
kmacy |
On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps.
Supported by: Bitgravity Inc.
Discussed with: alc, jeffr, and kib
|
#
207364 |
|
29-Apr-2010 |
kib |
In swap pager, do not free the non-requested pages from the run if they are wired. Kstack pages are wired, this change prepares swap pager for handling of long runs of kstack pages.
Noted and reviewed by: alc Tested by: pho MFC after: 2 weeks
|
#
206761 |
|
17-Apr-2010 |
alc |
Setting PG_REFERENCED on the requested page in swap_pager_getpages() is either redundant or harmful, depending on the caller. For example, when called by vm_fault(), it is redundant. However, when called by vm_thread_swapin(), it is harmful. Specifically, if the thread is later swapped out, having PG_REFERENCED set on its stack pages leads the page daemon to reactivate these stack pages and delay their reclamation.
Reviewed by: kib MFC after: 3 weeks
|
#
198811 |
|
02-Nov-2009 |
ivoras |
Add sysctl documentation strings. The descriptions are derived from tuning(7). One of the descriptions references tuning(7) because it is too complex to adequately describe here (it is not a simple boolean sysctl) and users should be warned of that.
Reviewed by: alc, kib Approved by: gnn (mentor)
|
#
198201 |
|
18-Oct-2009 |
kib |
Remove spurious call to priv_check(PRIV_VM_SWAP_NOQUOTA). Call priv_check(PRIV_VM_SWAP_NORLIMIT) only when the per-uid limit is actually exceeded.
Both changes aim at calling priv_check(9) only for the cases when privilege is actually exercised by the process.
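The ordering described above can be sketched in userland as follows. This is an illustration under stated assumptions, not the kernel code: priv_check_stub() is a hypothetical stand-in for priv_check(9), and swap_reserve_sketch() is an invented name.

```c
#include <assert.h>
#include <stdbool.h>

static int priv_check_calls;	/* counts how often the stub is consulted */

/* Hypothetical stand-in for priv_check(9); returns 0 when privileged. */
static int
priv_check_stub(bool privileged)
{
	priv_check_calls++;
	return (privileged ? 0 : 1);
}

/*
 * Try to reserve incr units against limit. The privilege check sits on
 * the right-hand side of &&, so it runs only when the limit would
 * otherwise deny the request.
 */
static bool
swap_reserve_sketch(unsigned long *used, unsigned long limit,
    unsigned long incr, bool privileged)
{
	if (*used + incr > limit && priv_check_stub(privileged) != 0)
		return (false);
	*used += incr;
	return (true);
}
```

A request that fits under the limit never reaches the stub, so the privilege is consulted only when it is actually exercised.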
Reported and tested by: rwatson Reviewed by: alc MFC after: 3 days
|
#
194814 |
|
24-Jun-2009 |
kib |
Initialize the uip to silence gcc warning that seems to sneak in in some build environments.
Reported by: alc, bf1783 at googlemail com
|
#
194766 |
|
23-Jun-2009 |
kib |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge the appropriate uid when allocation is performed by the kernel, e.g. md(4).
Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced.
In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
|
#
193511 |
|
05-Jun-2009 |
rwatson |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
|
#
191625 |
|
28-Apr-2009 |
kib |
Fix typo.
|
#
191543 |
|
26-Apr-2009 |
alc |
Eliminate an errant comment.
Discussed with: tegge
|
#
191478 |
|
25-Apr-2009 |
alc |
Eliminate unnecessary calls to pmap_clear_modify(). Specifically, calling pmap_clear_modify() on a page is pointless if that page is not mapped or it is only mapped for read access. Instead, assert that the page is not mapped or not mapped for write access as appropriate.
Eliminate unnecessary clearing of a page's dirty mask. Instead, assert that the page's dirty mask is clear.
|
#
188859 |
|
20-Feb-2009 |
alc |
Eliminate stale comments.
|
#
183474 |
|
29-Sep-2008 |
kib |
Move the code for doing out-of-memory grass from vm_pageout_scan() into the separate function vm_pageout_oom(). Supply a parameter for vm_pageout_oom() describing a reason for the call.
Call vm_pageout_oom() from the swp_pager_meta_build() when swap zone is exhausted.
Reviewed by: alc Tested by: pho, jhb MFC after: 2 weeks
|
#
182371 |
|
28-Aug-2008 |
attilio |
Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful.
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
181019 |
|
30-Jul-2008 |
jhb |
If the kernel has run out of metadata for swap, then explicitly panic() instead of emitting a warning before deadlocking.
MFC after: 1 month
|
#
180446 |
|
11-Jul-2008 |
kib |
Use VM_ALLOC_INTERRUPT for the page requests when allocating memory for the bio for swapout write. It allows the page allocator to drain the free page list deeper. As a result, a deadlock where the pageout daemon sleeps waiting for a bio to be allocated for swapout is no longer reproducible in practice.
Alan said that M_USE_RESERVE shall be resurrected and used there, but until this is implemented, M_NOWAIT does exactly what is needed.
Tested by: pho, kris Reviewed by: alc No objections from: phk MFC after: 2 weeks (RELENG_7 only)
|
#
178792 |
|
05-May-2008 |
kmacy |
add malloc flag to blist so that it can be used in ithread context
Reviewed by: alc, bsdimp
|
#
175294 |
|
13-Jan-2008 |
attilio |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary.
KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
175202 |
|
09-Jan-2008 |
attilio |
vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and, in particular, removes an annoying dependency, helping the next lockmgr() cleanup. The KPI is, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
|
#
175157 |
|
08-Jan-2008 |
csjp |
When MAC is enabled in the kernel, fix a panic triggered by a locking assertion hit in swapoff_one() when we un-mount a swap partition. We should be using curthread where we used thread0 before. This change also replaces the thread argument with a credential argument, as the MAC framework only requires the cred.
It should be noted that this allows the machine to be rebooted without panicking with "cannot differ from curthread or NULL" when MAC is enabled.
Submitted by: rwatson Reviewed by: attilio MFC after: 2 weeks
|
#
173292 |
|
02-Nov-2007 |
maxim |
o Fix panic message: it's swap_pager_putpages() not swap_pager_getpages().
Submitted by: Mark Tinguely
|
#
172930 |
|
24-Oct-2007 |
rwatson |
Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms:
mac_<object>_<method/action>
mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names.
All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
|
#
171737 |
|
05-Aug-2007 |
alc |
Consider a scenario in which one processor, call it Pt, is performing vm_object_terminate() on a device-backed object at the same time that another processor, call it Pa, is performing dev_pager_alloc() on the same device. The problem is that vm_pager_object_lookup() should not be allowed to return a doomed object, i.e., an object with OBJ_DEAD set, but it does. In detail, the unfortunate sequence of events is: Pt in vm_object_terminate() holds the doomed object's lock and sets OBJ_DEAD on the object. Pa in dev_pager_alloc() holds dev_pager_sx and calls vm_pager_object_lookup(), which returns the doomed object. Next, Pa calls vm_object_reference(), which requires the doomed object's lock, so Pa waits for Pt to release the doomed object's lock. Pt proceeds to the point in vm_object_terminate() where it releases the doomed object's lock. Pa is now able to complete vm_object_reference() because it can now complete the acquisition of the doomed object's lock. So, now the doomed object has a reference count of one! Pa releases dev_pager_sx and returns the doomed object from dev_pager_alloc(). Pt now acquires dev_pager_mtx, removes the doomed object from dev_pager_object_list, releases dev_pager_mtx, and finally calls uma_zfree with the doomed object. However, the doomed object is still in use by Pa.
Repeating my key point, vm_pager_object_lookup() must not return a doomed object. Moreover, the test for the object's state, i.e., doomed or not, and the increment of the object's reference count should be carried out atomically.
Reviewed by: kib Approved by: re (kensmith) MFC after: 3 weeks
|
#
171019 |
|
24-Jun-2007 |
alc |
Eliminate GIANT_REQUIRED from swap_pager_putpages().
Approved by: re (mux) MFC after: 1 week
|
#
170292 |
|
04-Jun-2007 |
attilio |
Do proper "locking" for missing vmmeters part. Now, we assume no more sched_lock protection for some of them and use the distributed loads method for vmmeter (distributed through CPUs).
Reviewed by: alc, bde Approved by: jeff (mentor)
|
#
170170 |
|
31-May-2007 |
attilio |
Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
#
170152 |
|
31-May-2007 |
kib |
Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file.
Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)
|
#
169667 |
|
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
168979 |
|
23-Apr-2007 |
rwatson |
Audit pathnames looked up in swapon(2) and swapoff(2).
MFC after: 2 weeks Obtained from: TrustedBSD Project
|
#
167086 |
|
27-Feb-2007 |
jhb |
Use pause() rather than tsleep() on stack variables and function pointers.
|
#
166550 |
|
07-Feb-2007 |
jhb |
- Move 'struct swdevt' back into swap_pager.h and expose it to userland.
- Restore support for fetching swap information from crash dumps via kvm_get_swapinfo(3) to fix pstat -T/-s on crash dumps.
Reviewed by: arch@, phk MFC after: 1 week
|
#
165809 |
|
05-Jan-2007 |
jhb |
- Add a new function uma_zone_exhausted() to see if a zone is full.
- Add a printf in swp_pager_meta_build() to warn if the swapzone becomes exhausted so that there's at least a warning before a box that runs out of swapzone space before running out of swap space deadlocks.
MFC after: 1 week Reviewed by: alc
|
#
164033 |
|
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
#
163622 |
|
23-Oct-2006 |
alc |
The page queues lock is no longer required by vm_page_wakeup().
|
#
163606 |
|
22-Oct-2006 |
rwatson |
Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project Sponsored by: SPARTA
|
#
161125 |
|
09-Aug-2006 |
alc |
Introduce a field to struct vm_page for storing flags that are synchronized by the lock on the object containing the page.
Transition PG_WANTED and PG_SWAPINPROG to use the new field, eliminating the need for holding the page queues lock when setting or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to VPO_WANTED and VPO_SWAPINPROG, respectively.
Eliminate the assertion that the page queues lock is held in vm_page_io_finish().
Eliminate the acquisition and release of the page queues lock around calls to vm_page_io_finish() in kern_sendfile() and vfs_unbusy_pages().
|
#
161005 |
|
05-Aug-2006 |
alc |
Remove a stale comment.
|
#
160960 |
|
03-Aug-2006 |
alc |
When sleeping on a busy page, use the lock from the containing object rather than the global page queues lock.
|
#
158387 |
|
10-May-2006 |
pjd |
Use better order here.
|
#
157628 |
|
10-Apr-2006 |
pjd |
On shutdown try to turn off all swap devices. This way GEOM providers are properly closed on shutdown.
Requested by: ru Reviewed by: alc MFC after: 2 weeks
|
#
156420 |
|
08-Mar-2006 |
imp |
Remove leading __ from __(inline|const|signed|volatile). They are obsolete. This should reduce diffs to NetBSD as well.
|
#
154929 |
|
27-Jan-2006 |
cognet |
Make sure b_vp and b_bufobj are NULL before calling relpbuf(), as it asserts they are. They should be NULL at this point, except if we're coming from swapdev_strategy(). It should only affect the case where we're swapping directly on a file over NFS.
|
#
150418 |
|
21-Sep-2005 |
cognet |
Make sure we have a bufobj before calling bstrategy(). I'm not sure this is the right thing to do, but at least I don't panic anymore when swapping on a NFS file without using md(4).
X-MFC after: proper review
|
#
148200 |
|
20-Jul-2005 |
alc |
Eliminate inconsistency in the setting of the B_DONE flag. Specifically, make the b_iodone callback responsible for setting it if it is needed. Previously, it was set unconditionally by bufdone() without holding whichever lock is shared by the b_iodone callback and the corresponding top-half function. Consequently, in a race, the top-half function could conclude that operation was done before the b_iodone callback finished. See, for example, aio_physwakeup() and aio_fphysio().
Note: I don't believe that the other, more widely-used b_iodone callbacks are affected.
Discussed with: jeff Reviewed by: phk MFC after: 2 weeks
|
#
146459 |
|
20-May-2005 |
alc |
Reduce the number of times that we acquire and release locks in swap_pager_getpages().
MFC after: 1 week
|
#
146367 |
|
19-May-2005 |
alc |
Remove calls to spl*().
|
#
146350 |
|
18-May-2005 |
alc |
Revert revision 1.270: swp_pager_async_iodone() need not perform VM_LOCK_GIANT().
Discussed with: jeff
|
#
145699 |
|
30-Apr-2005 |
jeff |
- VM_LOCK_GIANT in the swap pager's iodone routine as VFS will soon call it without Giant.
Sponsored by: Isilon Systems, Inc.
|
#
145584 |
|
27-Apr-2005 |
jeff |
- Pass the ISOPEN flag to namei so filesystems will know we're about to open them or otherwise access the data.
|
#
143821 |
|
18-Mar-2005 |
das |
Move the swap_zone == NULL check earlier (i.e. before we dereference the pointer.)
Found by: Coverity Prevent analysis tool
|
#
139825 |
|
07-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
139629 |
|
03-Jan-2005 |
phk |
When allocating bio's in the swap_pager use M_WAITOK since the alternative is much worse.
|
#
137910 |
|
20-Nov-2004 |
das |
Disable U area swapping and remove the routines that create, destroy, copy, and swap U areas.
Reviewed by: arch@
|
#
137299 |
|
06-Nov-2004 |
das |
Fix the last known race in swapoff(), which could lead to a spurious panic:
swapoff: failed to locate %d swap blocks
The race occurred because putpages() can block between the time it allocates swap space and the time it updates the swap metadata to associate that space with a vm_object, so swapoff() would complain about the temporary inconsistency. I hoped to fix this by making swp_pager_getswapspace() and swp_pager_meta_build() a single atomic operation, but that proved to be inconvenient. With this change, swapoff() simply doesn't attempt to be so clever about detecting when all the pageout activity to the target device should have drained.
|
#
137239 |
|
05-Nov-2004 |
das |
Close a race in swapoff(). Here are the gory details:
In order to avoid livelock, swapoff() skips over objects with a nonzero pip count and makes another pass if necessary. Since it is impossible to know which objects we care about, it would choose an arbitrary object with a nonzero pip count and wait for it before making another pass, the theory being that this object would finish paging about as quickly as the ones we care about. Unfortunately, we may have slept since we acquired a reference to this object. Hack around this problem by tsleep()ing on the pointer anyway, but timeout after a fixed interval. More elegant solutions are possible, but the ones I considered unnecessarily complicate this rare case.
Also, kill some nits that seem to have crept into the swapoff() code in the last 75 revisions or so:
- Don't pass both sp and sp->sw_used to swap_pager_swapoff(), since the latter can be derived from the former.
- Replace swp_pager_find_dev() with something simpler. There's no need to iterate over the entire list of swap devices just to determine if a given block is assigned to the one we're interested in.
- Expand the scope of the swhash_mtx in a couple of places so that it isn't released and reacquired once for every hash bucket.
- Don't drop the swhash_mtx while holding a reference to an object. We need to lock the object first. Unfortunately, doing so would violate the established lock order, so use VM_OBJECT_TRYLOCK() and try again on a subsequent pass if the object is already locked.
- Refactor swp_pager_force_pagein() and swap_pager_swapoff() a bit.
|
#
137191 |
|
04-Nov-2004 |
phk |
De-couple our I/O bio request from the embedded bio in buf by explicitly copying the fields.
|
#
137186 |
|
04-Nov-2004 |
phk |
Remove buf->b_dev field.
|
#
136927 |
|
24-Oct-2004 |
phk |
Move the buffer method vector (buf->b_op) to the bufobj.
Extend it with a strategy method.
Add bufstrategy() which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance.
Rename ibwrite to bufwrite().
Move the two NFS buf_ops to more sensible places, add bufstrategy to them.
Add inlines for bwrite() and bstrategy() which call through buf->b_bufobj->b_ops->b_{write,strategy}().
Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
|
#
136767 |
|
22-Oct-2004 |
phk |
Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.
Initialize b_bufobj for all buffers.
Make incore() and gbincore() take a bufobj instead of a vnode.
Make inmem() local to vfs_bio.c
Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj), also VI_MTX() to BO_MTX().
Make buf_vlist_add() take a bufobj instead of a vnode.
Eliminate other uses of bp->b_vp where bp->b_bufobj will do.
Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.
|
#
136751 |
|
21-Oct-2004 |
phk |
Move the VI_BWAIT flag into a bo_flag element of bufobj and call it BO_WWAIT.
Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup().
Use these functions in all relevant places except in ffs_softdep.c, where the use of interlocked_sleep() makes this impossible.
Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
|
#
135746 |
|
24-Sep-2004 |
das |
Don't look for swap blocks in objects that aren't swap-backed. I expect that this will fix the following panic, reported by Jun: swap_pager_isswapped: failed to locate all swap meta blocks
MT5 candidate
|
#
133318 |
|
08-Aug-2004 |
phk |
Tag all geom classes in the tree with a version number.
|
#
132550 |
|
22-Jul-2004 |
alc |
- Change uma_zone_set_obj() to call kmem_alloc_nofault() instead of kmem_alloc_pageable(). The difference between these is that an errant memory access to the zone will be detected sooner with kmem_alloc_nofault().
The following changes serve to eliminate the following lock-order reversal reported by witness:
1st 0xc1a3c084 vm object (vm object) @ vm/swap_pager.c:1311
2nd 0xc07acb00 swap_pager swhash (swap_pager swhash) @ vm/swap_pager.c:1797
3rd 0xc1804bdc vm object (vm object) @ vm/uma_core.c:931
There is no potential deadlock in this case. However, witness is unable to recognize this because vm objects used by UMA have the same type as ordinary vm objects. To remedy this, we make the following changes:
- Add a mutex type argument to VM_OBJECT_LOCK_INIT().
- Use the mutex type argument to assign distinct types to special vm objects such as the kernel object, kmem object, and UMA objects.
- Define a static swap zone object for use by UMA. (Only static objects are assigned a special mutex type.)
|
#
131665 |
|
06-Jul-2004 |
bms |
Properly brucify a string by outdenting it.
|
#
130979 |
|
23-Jun-2004 |
bms |
In swap_pager_getpages(), bp->b_dev can be NULL, particularly for the case of NFS mounted swap, so do not try to dereference it.
While we're here, brucify the printf() call which happens when we time out on acquisition of vm_page_queue_mtx.
PR: kern/67898 Submitted by: bde (style)
|
#
130640 |
|
17-Jun-2004 |
phk |
Second half of the dev_t cleanup.
The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()
Various minor adjustments including handling of userland access to kernel space struct cdev etc.
|
#
130585 |
|
16-Jun-2004 |
phk |
Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
|
#
128992 |
|
06-May-2004 |
alc |
Make vm_page's PG_ZERO flag immutable between the time of the page's allocation and deallocation. This flag's principal use is shortly after allocation. For such cases, clearing the flag is pointless. The only unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed page.
Reviewed by: tegge@
|
#
126135 |
|
23-Feb-2004 |
alc |
- Substitute bdone() and bwait() from vfs_bio.c for swap_pager_putpages()'s buffer completion code. Note: the only difference between swp_pager_sync_iodone() and bdone(), aside from the locking in the latter, was the unnecessary clearing of B_ASYNC.
- Remove an unnecessary pmap_page_protect() from swp_pager_async_iodone().
Reviewed by: tegge
|
#
125755 |
|
12-Feb-2004 |
phk |
Remove the absolute count g_access_abs() function since experience has shown that it is not useful.
Rename the relative count g_access_rel() function to g_access(), only the name has changed.
Change all g_access_rel() calls in our CVS tree to call g_access() instead.
Add an #ifndef BURN_BRIDGES #define of g_access_rel() for source code compatibility.
|
#
125558 |
|
07-Feb-2004 |
alc |
swp_pager_async_iodone() no longer requires Giant. Modify bufdone() and swapgeom_done() to perform swp_pager_async_iodone() without Giant.
Reviewed by: tegge
|
#
125322 |
|
02-Feb-2004 |
phk |
Check error return from g_clone_bio(). (netchild@)
Add XXX comment about why this is still not optimal. (phk@)
Submitted by: netchild@
|
#
124933 |
|
24-Jan-2004 |
alc |
1. Statically initialize swap_pager_full and swap_pager_almost_full to the full state. (When swap is added their state will change appropriately.)
2. Set swap_pager_full and swap_pager_almost_full to the full state when the last swap device is removed.
Combined, these changes eliminate nonsense messages from the kernel on swap-less machines.
Item 2 submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Prodding by: phk
|
#
124133 |
|
04-Jan-2004 |
alc |
Simplify the various pager allocation routines by computing the desired object size once and assigning that value to a local variable.
|
#
124110 |
|
03-Jan-2004 |
alc |
Reduce the scope of Giant in swap_pager_alloc().
|
#
123948 |
|
29-Dec-2003 |
alc |
Remove swap_pager_un_object_list; it is unused.
|
#
121854 |
|
01-Nov-2003 |
alc |
- Modify swap_pager_copy() and its callers such that the source and destination objects are locked on entry and exit. Add comments to the callers noting that the locks can be released by swap_pager_copy().
- Remove several instances of GIANT_REQUIRED.
|
#
121782 |
|
31-Oct-2003 |
alc |
- Synchronize access to the swdevt's sw_flags with sw_dev_mtx.
- Remove several instances of GIANT_REQUIRED.
|
#
121727 |
|
30-Oct-2003 |
alc |
- Synchronize access to the swdevt's sw_blist with sw_dev_mtx.
- Remove several instances of GIANT_REQUIRED.
|
#
121725 |
|
30-Oct-2003 |
alc |
- Synchronize access to swdevhd using sw_dev_mtx.
- Use swp_sizecheck() rather than assignment to swap_pager_full in swaponsomething().
|
#
121649 |
|
29-Oct-2003 |
alc |
- Synchronize updates to nswapdev using sw_dev_mtx.
|
#
121646 |
|
29-Oct-2003 |
alc |
- Avoid a race in swaponsomething(): Calculate the new swdevt's first and end swblk and insert this new swdevt into the list of swap devices in the same critical section.
|
#
121601 |
|
27-Oct-2003 |
alc |
- Complete the synchronization of accesses to the swblock hash table.
|
#
121583 |
|
26-Oct-2003 |
alc |
- Introduce and use a mutex synchronizing access to the swblock hash table.
|
#
121517 |
|
25-Oct-2003 |
alc |
- Add some of the required vm object locking, including assertions where the vm object lock is required and already held.
|
#
121455 |
|
24-Oct-2003 |
alc |
- Push down Giant from vm_pageout() to vm_pageout_scan(), freeing vm_pageout_page_stats() from Giant.
- Modify vm_pager_put_pages() and vm_pager_page_unswapped() to expect the vm object to be locked on entry. (All of the pager routines now expect this.)
|
#
121205 |
|
18-Oct-2003 |
phk |
DuH!
bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in the file)
|
#
121199 |
|
18-Oct-2003 |
phk |
Initialize bp->b_offset before calling VOP_[SPEC]STRATEGY(). Remove stale comment about B_PHYS.
|
#
119663 |
|
02-Sep-2003 |
phk |
Don't open with exclusive bit, swapon(8) wants to trash our swapdev.
Add XXX comment with a rating of this concept.
|
#
119591 |
|
30-Aug-2003 |
phk |
Add a close() method to a swapdev.
Add a GEOM based backend.
Remove the device/VOP_SPECSTRATEGY() based backend.
|
#
119590 |
|
30-Aug-2003 |
phk |
Protect the swapdevice tailq with a mutex.
Store the udev_t we will report to userland in the swdevt.
|
#
119575 |
|
30-Aug-2003 |
phk |
Continue the objectification of the swapdev backends:
Remove the vnode and dev_t fields and replace them with a void *.
Introduce separate strategy functions for devices and regular (NFS) vnodes.
For devices we don't need the vnode v_numoutput stuff.
Add a generic swaponsomething() function to add a swapdevice and split the remainder of swaponvp() into swaponvp() and swapondev() which calls this backend.
|
#
119574 |
|
30-Aug-2003 |
phk |
Make the strategy function a method of the individual swapdev.
|
#
119573 |
|
30-Aug-2003 |
phk |
Consistently use modern function definitions.
|
#
118946 |
|
15-Aug-2003 |
phk |
Eliminate unnecessary udev_t variable: we can derive it from the dev_t when we need it.
|
#
118945 |
|
15-Aug-2003 |
phk |
Make swaponvp() static to the swap_pager.
|
#
118544 |
|
06-Aug-2003 |
phk |
Make the first two pages magic to protect the BSD labels rather than only one.
|
#
118536 |
|
06-Aug-2003 |
phk |
Staticize swap_pager_putpages()
Eliminate a lot of checks to make sure requests are not cross-device, which are unnecessary with the new layout. We know a sequential request cannot possibly be cross-device because there is a reserved page between the devices.
Remove a couple of comments which no longer are relevant.
|
#
118527 |
|
06-Aug-2003 |
phk |
Explicitly set B_PAGING
|
#
118521 |
|
06-Aug-2003 |
phk |
Rip out the totally bogus vnode swapdev_vp with extreme prejudice.
Don't mark buffers with B_KEEPGIANT, we don't drop giant in strategy at this point in time.
|
#
118468 |
|
05-Aug-2003 |
phk |
Use sparse struct initialization for struct pagerops.
Mark our buffers B_KEEPGIANT before sending them downstream.
Remove swap_pager_strategy implementation.
|
#
118418 |
|
04-Aug-2003 |
phk |
Put an uncovered page between the swap devices, that way we can be sure to not get any cross-device I/O requests. (The unallocated first page protecting BSD labels already gave us this, but that hack may go away at some point in time).
Remove the check for cross-device I/O requests in swap_pager_strategy.
Move the repeated statistics updating into flushchainbuf().
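
A rough sketch of why an uncovered page rules out cross-device requests, assuming a flat block-number space shared by all devices. The names struct swrange_sketch, layout_devices() and find_device() are invented for this illustration and are not the pager's own:

```c
#include <assert.h>
#include <stdint.h>

/* One device's slice of the flat swap block space: [first, end). */
struct swrange_sketch {
    uint32_t first;
    uint32_t end;
};

/* Lay devices out back to back, skipping one block number between them. */
static void
layout_devices(struct swrange_sketch *r, const uint32_t *npages, int ndevs)
{
    uint32_t next = 0;

    assert(ndevs >= 0);
    for (int i = 0; i < ndevs; i++) {
        r[i].first = next;
        r[i].end = next + npages[i];
        next = r[i].end + 1;    /* the uncovered page */
    }
}

/* Return the device owning blk, or -1 if blk is in a gap or past the end. */
static int
find_device(const struct swrange_sketch *r, int ndevs, uint32_t blk)
{
    for (int i = 0; i < ndevs; i++) {
        if (blk >= r[i].first && blk < r[i].end)
            return i;
    }
    return -1;
}
```

With two devices of 4 and 3 pages, block 4 belongs to neither device, so any run of consecutive allocated blocks necessarily stays inside one device.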
|
#
118398 |
|
03-Aug-2003 |
phk |
Name swap_pager_find_dev() more correctly swp_pager_find_dev().
Use ->bio_children to count child buffers, rather than abuse the bio_caller1 pointer.
Expand the relevant bits of waitchainbuf() inline, this clarifies the code a little bit.
|
#
118392 |
|
03-Aug-2003 |
phk |
I accidentally hit undo before committing, fix the resulting off-by-one.
|
#
118390 |
|
03-Aug-2003 |
phk |
Change the layout policy of the swap_pager from a hardcoded width striping to a per device round-robin algorithm.
Because of the policy of not attempting to retain previous swap allocation on page-out, this means that a newly added swap device almost instantly takes its 1/N share of the I/O load but it takes somewhat longer for it to assume its 1/N share of the pages if there is plenty of space on the other devices.
Change the 8G total swapspace limitation to 8G per device instead by using a per device blist rather than one global blist. This reduces the memory footprint by 75% (typically a couple hundred kilobytes) for the common case with one swapdevice but NSWAPDEV=4.
Remove the compile time constant limit of number of swap devices, there is no limit now. Instead of a fixed size array, store the per swapdev structure in a TAILQ.
Total swap space is still addressed by a 32 bit page number and therefore the upper limit is now 2^44 bytes = 16TB (for i386).
We still do not allocate the first page of each device in order to give some amount of protection to any bsdlabel at the start of the device.
A new device is appended after the existing devices in the swap space, no attempt is made to fill in holes left behind by swapoff (this can trivially be changed should it ever become a problem).
The sysctl vm.nswapdev now reflects the number of currently configured swap devices.
Rename vm_swap_size to swap_pager_avail for consistency with other exported names.
Change argument type for vm_proc_swapin_all() and swap_pager_isswapped() to be a struct swdevt pointer rather than an index.
Not changed: we are still using blists to manage the free space, but since the swapspace is no longer fragmented by the striping different resource managers might fare better.
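
A minimal sketch of two points from this entry, under stated assumptions: the 16TB figure for a 32 bit page number with 4 KB pages, and a per-device round-robin choice. PAGE_SHIFT_I386, struct swdev_sketch and pick_device_rr() are hypothetical names; the real pager keeps its devices on a TAILQ and allocates from per-device blists, neither of which is modelled here:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed i386 page size of 4 KB; the macro name is made up for this sketch. */
#define PAGE_SHIFT_I386 12

/* Bytes addressable when swap pages are named by an N-bit page number. */
static uint64_t
swap_addr_limit_bytes(unsigned pageno_bits)
{
    assert(pageno_bits + PAGE_SHIFT_I386 < 64);
    return ((uint64_t)1 << pageno_bits) << PAGE_SHIFT_I386;
}

struct swdev_sketch {
    uint32_t nfree;     /* free pages left on this device */
};

/*
 * Choose the next device with at least npages free, starting just after
 * the device chosen last time (*cursor, initially -1).  Returns the device
 * index, or -1 if no device has room.  Only the choice is modelled; the
 * caller would do the actual allocation.
 */
static int
pick_device_rr(const struct swdev_sketch *devs, int ndevs, int *cursor,
    uint32_t npages)
{
    for (int i = 0; i < ndevs; i++) {
        int idx = (*cursor + 1 + i) % ndevs;

        if (devs[idx].nfree >= npages) {
            *cursor = idx;
            return idx;
        }
    }
    return -1;
}
```

Starting each search after the previously chosen device is what lets a newly added device pick up its share of new page-outs right away, while full devices are simply skipped.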
|
#
118286 |
|
31-Jul-2003 |
phk |
Remove unused stuff.
Move used stuff to swap_pager.c where it belongs.
This file no longer exports anything to userland.
|
#
118047 |
|
26-Jul-2003 |
phk |
Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland.
The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful in particular to fdescfs and /dev/fd/*
For now pass -1 all over the place.
|
#
117903 |
|
22-Jul-2003 |
phk |
Remove all but one of the inlines here, this reduces the code size by 2032 bytes and has no measurable impact on performance.
|
#
117866 |
|
22-Jul-2003 |
peter |
swp_pager_hash() was called before it was instantiated inline. This made gcc (quite rightly) unhappy. Move it earlier.
|
#
117747 |
|
18-Jul-2003 |
phk |
Fix a printf format warning I introduced. Use the macro max number of swap devices rather than cache the constant in a variable. Avoid a (now) pointless variable.
|
#
117725 |
|
18-Jul-2003 |
phk |
If a proposed swap device exceeds the 8G artificial limit which our radix-tree code imposes, truncate the device instead of rejecting it.
|
#
117724 |
|
18-Jul-2003 |
phk |
Move the implementation of the vmspace_swap_count() (used only in the "toss the largest process" emergency handling) from vm_map.c to swap_pager.c.
The quantity calculated depends strongly on the internals of the swap_pager and by moving it, we no longer need to expose the internal metrics of the swap_pager to the world.
|
#
117723 |
|
18-Jul-2003 |
phk |
Add a new function swap_pager_status() which reports the total size of the paging space and how much of it is in use (in pages).
Use this interface from the Linuxolator instead of groping around in the internals of the swap_pager.
|
#
117722 |
|
18-Jul-2003 |
phk |
Merge swap_pager.c and vm_swap.c into swap_pager.c, the separation is not natural and needlessly exposes a lot of dirty laundry.
Move private interfaces between the two from swap_pager.h to swap_pager.c and staticize as much as possible.
No functional change.
|
#
117702 |
|
17-Jul-2003 |
phk |
Make sure that SWP_NPAGES always has the same value in all source files, so that SWAP_META_PAGES does not vary either.
swap_pager.c ended up with a value of 16, everybody else 8. Go with the 16 for now.
This should only have any effect in the "kill processes because we are out of swap" scenario, where it will make some sort of estimate of something more precise.
|
#
116798 |
|
25-Jun-2003 |
alc |
Maintain the lock on a vm object when calling vm_page_grab().
|
#
116629 |
|
20-Jun-2003 |
alc |
Make swap_pager_haspages() static; remove unused function prototypes.
|
#
116437 |
|
16-Jun-2003 |
phk |
This file was ignored by CVS in my last commit for some reason:
Remove pointless initialization of b_spc field, which now no longer exists.
|
#
116280 |
|
13-Jun-2003 |
alc |
Extend the scope of the vm object lock in swp_pager_async_iodone() to cover a vm_page_free().
|
#
116279 |
|
13-Jun-2003 |
alc |
Add vm object locking to various pagers' "get pages" methods, i386 stack management functions, and a u area management function.
|
#
116226 |
|
11-Jun-2003 |
obrien |
Use __FBSDID().
|
#
115987 |
|
07-Jun-2003 |
alc |
Assert that the vm object is locked on entry to swap_pager_freespace().
|
#
114774 |
|
06-May-2003 |
alc |
Lock the vm_object when performing vm_pager_deallocate().
|
#
114166 |
|
28-Apr-2003 |
alc |
- Lock the vm_object when performing swap_pager_isswapped().
- Assert that the vm_object is locked in swap_pager_isswapped().
|
#
114074 |
|
26-Apr-2003 |
alc |
- Convert vm_object_pip_wait() from using tsleep() to msleep().
- Make vm_object_pip_sleep() static.
- Lock the vm_object when performing vm_object_pip_wait().
|
#
113744 |
|
20-Apr-2003 |
alc |
- Lock the vm_object when performing vm_object_pip_add().
- Remove an unnecessary variable.
|
#
113739 |
|
20-Apr-2003 |
alc |
- Lock the vm_object when performing vm_object_pip_add().
|
#
113722 |
|
19-Apr-2003 |
alc |
- Lock the vm_object when performing vm_object_pip_subtract().
- Assert that the vm_object lock is held in vm_object_pip_subtract().
|
#
113721 |
|
19-Apr-2003 |
alc |
- Lock the vm_object when performing vm_object_pip_wakeupn().
- Assert that the vm_object lock is held in vm_object_pip_wakeupn().
- Add a new macro VM_OBJECT_LOCK_ASSERT().
|
#
111119 |
|
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
#
109623 |
|
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
108600 |
|
03-Jan-2003 |
phk |
Avoid extern decls in .c files by putting them in the vm/swap_pager.h include file where they belong. Share the dmmax_mask variable.
|
#
108589 |
|
03-Jan-2003 |
phk |
Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
|
#
108011 |
|
18-Dec-2002 |
alc |
Hold the page queues lock when performing vm_page_flag_set().
|
#
107913 |
|
15-Dec-2002 |
dillon |
This is David Schultz's swapoff code which I am finally able to commit. This should be considered highly experimental for the moment.
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> MFC after: 3 weeks
|
#
107039 |
|
18-Nov-2002 |
alc |
Remove vm_page_protect(). Instead, use pmap_page_protect() directly.
|
#
106778 |
|
11-Nov-2002 |
cognet |
Remove extra #include<sys/vmmeter.h>.
|
#
104094 |
|
28-Sep-2002 |
phk |
Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too.
Inspired by: FlexeLint warning #512
|
#
103924 |
|
24-Sep-2002 |
jeff |
- Lock access to numoutput on the swap devices.
|
#
102738 |
|
31-Aug-2002 |
dillon |
Reduce the maximum KVA reserved for swap meta structures from 70 to 32 MB. Reduce the swap meta calculation by a factor of 2, it's still massive overkill.
X-MFC after: immediately
|
#
100452 |
|
21-Jul-2002 |
alc |
o Lock page queue accesses by vm_page_free().
|
#
100415 |
|
20-Jul-2002 |
alc |
o Lock page queue accesses by vm_page_try_to_cache(). (The accesses in kern/vfs_bio.c are already locked.)
o Assert that the page queues lock is held in vm_page_try_to_cache().
|
#
98892 |
|
26-Jun-2002 |
iedowse |
Avoid using the 64-bit vm_pindex_t in a few places where 64-bit types are not required, as the overhead is unnecessary:
o In the i386 pmap_protect(), `sindex' and `eindex' represent page indices within the 32-bit virtual address space.
o In swp_pager_meta_build() and swp_pager_meta_ctl(), use a temporary variable to store the low few bits of a vm_pindex_t that gets used as an array index.
o vm_uiomove() uses `osize' and `idx' for page offsets within a map entry.
o In vm_object_split(), `idx' is a page offset within a map entry.
|
#
98891 |
|
26-Jun-2002 |
iedowse |
Use an explicit cast to avoid relying on sign extension to do the right thing in code such as `vm_pindex_t x = ~SWAP_META_MASK'.
Reviewed by: dillon
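
A small sketch of the pitfall this entry guards against, assuming a 32-bit int and a 64-bit index type. pindex_t, META_MASK_INT and META_MASK_UINT are stand-ins chosen for this example rather than the kernel's definitions:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pindex_t;      /* stand-in for a 64-bit vm_pindex_t */

#define META_MASK_INT   15      /* int constant, like an (N - 1) mask */
#define META_MASK_UINT  15u     /* same value as an unsigned int */

/* Correct only because ~15 is a negative int that sign-extends to 64 bits. */
static pindex_t
round_down_implicit(pindex_t idx)
{
    return idx & ~META_MASK_INT;
}

/* Wrong: ~15u is 0xfffffff0, which zero-extends and clears the high bits. */
static pindex_t
round_down_unsigned(pindex_t idx)
{
    return idx & ~META_MASK_UINT;
}

/* Robust: widen to the index type first, then invert. */
static pindex_t
round_down_explicit(pindex_t idx)
{
    return idx & ~(pindex_t)META_MASK_UINT;
}
```

The explicit cast gives the same result whether the mask constant happens to be signed or unsigned, so a later change to the constant's type cannot silently truncate indices above 4G pages.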
|
#
98607 |
|
22-Jun-2002 |
alc |
o Replace GIANT_REQUIRED in swap_pager_alloc() by the acquisition and release of Giant. (Annotate as MPSAFE.)
|
#
93818 |
|
04-Apr-2002 |
jhb |
Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
|
#
92748 |
|
20-Mar-2002 |
jeff |
Remove references to vm_zone.h and switch over to the new uma API.
|
#
92727 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
92654 |
|
19-Mar-2002 |
jeff |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator.
Reviewed by: arch@
|
#
92029 |
|
10-Mar-2002 |
eivind |
- Remove a number of extra newlines that do not belong here according to style(9)
- Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc.
- Add /* SYMBOL */ after a few #endifs.
Reviewed by: alc
|
#
91420 |
|
27-Feb-2002 |
jhb |
Use thread0.td_ucred instead of proc0.p_ucred. This change is cosmetic and isn't strictly required. However, it lowers the number of false positives found when grep'ing the kernel sources for p_ucred to ensure proper locking.
|
#
91063 |
|
22-Feb-2002 |
phk |
GC: BIO_ORDERED, various infrastructure dealing with BIO_ORDERED.
|
#
85016 |
|
15-Oct-2001 |
tegge |
Don't use an uninitialized field reserved for callers in the bio structure passed to swap_pager_strategy(). Instead, use a field reserved for drivers and initialize it before usage.
Reviewed by: dillon
|
#
84827 |
|
11-Oct-2001 |
jhb |
Change the kernel's ucred API as follows:
- crhold() returns a reference to the ucred whose refcount it bumps.
- crcopy() now simply copies the credentials from one credential to another and has no return value.
- a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
|
#
81933 |
|
19-Aug-2001 |
dillon |
Limit the amount of KVM reserved for the buffer cache and for swap-meta information. The default limits only affect machines with > 1GB of ram and can be overridden with two new kernel conf variables VM_SWZONE_SIZE_MAX and VM_BCACHE_SIZE_MAX, or with loader variables kern.maxswzone and kern.maxbcache. This has the effect of leaving more KVM available for sizing NMBCLUSTERS and 'maxusers' and should avoid tripups where a sysad adds memory to a machine and then sees the kernel panic on boot due to running out of KVM.
Also change the default swap-meta auto-sizing calculation to allocate half of what it was previously allocating. The prior defaults were way too high. Note that we cannot afford to run out of swap-meta structures so we still stay somewhat conservative here.
|
#
81029 |
|
02-Aug-2001 |
alfred |
Fixups for the initial allocation by dillon:
1) allocate fewer buckets
2) when failing to allocate swap zone, keep reducing the zone by a third rather than a half in order to reduce the chance of allocating way too little.
I also moved around some code for readability.
Suggested by: dillon Reviewed by: dillon
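
To illustrate why shrinking by a third tends to land closer to the largest size that fits than shrinking by a half, here is a sketch with a made-up success test: max_fit stands in for "the zone allocation succeeds at or below this many entries". Both function names are hypothetical, and the rounded-up third is an assumption of this sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Halve the request until it fits; returns 0 if nothing fits. */
static size_t
shrink_by_half(size_t nentries, size_t max_fit)
{
    while (nentries > max_fit)
        nentries /= 2;
    return nentries;
}

/*
 * Drop a (rounded-up) third of the request until it fits.  Rounding the
 * third up guarantees progress even when nentries is 1 or 2.
 */
static size_t
shrink_by_third(size_t nentries, size_t max_fit)
{
    while (nentries > max_fit)
        nentries -= (nentries + 2) / 3;
    return nentries;
}
```

For a request of 1000 entries with room for 300, halving stops at 250 (1000, 500, 250) while the two-thirds steps stop at 296 (1000, 666, 444, 296), wasting less of the available space.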
|
#
79242 |
|
04-Jul-2001 |
dillon |
whitespace / register cleanup
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
78622 |
|
22-Jun-2001 |
jhb |
- Protect all accesses to nsw_[rw]count{,_{,a}sync} with the pbuf mutex.
- Don't drop the vm mutex while grabbing the pbuf mutex to manipulate said variables.
|
#
77088 |
|
23-May-2001 |
jhb |
- Fix the sw_alloc_interlock to actually lock itself when the lock is acquired.
- Assert Giant is held in the strategy, getpages, and putpages methods and the getchainbuf, flushchainbuf, and waitchainbuf functions.
- Always call flushchainbuf() w/o the VM lock.
|
#
77036 |
|
23-May-2001 |
alfred |
acquire Giant when playing with the buffercache and doing IO. use msleep against the vm mutex while waiting for a page IO to complete.
|
#
77010 |
|
22-May-2001 |
alfred |
acquire vm mutex in swp_pager_async_iodone. Don't call swp_pager_async_iodone with the mutex held.
|
#
76827 |
|
18-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
#
76322 |
|
06-May-2001 |
phk |
Actually biofinish(struct bio *, struct devstat *, int error) is more general than the bioerror().
Most of this patch is generated by scripts.
|
#
75675 |
|
18-Apr-2001 |
alfred |
Protect pager object creation with sx locks.
Protect pager object list manipulation with a mutex.
It doesn't look possible to combine them under a single sx lock because creation may block and we can't have the object list manipulation block on anything other than a mutex because of interrupt requests.
|
#
75474 |
|
13-Apr-2001 |
alfred |
protect pbufs and associated counts with a mutex
|
#
72949 |
|
23-Feb-2001 |
rwatson |
Introduce per-swap area accounting in the VM system, and export this information via the vm.nswapdev sysctl (number of swap areas) and vm.swapdevX nodes (where X is the device), which contain the MIBs dev, blocks, used, and flags. These changes are required to allow top and other userland swap-monitoring utilities to run without setgid kmem.
Submitted by: Thomas Moestl <tmoestl@gmx.net> Reviewed by: freebsd-audit
|
#
69972 |
|
13-Dec-2000 |
tanimura |
- If swap metadata does not fit into the KVM, reduce the number of struct swblock entries by dividing the number of the entries by 2 until the swap metadata fits.
- Reject swapon(2) upon failure of swap_zone allocation.
This is just a temporary fix. Better solutions include: (suggested by: dillon)
o reserving swap in SWAP_META_PAGES chunks, and o swapping the swblock structures themselves.
Reviewed by: alfred, dillon
|
#
69781 |
|
08-Dec-2000 |
dwmalone |
Convert more malloc+bzero to malloc+M_ZERO.
Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>
|
#
68921 |
|
19-Nov-2000 |
rwatson |
o Export dmmax ("Maximum size of a swap block") using SYSCTL_INT. This removes a reason that systat requires setgid kmem. More to come.
|
#
68885 |
|
18-Nov-2000 |
dillon |
Implement a low-memory deadlock solution.
Removed most of the hacks that were trying to deal with low-memory situations prior to now.
The new code is based on the concept that I/O must be able to function in a low memory situation. All major modules related to I/O (except networking) have been adjusted to allow allocation out of the system reserve memory pool. These modules now detect a low memory situation but rather than block they instead continue to operate, then return resources to the memory pool instead of caching them or leaving them wired.
Code has been added to stall in a low-memory situation prior to a vnode being locked.
Thus situations where a process blocks in a low-memory condition while holding a locked vnode have been reduced to near nothing. Not only will I/O continue to operate, but many prior deadlock conditions simply no longer exist.
Implement a number of VFS/BIO fixes
(found by Ian): in biodone(), bogus-page replacement code, the loop was not properly incrementing loop variables prior to a continue statement. We do not believe this code can be hit anyway but we aren't taking any chances. We'll turn the whole section into a panic (as it already is in brelse()) after the release is rolled.
In biodone(), the foff calculation was incorrectly clamped to the iosize, causing the wrong foff to be calculated for pages in the case of an I/O error or biodone() called without initiating I/O. The problem always caused a panic before. Now it doesn't. The problem is mainly an issue with NFS.
Fixed casts for ~PAGE_MASK. This code worked properly before only because the calculations use signed arithmetic. Better to properly extend PAGE_MASK first before inverting it for the 64 bit masking op.
In brelse(), the bogus_page fixup code was improperly throwing away the original contents of 'm' when it did the j-loop to fix the bogus pages. The result was that it would potentially invalidate parts of the *WRONG* page(!), leading to corruption.
There may still be cases where a background bitmap write is being duplicated, causing potential corruption. We have identified a potentially serious bug related to this but the fix is still TBD. So instead this patch contains a KASSERT to detect the problem and panic the machine rather than continue to corrupt the filesystem. The problem does not occur very often; it is very hard to reproduce, and it may or may not be the cause of the corruption people have reported.
Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>) Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>
|
#
68883 |
|
18-Nov-2000 |
dillon |
This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc...
PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>
|
#
67082 |
|
13-Oct-2000 |
dillon |
The swap bitmap allocator was not calculating the bitmap size properly in the face of non-stripe-aligned swap areas. The bug could cause a panic during boot.
Refuse to configure a swap area that is too large (67 GB or so)
Properly document the power-of-2 requirement for SWB_NPAGES.
The patch is slightly different than the one Tor enclosed in the P.R., but accomplishes the same thing.
PR: kern/20273 Submitted by: Tor.Egge@fast.no
|
#
60755 |
|
21-May-2000 |
peter |
Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or its shadow pv_table entry.
Inspired by: John Dyson's 1998 patches.
Also: Eliminate pv_table as a separate thing and build it into a machine dependent part of vm_page_t. This eliminates having a separate set of structures that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha).
Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits).
Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files.
Reviewed by: dillon
|
#
60041 |
|
05-May-2000 |
phk |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>.
<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes.
Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data.
Still a few bogus uses of struct buf to track down.
Repocopy by: peter
|
#
59915 |
|
03-May-2000 |
phk |
Convert the vm_pager_strategy() interface to take a struct bio instead of a struct buf. Don't try to examine B_ASYNC, it is a layering violation to do so. The only current user of this interface is vn(4) which, since it emulates a disk interface, operates on struct bio already.
|
#
59866 |
|
01-May-2000 |
phk |
Move and staticize the bufchain functions so they become local to the only piece of code using them. This will ease a rewrite of them.
|
#
59249 |
|
15-Apr-2000 |
phk |
Complete the bio/buf divorce for all code below devfs::strategy
Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case.
CCD not converted yet, casts to struct buf (still safe)
atapi-cd casts to struct buf to examine B_PHYS
|
#
58934 |
|
02-Apr-2000 |
phk |
Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)
Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.
Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack.
Add bio_queue field for struct bio aware disksort.
Address a lot of stylistic issues brought up by bde.
|
#
58714 |
|
28-Mar-2000 |
dillon |
Misattribution - the excellent SPLASSERT work is being done by Paul Saab <paul@mu.org>, of course!
|
#
58708 |
|
27-Mar-2000 |
dillon |
Add necessary spl protection for swapper. The problem was located by Alfred while testing his SPLASSERT stuff. This is not a complete fix, more protections are probably needed.
|
#
58705 |
|
27-Mar-2000 |
charnier |
Revert spelling mistake I made in the previous commit
Requested by: Alan and Bruce
|
#
58634 |
|
26-Mar-2000 |
charnier |
Spelling
|
#
58462 |
|
22-Mar-2000 |
phk |
Fix one place which knew that B_WRITE was zero.
Fix a stylistic mistake of mine while here.
Found by: Stephen Hocking <shocking@prth.pgs.com>
|
#
58349 |
|
20-Mar-2000 |
phk |
Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
This patch is machine generated except for the ccd.c and buf.h parts.
|
#
58345 |
|
20-Mar-2000 |
phk |
Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes.
Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL.
Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch was machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
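
A minimal sketch of the two ideas in this entry: a zero-valued "flag" can never be detected with a bit test, and a command field can be checked for exactly one set bit. The constants and function names below are illustrative assumptions, not the kernel's actual values or helpers:

```c
#include <assert.h>
#include <stdbool.h>

/* Old style: direction encoded as flag bits, with "write" defined as zero. */
#define OLD_B_READ  0x1u
#define OLD_B_WRITE 0x0u

/* The obvious-looking test is always false, since anything & 0 is 0. */
static bool
old_is_write_buggy(unsigned flags)
{
    return (flags & OLD_B_WRITE) != 0;
}

/* New style: one command field, each command a distinct single bit. */
enum { CMD_READ = 1, CMD_WRITE = 2, CMD_DELETE = 4 };

/* True when exactly one bit is set; rejects 0 and combined values. */
static bool
iocmd_valid(unsigned cmd)
{
    return cmd != 0 && (cmd & (cmd - 1)) == 0;
}
```

With single-bit commands, an uninitialized (zero) or mixed-up command is caught by the validity check instead of silently being treated as a write.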
|
#
58132 |
|
16-Mar-2000 |
phk |
Eliminate the undocumented, experimental, non-delivering and highly dangerous MAX_PERF option.
|
#
55175 |
|
28-Dec-1999 |
peter |
Fix the swap backed vn case - this was broken by my rev 1.128 to swap_pager.c and related commits.
Essentially swap_pager.c is backed out to before the changes, but swapdev_vp is converted into a real vnode with just VOP_STRATEGY(). It no longer abuses specfs vnops and no longer needs a dev_t and /dev/drum (or /dev/swapdev) for the intermediate layer.
This essentially restores the vnode interface as the interface to the bottom of the swap pager, and vm_swap.c provides a clean vnode interface.
This will need to be revisited when we swap to files (vnodes) - which is the other reason for keeping the vnode interface between the swap pager and the swap devices.
OK'ed by: dillon
|
#
53594 |
|
22-Nov-1999 |
phk |
Isolate the swapdev_vp "not quite" vnode in the only source file which needs it now that /dev/drum is gone.
Reviewed by: eivind, peter
|
#
53338 |
|
18-Nov-1999 |
peter |
Remove the non-functional "swap device" userland front-end to the multiplexed underlying swap devices (/dev/drum). The only thing it did was to allow root to open /dev/drum, but not do anything with it. Various utilities used to grovel around in here, but Matt has written a much nicer (and clean) front-end to this for libkvm, and nothing uses the old system any more.
The VM system was calling VOP_STRATEGY() on the vp of the first underlying swap device (not the /dev/drum one, the first real device), and using the VOP system to indirectly (and only) call swstrategy() to choose an underlying device and enqueue it on that device. I have changed it to avoid diverting through the VOP system and to call the only possible target directly, saving a little bit of time and some complexity.
In all, nothing much changes, except some scaffolding to support the roundabout way of calling swstrategy() is gone.
Matt gave me the ok to do this some time ago, and I apologize for taking so long to get around to it.
|
#
52635 |
|
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
51339 |
|
17-Sep-1999 |
dillon |
Fix a number of spl bugs related to reserving and freeing swap space. Swap space can be freed from an interrupt and so swap reservation and freeing must occur at splvm.
Add swap_pager_reserve() code to support a new swap pre-reservation capability for the VN device.
Generally cleanup the swap code by simplifying the swp_pager_meta_build() static function and consolidating the SWAPBLK_NONE test from a bit test to an absolute compare. The bit test was left over from a rejected swap allocation scheme that was not ultimately committed. A few other minor cleanups were also made.
Reorganize the swap strategy code, again for VN support, to not reallocate swap when writing, as this messes up pre-reservation and can fragment I/O unnecessarily as the VN-based disk is messed around with.
Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
50269 |
|
23-Aug-1999 |
bde |
Use devtoname to print dev_t's instead of casting them to u_long for misprinting with %lx.
Cast pointers to intptr_t instead of casting them to long. Cosmetic.
|
#
49949 |
|
17-Aug-1999 |
alc |
Correct an accidental omission of one "vm_page_undirty" replacement from the previous commit.
|
#
49945 |
|
17-Aug-1999 |
alc |
Add the (inline) function vm_page_undirty for clearing the dirty bitmask of a vm_page.
Use it.
Submitted by: dillon
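As a rough illustration of the idea (not the kernel's actual code), a pair of inline helpers can own every write to a page's dirty bitmask, so sanity checks can later be added in one place. The struct, the mask width, and all names ending in `_sketch` are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct vm_page. */
struct page_sketch {
    uint8_t dirty;   /* one bit per sub-page chunk that has been modified */
};

/* Assumed mask width, chosen only for illustration. */
#define PAGE_BITS_ALL_SKETCH 0xffu

/* Mark every chunk of the page dirty. */
static inline void
page_dirty_sketch(struct page_sketch *m)
{
    m->dirty = PAGE_BITS_ALL_SKETCH;
}

/* Clear the whole dirty bitmask. */
static inline void
page_undirty_sketch(struct page_sketch *m)
{
    m->dirty = 0;
}
```

Callers then write `page_undirty_sketch(m)` instead of assigning to the field directly, which keeps the field's representation private to the helpers.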
|
#
48833 |
|
16-Jul-1999 |
alc |
Remove vm_object::last_read. It is used by the old swap pager, but not by the new one, i.e., vm/swap_pager.c rev 1.108.
Reviewed by: dillon@backplane.com
|
#
48289 |
|
27-Jun-1999 |
peter |
Kirk missed a required BUF_KERNPROC(). Even though this is a non-async transfer, the b_iodone hook causes biodone() to release it from interrupt context.
|
#
48225 |
|
26-Jun-1999 |
mckusick |
Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
|
#
46580 |
|
06-May-1999 |
phk |
remove b_proc from struct buf, it's (now) unused.
Reviewed by: dillon, bde
|
#
44739 |
|
14-Mar-1999 |
julian |
Submitted by: Matt Dillon <dillon@freebsd.org> The old VN device broke in -4.x when the definition of B_PAGING changed. This patch fixes this plus implements additional capabilities. The new VN device can be backed by a file ( as per normal ), or it can be directly backed by swap.
Due to dependencies in VM include files (on opt_xxx options) the new vn device cannot be a module yet. This will be fixed in a later commit. This commit is delimited by the tags {PRE,POST}_MATT_VNDEV.
|
#
44179 |
|
21-Feb-1999 |
dillon |
Remove conditional sysctl's
Leave swap_async_max sysctl intact, remove swap_cluster_max sysctl.
Reviewed by: Alan Cox <alc@cs.rice.edu>
|
#
44178 |
|
21-Feb-1999 |
dillon |
Reviewed by: Alan Cox <alc@cs.rice.edu>
Fix problem w/ low-swap/low-memory handling as reported by Bruce Evans.
|
#
44124 |
|
18-Feb-1999 |
dillon |
Limit the number of simultaneous asynchronous swap pager I/Os that can be in progress at any given moment.
Add two swap tuneables to sysctl:
vm.swap_async_max: 4
vm.swap_cluster_max: 16
Recommended values are a cluster size of 8 or 16 pages. async_max is about right for 1-4 swap devices. Reduce to 2 if swap is eating too much bandwidth, or even 1 if swap is both eating too much bandwidth and sitting on a slow network (10BaseT).
The defaults work well across a broad range of configurations and should normally be left alone.
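A minimal sketch of what such a cap amounts to, assuming a simple in-flight counter: the names are hypothetical, and where the kernel would sleep until a slot frees up, this userland version just reports failure.

```c
#include <stdbool.h>

/* Hypothetical limiter state; 'max' plays the role of vm.swap_async_max. */
struct async_limit_sketch {
    int in_flight;   /* asynchronous I/Os currently outstanding */
    int max;         /* upper bound on outstanding I/Os */
};

/* Try to claim a slot for a new asynchronous I/O. */
static bool
async_start_sketch(struct async_limit_sketch *s)
{
    if (s->in_flight >= s->max)
        return false;        /* at the cap: caller must wait or go synchronous */
    s->in_flight++;
    return true;
}

/* Release a slot when an I/O completes. */
static void
async_done_sketch(struct async_limit_sketch *s)
{
    if (s->in_flight > 0)
        s->in_flight--;
}
```

With `max` set to 4, a fifth request is refused until one of the first four completes.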
|
#
43700 |
|
06-Feb-1999 |
dillon |
Add hysteresis to the 'swap_pager_getswapspace; failed' console message. Also widen the hysteresis levels a little ( these really should be dynamically configured ).
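Hysteresis here means the message is re-armed only after free swap climbs well above the level that triggered it, so the console is not flooded while the system hovers near the limit. The sketch below shows the general pattern only; the thresholds and every `_sketch` name are assumptions, not the values or identifiers used in the swap pager.

```c
#include <stdbool.h>

/* Hypothetical thresholds, in swap blocks. */
#define LOW_WATER_SKETCH   128
#define HIGH_WATER_SKETCH  512

struct swap_warn_sketch {
    bool armed;   /* true when the next shortage may be reported */
};

/* Returns true when a message should be printed for this observation. */
static bool
swap_should_warn_sketch(struct swap_warn_sketch *s, long free_blocks)
{
    if (free_blocks >= HIGH_WATER_SKETCH) {
        s->armed = true;     /* recovered: allow a future report */
        return false;
    }
    if (free_blocks < LOW_WATER_SKETCH && s->armed) {
        s->armed = false;    /* report once, then stay quiet */
        return true;
    }
    return false;
}
```

Widening the gap between the two thresholds makes repeated messages rarer, at the cost of a longer silence after a genuine recovery.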
|
#
43287 |
|
27-Jan-1999 |
dillon |
Remove unintended trigraph sequences in comments for -Wall
|
#
43138 |
|
24-Jan-1999 |
dillon |
Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL to use the vm_page_dirty() inline.
The inline can thus do sanity checks ( or not ) over all cases.
|
#
43129 |
|
24-Jan-1999 |
dillon |
vm_pager_put_pages() is passed an rcval array to hold per-page return values. The 'int' return value for the procedure was never used and not well defined in any case when there are mixed errors on pages, so it has been removed. vm_pager_put_pages() and associated vm_pager functions now return void.
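The shape of such an interface can be sketched as follows. This is a hypothetical userland analogue, not the real vm_pager_put_pages() signature: each page gets its own status in `rcval`, and the function itself returns nothing because a single code cannot describe mixed outcomes.

```c
#include <stddef.h>

/* Hypothetical per-page status codes for this sketch. */
enum put_status_sketch { PUT_OK_SKETCH = 0, PUT_ERROR_SKETCH = 1 };

/*
 * Pretend to write 'count' pages; page_ok[i] says whether page i can be
 * written. Results are reported per page through rcval.
 */
static void
put_pages_sketch(const int *page_ok, size_t count, int *rcval)
{
    for (size_t i = 0; i < count; i++)
        rcval[i] = page_ok[i] ? PUT_OK_SKETCH : PUT_ERROR_SKETCH;
}
```

A caller that needs an overall verdict scans `rcval` and applies its own policy, for example retrying only the failed pages.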
|
#
42966 |
|
21-Jan-1999 |
dillon |
The default_pager's interaction with the swap_pager has been reorganized, and the swap_pager has been completely replaced.
The new swap pager uses the new blist radix-tree based bitmap allocator for low level swap allocation and deallocation. The new allocator is effectively O(5) while the old one was O(N), and the new allocator allocates all required memory at init time rather than allocating memory on the fly at run time.
Swap metadata is allocated in clusters and stored in a hash table, eliminating linearly allocated structures.
Many, many features have been rewritten or added. Swap space is now reallocated on the fly, providing a poor man's auto defragmentation of swap space. Swap space that is no longer needed is freed on a timely basis so no garbage collection is necessary.
Swap I/O is marked B_ASYNC and NFS has been fixed to do the right thing with it, so NFS-based paging now has around 10x the performance it had before ( previously NFS enforced synchronous I/O for paging ).
|
#
42957 |
|
21-Jan-1999 |
dillon |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
42453 |
|
09-Jan-1999 |
eivind |
KNFize, by bde.
|
#
42408 |
|
08-Jan-1999 |
eivind |
Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers.
Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic.
Reviewed by: msmith
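The double parentheses in the KASSERT form let a printf-style argument list pass through a two-argument macro. A userland approximation, assuming a hypothetical `panic_sketch()` in place of the kernel's panic() and an `INVARIANTS_SKETCH` switch in place of the real option, might look like this:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Comment this out to compile the checks away entirely. */
#define INVARIANTS_SKETCH 1

/* Userland stand-in for the kernel's panic(); hypothetical name. */
static void
panic_sketch(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    abort();
}

#ifdef INVARIANTS_SKETCH
/* 'msg' is a parenthesized argument list, pasted directly after the name. */
#define KASSERT_SKETCH(exp, msg) \
    do { if (!(exp)) panic_sketch msg; } while (0)
#else
#define KASSERT_SKETCH(exp, msg) do { } while (0)
#endif

/* Example caller: the check costs nothing when the option is off. */
static int
checked_div_sketch(int a, int b)
{
    KASSERT_SKETCH(b != 0, ("checked_div_sketch: divide by zero, a=%d", a));
    return a / b;
}
```

Because the message and its arguments sit inside the macro's second parameter, they disappear from the build along with the check when the option is disabled.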
|
#
42153 |
|
29-Dec-1998 |
dt |
Don't free swap in swap_pager_getpages(): this code probably causes the "dying daemons" problem. (I thought this code was introduced in rev.1.80, but it just relaxed the condition.)
Also, kill related "suggest more swap space" warning (also introduced in 1.80). It was confusing, to say the least...
Requested by: msmith Not objected by: dg
|
#
41250 |
|
19-Nov-1998 |
bde |
Fixed a null pointer panic in spc_free(). swap_pager_putpages() almost always causes this panic for the curproc != pageproc case. This case apparently doesn't happen in normal operation, but it happens when vm_page_alloc_contig() is called when there is a memory hogging application that hasn't already been paged out.
PR: 8632 Reviewed by: info@opensound.com (Dev Mazumdar), dg Broken in: rev.1.89 (1998/02/23)
|
#
40790 |
|
31-Oct-1998 |
peter |
Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.
|
#
40286 |
|
13-Oct-1998 |
dg |
Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.
|
#
38799 |
|
04-Sep-1998 |
dfr |
Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.
|
#
38517 |
|
24-Aug-1998 |
dfr |
Change various syscalls to use size_t arguments instead of u_int.
Add some overflow checks to read/write (from bde).
Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable.
Reviewed by: Bruce Evans <bde@zeta.org.au>
|
#
38298 |
|
13-Aug-1998 |
dfr |
Protect all modifications to paging_in_progress with splvm().
|
#
37918 |
|
28-Jul-1998 |
bde |
Fixed two spl nesting bugs. They caused (at least) the entire pageout daemon to run at splvm() forever after swap_pager_putpages() is called from vm_pageout_scan().
Broken in: rev.1.89 (1998/02/23)
|
#
37555 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
|
#
37384 |
|
04-Jul-1998 |
julian |
VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want (and needs to be frobbed). More cleanups will follow this.
Reviewed by: Bruce Evans <bde@freebsd.org>
|
#
35669 |
|
04-May-1998 |
dyson |
Work around some VM bugs, the worst being an overly aggressive swap space free calculation. More complete fixes will be forthcoming, in a week.
|
#
35497 |
|
29-Apr-1998 |
dyson |
Tighten up management of memory and swap space during map allocation, deallocation cycles. This should provide a measurable improvement on swap and memory allocation on loaded systems. It is unlikely to be a complete solution. Also, provide more map info with procfs. Chuck Cranor spurred on this improvement.
|
#
35210 |
|
15-Apr-1998 |
bde |
Support compiling with `gcc -ansi'.
|
#
34206 |
|
07-Mar-1998 |
dyson |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated.
1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid.
2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them.
3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state.
4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances.
5) Remove a gratuitous buffer wakeup in vfs_vmio_release.
6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version.
7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation.
8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed.
9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals.
10) Correct and clean-up spec_getpages.
11) Implement a fully functional nfs_getpages, nfs_putpages.
12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.)
13) Properly support MS_INVALIDATE on NFS.
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean.
15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads.
16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.)
17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.)
18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
#
33936 |
|
01-Mar-1998 |
dyson |
1) Use a more consistent page wait methodology.
2) Do not unnecessarily force page blocking when paging pages out.
3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.)
4) Enable vfs_ioopt=1 as a default.
5) Fix and enable the page prezeroing in SMP mode.
All in all, SMP systems especially should show a significant improvement in "snappiness."
|
#
33817 |
|
25-Feb-1998 |
dyson |
Fix page prezeroing for SMP, and fix some potential paging-in-progress hangs. The paging-in-progress diagnosis was a result of Tor Egge's excellent detective work. Submitted by: Partially from Tor Egge.
|
#
33758 |
|
23-Feb-1998 |
dyson |
Significantly improve the efficiency of the swap pager, which appears to have declined due to code-rot over time. The swap pager rundown code has been cleaned up, and unneeded wakeups removed. Lots of splbio's are changed to splvm's. Also, set the dynamic tunables for the pageout daemon to be more sane for larger systems (thereby decreasing the daemon overhead.)
|
#
33181 |
|
09-Feb-1998 |
eivind |
Staticize.
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
33034 |
|
02-Feb-1998 |
dyson |
This fix should help the panic problems in -current. There were some errors in "interval" management. Due to the clustering mechanism, the code is necessarily complex and error prone.
|
#
32952 |
|
01-Feb-1998 |
dyson |
Fix a performance problem caused by an earlier commit.
|
#
32937 |
|
31-Jan-1998 |
dyson |
Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes."
Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.
|
#
32702 |
|
22-Jan-1998 |
dyson |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous.
2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.)
13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
#
32585 |
|
17-Jan-1998 |
dyson |
Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management.
The system should be much more stable, but not all subtle bugs are fixed yet.
|
#
31970 |
|
24-Dec-1997 |
dyson |
Support running with inadequate swap space. Additionally, the code will complain with a suggestion of increasing it.
|
#
31493 |
|
02-Dec-1997 |
phk |
In all such uses of struct buf: 's/b_un.b_addr/b_data/g'
|
#
28992 |
|
01-Sep-1997 |
bde |
Removed unused #includes.
|
#
28990 |
|
01-Sep-1997 |
bde |
Print a device number in hex instead of decimal.
|
#
28751 |
|
25-Aug-1997 |
bde |
Fixed type mismatches for functions with args of type vm_prot_t and/or vm_inherit_t. These types are smaller than ints, so the prototypes should have used the promoted type (int) to match the old-style function definitions. They use just vm_prot_t and/or vm_inherit_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change the small types to u_int, to optimize for time instead of space.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
21529 |
|
11-Jan-1997 |
dyson |
Prepare better for multi-platform by eliminating another required pmap routine (pmap_is_referenced.) Upper level recoded to use pmap_ts_referenced.
|
#
18893 |
|
12-Oct-1996 |
bde |
Removed __pure's and __pure2's. __pure is a no-op for recent versions of gcc by definition, and __pure2 is a no-op in effect (presumably the compiler can see when an inline function has no side effects).
|
#
18169 |
|
08-Sep-1996 |
dyson |
Addition of page coloring support. Various levels of coloring are afforded. The default level works with minimal overhead, but one can also enable full, efficient use of a 512K cache. (Parameters can be generated to support arbitrary cache sizes also.)
|
#
17334 |
|
30-Jul-1996 |
dyson |
Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
|
#
17294 |
|
27-Jul-1996 |
dyson |
This commit is meant to solve a couple of VM system problems or performance issues.
1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines.
2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations.
3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)).
4) Changed pv queue manipulations/structures to be TAILQ's.
5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one.
6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation.
7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
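The saving described in item 6 comes from folding the test and the clear into one walk of the per-page mapping list. The following sketch uses a hypothetical singly linked list and flag layout rather than the real pv entry structures, and shows the combined operation only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pv list: one entry per mapping of a page. */
struct pv_sketch {
    unsigned pte_bits;          /* simplified per-mapping PTE flag bits */
    struct pv_sketch *next;
};

#define PTE_MODIFIED_SKETCH 0x1u

/*
 * One traversal: report whether any mapping had the bit set, clearing it
 * along the way, instead of a test pass followed by a clear pass.
 */
static bool
test_and_clear_sketch(struct pv_sketch *head, unsigned bit)
{
    bool was_set = false;

    for (struct pv_sketch *pv = head; pv != NULL; pv = pv->next) {
        if (pv->pte_bits & bit) {
            was_set = true;
            pv->pte_bits &= ~bit;
        }
    }
    return was_set;
}
```

Unlike a pure test, the combined walk cannot stop at the first hit, since every mapping must have the bit cleared; the gain is that the list is visited once rather than twice.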
|
#
16274 |
|
10-Jun-1996 |
dyson |
Mostly superficial code improvements, add a diagnostic. The code improvements include significant simplification of the reservation of the swap pager control blocks for reads. Add a panic for an inconsistent swap pager control block count.
|
#
15873 |
|
22-May-1996 |
dyson |
Initial support for MADV_FREE, i.e. support for pages whose contents we no longer care about. This gives us a lot of the advantage of freeing individual pages through munmap, but with almost none of the overhead.
|
#
15809 |
|
18-May-1996 |
dyson |
This set of commits to the VM system does the following, and contains contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:
More usage of the TAILQ macros. Additional minor fix to queue.h.
Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm.
Significant revamp of the pmap/fork code:
1) PTE's and UPAGES's are NO LONGER in the process's map.
2) PTE's and UPAGES's reside in their own objects.
3) TOTAL elimination of recursive page table pagefaults.
4) The page directory now resides in the PTE object.
5) Implemented pmap_copy, thereby speeding up fork time.
6) Changed the pv entries so that the head is a pointer and not an entire entry.
7) Significant cleanup of pmap_protect, and pmap_remove.
8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module.
9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed.
Performance and code cleanups in vm_map:
1) Improved and simplified allocation of map entries.
2) Improved vm_map_copy code.
3) Corrected some minor problems in the simplify code.
Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh.
Improved the speed of and simplified kmem_malloc.
Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existent condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.)
Minor reorg of source to perhaps improve locality of ref.
|
#
15583 |
|
03-May-1996 |
phk |
Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
|
#
15543 |
|
02-May-1996 |
phk |
removed: CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei() ptei() kvtopte() ptetov() ispt() ptetoav() &c &c new: NPDEPG
Major macro cleanup.
|
#
14396 |
|
06-Mar-1996 |
dyson |
Fix a problem in the swap pager that caused some of the pages that were paged in under low swap space conditions to lose both their backing store and their dirty bits. This would cause pages to be demand zeroed under certain conditions in low VM space conditions, with consequent sig-11's or sig-10's. This situation was made worse lately when the swap space reclaim threshold was increased.
|
#
14364 |
|
03-Mar-1996 |
dyson |
In order to fix some concurrency problems with the swap pager early on in the FreeBSD development, I had made a global lock around the rlist code. This was bogus, and now the lock is maintained on a per resource list basis. This now allows the rlist code to be used for almost any non-interrupt level application.
|
#
14316 |
|
02-Mar-1996 |
dyson |
1) Eliminate unnecessary bzero of UPAGES.
2) Eliminate unnecessary copying of pages during/after forks.
3) Add user map simplification.
|
#
13790 |
|
31-Jan-1996 |
dg |
"out of space" -> "out of swap space".
|
#
13490 |
|
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
|
#
12904 |
|
17-Dec-1995 |
bde |
Fixed 1TB filesize changes. Some pindexes had bogus names and types but worked because vm_pindex_t is indistinguishable from vm_offset_t.
|
#
12820 |
|
14-Dec-1995 |
phk |
Another mega commit to staticize things.
|
#
12819 |
|
14-Dec-1995 |
phk |
A Major staticize sweep. Generates a couple of warnings that I'll deal with later. A number of unused vars removed. A number of unused procs removed or #ifdefed.
|
#
12779 |
|
11-Dec-1995 |
dyson |
Some new anti-deadlock code ended up messing up the paging stats. A modified version of the code is now in place, and gausspage performance is back up to where it should be.
|
#
12767 |
|
11-Dec-1995 |
dyson |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
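The arithmetic behind this is that a 32-bit byte offset tops out at 4 GB, whereas a 32-bit page index with 4 KB pages spans 2^32 * 2^12 = 2^44 bytes. The conversion can be sketched as below; the page size, the index width, and all `_sketch` names are assumptions for illustration, not the kernel's definitions.

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12            /* assumes 4 KB pages */

typedef uint32_t pindex_sketch_t;       /* assumed 32-bit page index */

/* Convert a 64-bit byte offset within an object to a page index. */
static pindex_sketch_t
off_to_idx_sketch(uint64_t off)
{
    return (pindex_sketch_t)(off >> PAGE_SHIFT_SKETCH);
}

/* Convert a page index back to the byte offset of the page's start. */
static uint64_t
idx_to_off_sketch(pindex_sketch_t idx)
{
    return (uint64_t)idx << PAGE_SHIFT_SKETCH;
}
```

Under these assumptions, a 1 TB offset (2^40 bytes) maps to page index 2^28, comfortably inside the 32-bit index range.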
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12591 |
|
03-Dec-1995 |
bde |
Completed function declarations and/or added prototypes.
Staticized some functions.
__purified some functions. Some functions were bogusly declared as returning `const'. This hasn't done anything since gcc-2.5. For later versions of gcc, the equivalent is __attribute__((const)) at the end of function declarations.
|
#
12423 |
|
20-Nov-1995 |
phk |
Remove unused vars & funcs, make things static, protoize a little bit.
|
#
12325 |
|
16-Nov-1995 |
bde |
Fixed recent staticizations. Some prototypes for static functions were left in headers and not staticized.
|
#
12300 |
|
14-Nov-1995 |
phk |
staticize.
|
#
12006 |
|
02-Nov-1995 |
dg |
Move page fixups (pmap_clear_modify, etc) that happen after paging input completes out of vm_fault and into the pagers. This gets rid of some redundancy and improves the architecture.
Reviewed by: John Dyson <dyson>
|
#
10984 |
|
24-Sep-1995 |
dg |
Check that the swap block is valid before including it in a cluster.
Submitted by: John Dyson
|
#
10670 |
|
10-Sep-1995 |
dyson |
Make sure that the prezero flag is cleared when needed.
|
#
10579 |
|
06-Sep-1995 |
dyson |
Fixed a sign reversal problem -- might have caused some Sig-11s that people have been seeing.
|
#
10556 |
|
04-Sep-1995 |
dyson |
Allow the fault code to use additional clustering info from both bmap and the swap pager. Improved fault clustering performance.
|
#
9548 |
|
16-Jul-1995 |
dg |
1) Merged swpager structure into vm_object.
2) Changed swap_pager internal interfaces to cope w/#1.
3) Eliminated object->copy as we no longer have copy objects.
4) Minor stylistic changes.
|
#
9507 |
|
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although it will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes)
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8585 |
|
18-May-1995 |
dg |
Accessing pages beyond the end of a mapped file results in internal inconsistencies in the VM system that eventually lead to a panic. These changes fix the behavior to conform to the behavior in SunOS, which is to deny faults to pages beyond the EOF (returning SIGBUS). Internally, this is implemented by requiring faults to be within the object size boundaries. These changes exposed another bug, namely that passing in an offset to mmap when trying to map an unnamed anonymous region also results in internal inconsistencies. In this case, the offset is forced to zero.
Reviewed by: John Dyson and others
|
#
8504 |
|
14-May-1995 |
dg |
Changed swap partition handling/allocation so that it doesn't require specific partitions be mentioned in the kernel config file ("swap on foo" is now obsolete).
From Poul-Henning:
The visible effect is this:
As default, unless options "NSWAPDEV=23" is in your config, you will have four swap-devices. You can swapon(2) any block device you feel like, it doesn't have to be in the kernel config.
There is a performance/resource win available by getting the NSWAPDEV right (but only if you have just one swap-device ??), but using that as default would be too restrictive.
The invisible effect is that:
Swap-handling disappears from the $arch part of the kernel. It gets a lot simpler (-145 lines) and cleaner.
Reviewed by: John Dyson, David Greenman Submitted by: Poul-Henning Kamp, with minor changes by me.
|
#
8416 |
|
10-May-1995 |
dg |
Changed "handle" from type caddr_t to void *; "handle" is several different types of pointers, and "char *" is a bad choice for the type.
|
#
8319 |
|
07-May-1995 |
dyson |
Another error in the correction for trimming swap allocation for small objects. (This code needs to be revisited.)
|
#
8315 |
|
07-May-1995 |
dyson |
Fixed a calculation that would once-in-a-while cause the swap_pager to emit spurious page outside of object type messages. It is not a fatal condition anyway, so the message will be omitted for release. Also, the code that "clips" the allocation size, associated with the above problem, was fixed.
|
#
7935 |
|
19-Apr-1995 |
dg |
New flag: B_PAGING. Added as part of the vn driver hack.
|
#
7887 |
|
16-Apr-1995 |
dg |
Removed obsolete/unused variable declarations. Removed some extern declarations and included the correct include files.
|
#
7883 |
|
16-Apr-1995 |
dg |
Moved some zero-initialized variables into .bss. Made code intended to be called only from DDB #ifdef DDB. Removed some completely unused globals.
|
#
7240 |
|
22-Mar-1995 |
dg |
Added a check for wrong object size; print a warning, but deal with it correctly. The warning will tell us that there is a bug somewhere else in sizing the object correctly.
Submitted by: John Dyson
|
#
7170 |
|
19-Mar-1995 |
dg |
Removed redundant newlines that were in some panic strings.
|
#
7007 |
|
11-Mar-1995 |
dg |
Clear OBJ_INTERNAL flag for device pager objects and named anonymous objects.
|
#
6816 |
|
01-Mar-1995 |
dg |
Various changes from John and myself that do the following:
New functions created: vm_object_pip_wakeup and pagedaemon_wakeup, which are used to reduce the actual number of wakeups. New function vm_page_protect, which is used in conjunction with some new page flags to reduce the number of calls to pmap_page_protect. Minor changes to reduce unnecessary spl nesting. Rewrote vm_page_alloc() to improve readability. Various other mostly cosmetic changes.
|
#
6703 |
|
25-Feb-1995 |
dg |
Fixed severely broken printf (arguments out of order, no newline).
|
#
6618 |
|
22-Feb-1995 |
dg |
Only do object paging_in_progress wakeups if someone is waiting on this condition.
Submitted by: John Dyson
|
#
6585 |
|
20-Feb-1995 |
dg |
Deprecated remaining use of vm_deallocate. Deprecated vm_allocate_with_pager(). Almost completely rewrote vm_mmap(); when John gets done with the bottom half, it will be a complete rewrite. Deprecated most use of vm_object_setpager(). Removed side effect of setting object persist in vm_object_enter and moved this into the pager(s). A few other cosmetic changes.
|
#
6129 |
|
02-Feb-1995 |
dg |
swap_pager.c: Fixed long standing bug in freeing swap space during object collapses. Fixed 'out of space' messages from printing out too often. Modified to use new kmem_malloc() calling convention. Implemented an additional stat in the swap pager struct to count the amount of space allocated to that pager. This may be removed at some point in the future. Minimized unnecessary wakeups.
vm_fault.c: Don't try to collect fault stats on 'swapped' processes - there aren't any upages to store the stats in. Changed read-ahead policy (again!).
vm_glue.c: Be sure to gain a reference to the process's map before swapping. Be sure to lose it when done.
kern_malloc.c: Added the ability to specify if allocations are at interrupt time or are 'safe'; this affects what types of pages can be allocated.
vm_map.c: Fixed a variety of map lock problems; there's still a lurking bug that will eventually bite.
vm_object.c: Explicitly initialize the object fields rather than bzeroing the struct. Eliminated the 'rcollapse' code and folded its functionality into the "real" collapse routine. Moved an object_unlock() so that the backing_object is protected in the qcollapse routine. Make sure nobody fools with the backing_object when we're destroying it. Added some diagnostic code which can be called from the debugger that looks through all the internal objects and makes certain that they all belong to someone.
vm_page.c: Fixed a rather serious logic bug that would result in random system crashes. Changed pagedaemon wakeup policy (again!).
vm_pageout.c: Removed unnecessary page rotations on the inactive queue. Changed the number of pages to explicitly free to just free_reserved level.
Submitted by: John Dyson
|
#
5841 |
|
24-Jan-1995 |
dg |
Added ability to detect sequential faults and DTRT. (swap_pager.c) Added hook for pmap_prefault() and use symbolic constant for new third argument to vm_page_alloc() (vm_fault.c, various) Changed the way that upages and page tables are held. (vm_glue.c) Fixed architectural flaw in allocating pages at interrupt time that was introduced with the merged cache changes. (vm_page.c, various) Adjusted some algorithms to achieve better paging performance and to accommodate the fix for the architectural flaw mentioned above. (vm_pageout.c) Fixed pbuf handling problem, changed policy on handling read-behind page. (vnode_pager.c)
Submitted by: John Dyson
|
#
5464 |
|
10-Jan-1995 |
dg |
Fixed some formatting weirdness that I overlooked in the previous commit.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c vm_map.c Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
5202 |
|
23-Dec-1994 |
dg |
Initialize b_vnbuf.le_next before returning a new buffer in getpbuf and trypbuf. Move a couple of splbio's to be slightly less conservative.
|
#
5186 |
|
22-Dec-1994 |
dg |
Fixed a benign off by one error.
|
#
5166 |
|
18-Dec-1994 |
dg |
Don't ever clear B_BUSY on a pbuf (or any other flag for that matter). This appears to be the cause of some buffer confusion that leads to a panic during heavy paging.
Submitted by: John Dyson
|
#
4440 |
|
13-Nov-1994 |
dg |
Fixed bugs in accounting of swap space that resulted in the pager thinking it was out of space when it really wasn't.
Submitted by: John Dyson
|
#
4207 |
|
06-Nov-1994 |
dg |
Fixed return status from pagers. Ahem...the previous method would manufacture data when it couldn't get it legitimately. :-(
Submitted by: John Dyson
|
#
3841 |
|
25-Oct-1994 |
dg |
Improved I/O error reporting.
|
#
3766 |
|
22-Oct-1994 |
dg |
Various changes to allow operation without any swapspace configured. Note that this is intended for use only in floppy situations and is done at the sacrifice of performance in that case (in other words, this is not the best solution, but works okay for this exceptional situation).
Submitted by: John Dyson
|
#
3612 |
|
15-Oct-1994 |
dg |
1) Some of the counters in the vmmeter struct don't fit well into the Mach VM scheme of things, so I've changed them to be more appropriate. Page ins/outs are now associated with the pager that did them. Nuked v_fault as the only fault of interest that wouldn't be already counted in v_trap is a VM fault, and this is counted separately. 2) Implemented most of the remaining counters and corrected the counting of some that were done wrong. They are all almost correct now...just a few minor ones left to fix.
|
#
3591 |
|
14-Oct-1994 |
dg |
Got rid of redundant declaration warnings.
|
#
3573 |
|
13-Oct-1994 |
dg |
Fixed bug where page modifications would be lost when swap space was almost depleted.
Reviewed by: John Dyson
|
#
3451 |
|
09-Oct-1994 |
dg |
Got rid of map.h. It's a leftover from the rmap code, and we use rlists. Changed swapmap into swaplist.
|
#
3449 |
|
08-Oct-1994 |
phk |
Cosmetics: unused vars, ()'s, #include's &c &c to silence gcc. Reviewed by: davidg
|
#
3083 |
|
25-Sep-1994 |
dg |
Disabled swap anti-fragmentation code. It reduces swap paging performance by 20% in my tests, and it appears to be the cause of a swap leak.
Submitted by: John Dyson
|
#
2386 |
|
29-Aug-1994 |
dg |
Patches from John Dyson to improve swap code efficiency. Religiously add back pmap_clear_modify() in vnode_pager_input until we figure out why system performance isn't what we expect.
Submitted by: John Dyson (swap_pager) & David Greenman (vnode_pager)
|
#
2112 |
|
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
1895 |
|
07-Aug-1994 |
dg |
Provide support for upcoming merged VM/buffer cache, and fixed a few bugs that haven't appeared to manifest themselves (yet).
Submitted by: John Dyson
|
#
1887 |
|
06-Aug-1994 |
dg |
Incorporated post 1.1.5 work from John Dyson. This includes performance improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/pmap_kremove. These routines allow fast mapping of pages for those architectures that have "normal" MMUs. Also included is a fix to the pageout daemon to properly check a queue end condition.
Submitted by: John Dyson
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1810 |
|
01-Aug-1994 |
dg |
Removed all code related to the pagescan daemon, and changed 'act_count' adjustments to compensate for a world without the pagescan daemon.
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|