#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
258870 |
|
03-Dec-2013 |
jhb |
MFC 253471,253620,254430,254538: Change mmap() to more optimally use superpages and provide support for tweaking alignment of virtual mappings.
- Add a new address space allocation method (VMFS_OPTIMAL_SPACE) for vm_map_find() that will try to alter the alignment of a mapping to match any existing superpage mappings of the object being mapped. If no suitable address range is found with the necessary alignment, vm_map_find() will fall back to using the simple first-fit strategy (VMFS_ANY_SPACE).
- Change mmap() without MAP_FIXED, shmat(), shm_map(), and the GEM mapping ioctl to use VMFS_OPTIMAL_SPACE instead of VMFS_ANY_SPACE.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n). Requests for n >= number of bits in a pointer or less than the size of a page fail with EINVAL. This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used to optimize the chances of using large pages. By default it will align the mapping on a large page boundary (the system is free to choose any large page size to align to that seems best for the mapping request). However, if the object being mapped is already using large pages, then it will align the virtual mapping to match the existing large pages in the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment. MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than explicitly using VMFS_SUPER_SPACE. All device objects are forced to use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively equivalent.
PR: ports/184173 (exp-run)
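A minimal userland sketch of the MAP_ALIGNED(n) interface described above; the 1 << 21 (2 MB) alignment is an arbitrary example value, not taken from the commit.

    #include <sys/mman.h>
    #include <stddef.h>

    /* Request an anonymous mapping aligned on a 2 MB boundary. */
    void *
    map_aligned_2mb(size_t len)
    {
            return (mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_PRIVATE | MAP_ALIGNED(21), -1, 0));
    }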
|
#
254089 |
|
08-Aug-2013 |
kib |
MFC r253190: Add the thread owner of the MAP_ENTRY_IN_TRANSITION flag to struct vm_map_entry. In vm_map_wire() and vm_map_unwire(), only process the entries whose transition owner is the current thread.
|
#
245787 |
|
22-Jan-2013 |
zont |
MFC r240145: - Simplify VM code by using vmspace_wired_count() for counting wired memory of a process.
MFC r245255: - Reduce kernel size by removing unnecessary pointer indirections.
GENERIC kernel size reduced by 16 bytes and the RACCT kernel by 336 bytes.
MFC r245296: - Improve readability of sys_obreak().
MFC r245421: - Get rid of unused function vmspace_wired_count().
|
#
239565 |
|
22-Aug-2012 |
mdf |
MFC r238502:
Fix a bug with memguard(9) on 32-bit architectures without a VM_KMEM_MAX_SIZE.
The code was not taking into account the size of the kernel_map, which the kmem_map is allocated from, so it could produce a sub-map size too large to fit. The simplest solution is to ignore VM_KMEM_MAX entirely and base the memguard map's size off the kernel_map's size, since this is always relevant and always smaller.
Found by: Justin Hibbits
|
#
235876 |
|
24-May-2012 |
alc |
MFC r235230: Give vm_fault()'s sequential access optimization a makeover.
There are two aspects to the sequential access optimization: (1) read ahead of pages that are expected to be accessed in the near future and (2) unmap and cache behind of pages that are not expected to be accessed again. This revision changes both aspects.
The read ahead optimization is now more effective. It starts with the same initial read window as before, but arithmetically grows the window on sequential page faults. This can yield increased read bandwidth. For example, on one of my machines, a program using mmap() to read a file that is several times larger than the machine's physical memory takes about 17% less time to complete.
The unmap and cache behind optimization is now more selectively applied. The read ahead window must grow to its maximum size before unmap and cache behind is performed. This significantly reduces the number of times that pages are unmapped and cached only to be reactivated a short time later.
The unmap and cache behind optimization now clears each page's referenced flag. Previously, in the case of dirty pages, if the containing file was still mapped at the time that the page daemon examined the dirty pages, they would be reactivated.
From a stylistic standpoint, this revision also cleanly separates the implementation of the read ahead and unmap/cache behind optimizations.
|
#
233001 |
|
15-Mar-2012 |
kib |
MFC r232071: Account the writeable shared mappings backed by file in the vnode v_writecount.
MFC r232103: Place the if() at the right location.
MFC note: the added struct vm_object un_pager.vnp.writemappings member is located after the fields of struct vm_object that could be accessed from the modules.
|
#
231889 |
|
17-Feb-2012 |
kib |
MFC r231526: Close a race due to dropping of the map lock between creating map entry for a shared mapping and marking the entry for inheritance.
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
219819 |
|
21-Mar-2011 |
jeff |
- Merge changes to the base system to support OFED. These include a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.
|
#
219124 |
|
01-Mar-2011 |
brucec |
Change the return type of vmspace_swap_count to a long to match the other vmspace_*_count functions.
MFC after: 3 days
|
#
218966 |
|
23-Feb-2011 |
brucec |
Calculate and return the count in vmspace_swap_count as a vm_offset_t instead of an int to avoid overflow.
While here, clean up some style(9) issues.
PR: kern/152200 Reviewed by: kib MFC after: 2 weeks
|
#
216604 |
|
20-Dec-2010 |
alc |
Introduce vm_fault_hold() and use it to (1) eliminate a long-standing race condition in proc_rwmem() and to (2) simplify the implementation of the cxgb driver's vm_fault_hold_user_pages(). Specifically, in proc_rwmem() the requested read or write could fail because the targeted page could be reclaimed between the calls to vm_fault() and vm_page_hold().
In collaboration with: kib@ MFC after: 6 weeks
|
#
216335 |
|
09-Dec-2010 |
mlaier |
Fix a long-standing (from the original 4.4BSD Lite sources) race between vmspace_fork and vm_map_wire that would lead to "vm_fault_copy_wired: page missing" panics. While faulting in pages for a map entry that is being wired down, mark the containing map as busy. In vmspace_fork wait until the map is unbusy, before we try to copy the entries.
Reviewed by: kib MFC after: 5 days Sponsored by: Isilon Systems, Inc.
|
#
216128 |
|
02-Dec-2010 |
trasz |
Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage.
Reviewed by: kib@ Tested by: pho@ Sponsored by: FreeBSD Foundation
|
#
214144 |
|
21-Oct-2010 |
jhb |
- Make 'vm_refcnt' volatile so that compilers won't be tempted to treat its value as a loop invariant. Currently this is a no-op because 'atomic_cmpset_int()' clobbers all memory on current architectures.
- Use atomic_fetchadd_int() instead of an atomic_cmpset_int() loop to drop a reference in vmspace_free().
Reviewed by: alc MFC after: 1 month
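To illustrate the second point in portable C11 atomics (an illustration only; the kernel uses machine/atomic.h): a fetch-add drops the reference with one unconditional atomic op, while a cmpset loop may retry under contention.

    #include <stdatomic.h>
    #include <stdbool.h>

    static atomic_int vm_refcnt;

    /* cmpset-style loop: may spin when other CPUs race on the counter. */
    static bool
    drop_ref_cmpset(void)
    {
            int old = atomic_load(&vm_refcnt);

            while (!atomic_compare_exchange_weak(&vm_refcnt, &old, old - 1))
                    ;       /* 'old' is reloaded on each failure */
            return (old == 1);      /* dropped the last reference? */
    }

    /* fetchadd-style: a single atomic subtract, never retries. */
    static bool
    drop_ref_fetchadd(void)
    {
            return (atomic_fetch_add(&vm_refcnt, -1) == 1);
    }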
|
#
212868 |
|
19-Sep-2010 |
alc |
Make refinements to r212824. In particular, don't make vm_map_unlock_nodefer() part of the synchronization interface for maps.
Add comments to vm_map_unlock_and_wait() and vm_map_wakeup() describing how they should be used. In particular, describe the deferred deallocations issue with vm_map_unlock_and_wait().
Redo the implementation of vm_map_unlock_and_wait() so that it passes along the caller's file and line information, just like the other map locking primitives.
Reviewed by: kib X-MFC after: r212824
|
#
212824 |
|
18-Sep-2010 |
kib |
Adopt the deferring of object deallocation for the deleted map entries on map unlock to the lock downgrade and later read unlock operation.
System map entries cannot be backed by OBJT_VNODE objects, no need to defer deallocation for them. Map entries from user maps do not require the owner map for deallocation, and can be accumulated in the thread-local list for freeing when a user map is unlocked.
Move the collection of entries for deferred reclamation into vm_map_delete(). Create helper vm_map_process_deferred(), that is called from locations where processing is feasible. Do not process deferred entries in vm_map_unlock_and_wait() since map_sleep_mtx is held.
Reviewed by: alc, rstone (previous versions) Tested by: pho MFC after: 2 weeks
|
#
206819 |
|
18-Apr-2010 |
jmallett |
o) Add a VM find-space option, VMFS_TLB_ALIGNED_SPACE, which searches the address space for an address as aligned by the new pmap_align_tlb() function, which is for constraints imposed by the TLB. [1]
o) Add a kmem_alloc_nofault_space() function, which acts like kmem_alloc_nofault() but allows the caller to specify which find-space option to use. [1]
o) Use kmem_alloc_nofault_space() with VMFS_TLB_ALIGNED_SPACE to allocate the kernel stack address on MIPS. [1]
o) Make pmap_align_tlb() on MIPS align addresses so that they do not start on an odd boundary within the TLB, so that they are suitable for insertion as wired entries and do not have to share a TLB entry with another mapping, assuming they are appropriately-sized.
o) Eliminate md_realstack now that the kstack will be appropriately-aligned on MIPS.
o) Increase the number of guard pages to 2 so that we retain the proper alignment of the kstack address.
Reviewed by: [1] alc X-MFC-after: Making sure alc has not come up with a better interface.
|
#
206142 |
|
03-Apr-2010 |
alc |
Make _vm_map_init() the one place where the vm map's pmap field is initialized.
Reviewed by: kib
|
#
199868 |
|
27-Nov-2009 |
alc |
Simplify the invocation of vm_fault(). Specifically, eliminate the flag VM_FAULT_DIRTY. The information provided by this flag can be trivially inferred by vm_fault().
Discussed with: kib
|
#
199490 |
|
18-Nov-2009 |
alc |
Simplify both the invocation and the implementation of vm_fault() for wiring pages.
(Note: Claims made in the comments about the handling of breakpoints in wired pages have been false for roughly a decade. This and another bug involving breakpoints will be fixed in coming changes.)
Reviewed by: kib
|
#
194766 |
|
23-Jun-2009 |
kib |
Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
The interface of vm_pager_allocate(9) is extended by adding a struct ucred *, which is used to charge the appropriate uid when an allocation is performed by the kernel, e.g. md(4).
Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced.
In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
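A hedged sketch of the new rlimit from userland (the 512 MB cap is an arbitrary example value):

    #include <sys/types.h>
    #include <sys/resource.h>
    #include <err.h>

    int
    main(void)
    {
            struct rlimit rl;

            if (getrlimit(RLIMIT_SWAP, &rl) != 0)
                    err(1, "getrlimit");
            rl.rlim_cur = 512UL * 1024 * 1024;      /* cap reserved swap */
            if (setrlimit(RLIMIT_SWAP, &rl) != 0)
                    err(1, "setrlimit");
            return (0);
    }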
|
#
190886 |
|
10-Apr-2009 |
kib |
When vm_map_wire(9) is allowed to skip holes in the wired region, skip mappings that lack both read and execute rights, in particular the PROT_NONE entries. This makes mlockall(2) work for a process address space that has such mappings.
Since protection mode of the entry may change between setting MAP_ENTRY_IN_TRANSITION and final pass over the region that records the wire status of the entries, allocate new map entry flag MAP_ENTRY_WIRE_SKIPPED to mark the skipped PROT_NONE entries.
Reported and tested by: Hans Ottevanger <fbsdhackers beasties demon nl> Reviewed by: alc MFC after: 3 weeks
|
#
189015 |
|
24-Feb-2009 |
kib |
Revert the addition of the freelist argument for the vm_map_delete() function, done in r188334. Instead, collect the entries that shall be freed, in the deferred_freelist member of the map. Automatically purge the deferred freelist when map is unlocked.
Tested by: pho Reviewed by: alc
|
#
188334 |
|
08-Feb-2009 |
kib |
Do not call vm_object_deallocate() from vm_map_delete(), because we hold the map lock there, and might need the vnode lock for OBJT_VNODE objects. Postpone object deallocation until caller of vm_map_delete() drops the map lock. Link the map entries to be freed into the freelist, that is released by the new helper function vm_map_entry_free_freelist().
Reviewed by: tegge, alc Tested by: pho
|
#
186665 |
|
31-Dec-2008 |
alc |
Resurrect shared map locks allowing greater concurrency during some map operations, such as page faults.
An earlier version of this change was ...
Reviewed by: kib Tested by: pho MFC after: 6 weeks
|
#
186633 |
|
31-Dec-2008 |
alc |
Update or eliminate some stale comments.
|
#
178928 |
|
10-May-2008 |
alc |
Generalize vm_map_find(9)'s parameter "find_space". Specifically, add support for VMFS_ALIGNED_SPACE, which requests the allocation of an address range best suited to superpages. The old options TRUE and FALSE are mapped to VMFS_ANY_SPACE and VMFS_NO_SPACE, so that there is no immediate need to update all of vm_map_find(9)'s callers.
While I'm here, correct a misstatement about vm_map_find(9)'s return values in the man page.
|
#
178630 |
|
28-Apr-2008 |
alc |
vm_map_fixed(), unlike vm_map_find(), does not update "addr", so it can be passed by value.
|
#
176717 |
|
01-Mar-2008 |
marcel |
Make the vm_pmap field of struct vmspace the last field in the structure. This allows per-CPU variations of struct pmap on a single architecture without affecting the machine-independent fields. As such, the PMAP variations don't affect the ABI. They become part of it.
|
#
173429 |
|
07-Nov-2007 |
pjd |
Change unused 'user_wait' argument to 'timo' argument, which will be used to specify timeout for msleep(9).
Discussed with: alc Reviewed by: alc
|
#
171902 |
|
20-Aug-2007 |
kib |
Do not drop vm_map lock between doing vm_map_remove() and vm_map_insert(). For this, introduce vm_map_fixed() that does that for MAP_FIXED case.
Dropping the lock allowed a parallel thread to occupy the freed space.
Reported by: Tijl Coosemans <tijl ulyssis org> Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
#
159054 |
|
29-May-2006 |
tegge |
Close race between vmspace_exitfree() and exit1() and races between vmspace_exitfree() and vmspace_free() which could result in the same vmspace being freed twice.
Factor out part of exit1() into new function vmspace_exit(). Attach to vmspace0 to allow old vmspace to be freed earlier.
Add new function, vmspace_acquire_ref(), for obtaining a vmspace reference for a vmspace belonging to another process. Avoid changing vmspace refcount from 0 to 1 since that could also lead to the same vmspace being freed twice.
Change vmtotal() and swapout_procs() to use vmspace_acquire_ref().
Reviewed by: alc
|
#
153068 |
|
03-Dec-2005 |
alc |
Eliminate unneeded preallocation at initialization.
Reviewed by: tegge
|
#
139825 |
|
07-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
133636 |
|
13-Aug-2004 |
alc |
Replace the linear search in vm_map_findspace() with an O(log n) algorithm built into the map entry splay tree. This replaces the first_free hint in struct vm_map with two fields in vm_map_entry: adj_free, the amount of free space following a map entry, and max_free, the maximum amount of free space in the entry's subtree. These fields make it possible to find a first-fit free region of a given size in one pass down the tree, so O(log n) amortized using splay trees.
This significantly reduces the overhead in vm_map_findspace() for applications that mmap() many hundreds or thousands of regions, and has a negligible slowdown (0.1%) on buildworld. See, for example, the discussion of a micro-benchmark titled "Some mmap observations compared to Linux 2.6/OpenBSD" on -hackers in late October 2003.
OpenBSD adopted this approach in March 2002, and NetBSD added it in November 2003, both with Red-Black trees.
Submitted by: Mark W. Krentel
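A sketch of the first-fit descent the commit describes, with hypothetical types and field names (the real code lives in the vm_map_entry splay tree): a subtree is entered only if its max_free can satisfy the request, so one pass down the tree suffices.

    struct map_entry {
            struct map_entry *left, *right; /* lower / higher addresses */
            unsigned long adj_free;         /* free space after this entry */
            unsigned long max_free;         /* max free space in subtree */
    };

    static struct map_entry *
    first_fit(struct map_entry *e, unsigned long length)
    {
            while (e != NULL && e->max_free >= length) {
                    if (e->left != NULL && e->left->max_free >= length)
                            e = e->left;    /* a lower-addressed gap fits */
                    else if (e->adj_free >= length)
                            return (e);     /* the gap after e fits */
                    else
                            e = e->right;   /* the fit must be to the right */
            }
            return (NULL);
    }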
|
#
133598 |
|
12-Aug-2004 |
tegge |
The vm map lock is needed in vm_fault() after the page has been found, to avoid later changes before pmap_enter() and vm_fault_prefault() have completed.
Simplify deadlock avoidance by not blocking on vm map relookup.
In collaboration with: alc
|
#
133401 |
|
09-Aug-2004 |
green |
Revamp VM map wiring.
* Allow no-fault wiring/unwiring to succeed for consistency; however, the wired count remains at zero, so it's a special case.
* Fix issues inside vm_map_wire() and vm_map_unwire() where the exact state of user wiring (one or zero) and system wiring (zero or more) could be confused; for example, system unwiring could succeed in removing a user wire, instead of being an error.
* Require all mappings to be unwired before they are deleted. When VM space is still wired upon deletion, it will be waited upon for the following unwire. This makes vslock(9) work rather than allowing kernel-locked memory to be deleted out from underneath of its consumer as it would before.
|
#
132880 |
|
30-Jul-2004 |
mux |
Get rid of another lockmgr(9) consumer by using sx locks for the user maps. We always acquire the sx lock exclusively here, but we can't use a mutex because we want to be able to sleep while holding the lock. This is completely equivalent to what we were doing with the lockmgr(9) locks before.
Approved by: alc
|
#
132593 |
|
24-Jul-2004 |
alc |
Simplify vmspace initialization. The bcopy() of fields from the old vmspace to the new vmspace in vmspace_exec() is mostly wasted effort. With one exception, vm_swrss, the copied fields are immediately overwritten. Instead, initialize these fields to zero in vmspace_alloc(), eliminating a bcopy() from vmspace_exec() and a bzero() from vmspace_fork().
|
#
131719 |
|
06-Jul-2004 |
alc |
Micro-optimize vmspace for 64-bit architectures: Colocate vm_refcnt and vm_exitingcnt so that alignment does not result in wasted space.
|
#
131152 |
|
26-Jun-2004 |
alc |
Remove an unused field from the vmspace structure.
|
#
128596 |
|
24-Apr-2004 |
alc |
In cases where a file was resident in memory mmap(..., PROT_NONE, ...) would actually map the file with read access enabled. According to http://www.opengroup.org/onlinepubs/007904975/functions/mmap.html this is an error. Similarly, an madvise(..., MADV_WILLNEED) would enable read access on a virtual address range that was PROT_NONE.
The solution implemented herein is (1) to pass a vm_prot_t to vm_map_pmap_enter() describing the allowed access and (2) to make vm_map_pmap_enter() responsible for understanding the limitations of pmap_enter_quick().
Submitted by: "Mark W. Krentel" <krentel@dreamscape.com> PR: kern/64573
|
#
127961 |
|
06-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
Approved by: core
|
#
126865 |
|
11-Mar-2004 |
peter |
Part 2 of rev 1.68. Update comment to match reality now that vm_endcopy exists and we no longer copy to the end of the struct.
Forgotten by: alfred and green
|
#
122349 |
|
09-Nov-2003 |
alc |
- Rename vm_map_clean() to vm_map_sync(). This better reflects the fact that msync(2) is its only caller.
- Migrate the parts of the old vm_map_clean() that examined the internals of a vm object to a new function vm_object_sync() that is implemented in vm_object.c. At the same time, introduce the necessary vm object locking so that vm_map_sync() and vm_object_sync() can be called without Giant.
Reviewed by: tegge
|
#
121962 |
|
03-Nov-2003 |
des |
Whitespace cleanup.
|
#
120831 |
|
05-Oct-2003 |
bms |
Move pmap_resident_count() from the MD pmap.h to the MI pmap.h. Add a definition of pmap_wired_count(). Add a definition of vmspace_wired_count().
Reviewed by: truckman Discussed with: peter
|
#
120531 |
|
27-Sep-2003 |
marcel |
Part 2 of implementing rstacks: add the ability to create rstacks and use the ability on ia64 to map the register stack. The orientation of the stack (i.e. its grow direction) is passed to vm_map_stack() in the overloaded cow argument. Since the grow direction is represented by bits, it is possible and allowed to create bi-directional stacks. This is not an advertised feature, more of a side-effect.
Fix a bug in vm_map_growstack() that's specific to rstacks and which we could only find by having the ability to create rstacks: when the mapped stack ends at the faulting address, we have not actually mapped the faulting address. We need to include or cover the faulting address.
Note that at this time mmap(2) has not been extended to allow the creation of rstacks by processes. If such a need arises, this can be done.
Tested on: alpha, i386, ia64, sparc64
|
#
119595 |
|
30-Aug-2003 |
marcel |
Introduce MAP_ENTRY_GROWS_DOWN and MAP_ENTRY_GROWS_UP to allow for growable (stack) entries that not only grow down, but also grow up. Have vm_map_growstack() take these flags into account when growing an entry.
This is the first step in adding support for upward growable stacks. It is a required feature on ia64 to support the register stack (or rstack as I like to call it -- it also means reverse stack). We do not currently create rstacks, so the upward growing is not exercised and the change should be a functional no-op.
Reviewed by: alc
|
#
118852 |
|
13-Aug-2003 |
alc |
Reduce the size of the vm map (and by inclusion the vm space) on 64-bit architectures by moving a field within the structure.
|
#
118771 |
|
11-Aug-2003 |
bms |
Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are* necessary. This needed clarification; the stub code generation for mlockall() was disabled, which would prevent applications from linking to this API (suggested by mux).
- Giant has been quashed. It is no longer held by the code, as the required locking has been pushed down within vm_map.c.
- Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES to express their intention explicitly.
- Inspected at the vmstat, top and vm pager sysctl stats level. Paging-in activity is occurring correctly, using a test harness.
- The RES size for a process may appear to be greater than its SIZE. This is believed to be due to mappings of the same shared library page being wired twice. Further exploration is needed.
- Believed to back out of allocations and locks correctly (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).
PR: kern/43426, standards/54223 Reviewed by: jake, alc Approved by: jake (mentor) MFC after: 2 weeks
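Minimal intended usage of the new system calls (a sketch; requires sufficient privilege and memlock resource limits):

    #include <sys/mman.h>
    #include <err.h>

    int
    main(void)
    {
            if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
                    err(1, "mlockall");
            /* ... run with all current and future pages wired ... */
            if (munlockall() != 0)
                    err(1, "munlockall");
            return (0);
    }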
|
#
117047 |
|
29-Jun-2003 |
alc |
Introduce vm_map_pmap_enter(). Presently, this is a stub calling the MD pmap_object_init_pt().
|
#
116387 |
|
15-Jun-2003 |
alc |
Remove an unnecessary forward declaration.
|
#
112167 |
|
12-Mar-2003 |
das |
- When the VM daemon is out of swap space and looking for a process to kill, don't block on a map lock while holding the process lock. Instead, skip processes whose map locks are held and find something else to kill.
- Add vm_map_trylock_read() to support the above.
Reviewed by: alc, mike (mentor)
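The same skip-on-contention pattern, shown with portable pthreads rather than the kernel's map lock (illustration only; names are hypothetical):

    #include <pthread.h>

    static pthread_rwlock_t map_lock = PTHREAD_RWLOCK_INITIALIZER;

    /* Returns 1 if the candidate was examined, 0 if it was skipped. */
    static int
    examine_candidate(void)
    {
            if (pthread_rwlock_tryrdlock(&map_lock) != 0)
                    return (0);     /* held elsewhere: skip, don't block */
            /* ... size up the victim while holding the read lock ... */
            pthread_rwlock_unlock(&map_lock);
            return (1);
    }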
|
#
111937 |
|
06-Mar-2003 |
alc |
Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on: arch@
|
#
108518 |
|
31-Dec-2002 |
alc |
Add a needed #include.
Reported by: ia64 tinderbox
|
#
108515 |
|
31-Dec-2002 |
alc |
Implement a variant locking scheme for vm maps: Access to system maps is now synchronized by a mutex, whereas access to user maps is still synchronized by a lockmgr()-based lock. Why? No single type of lock, including sx locks, meets the requirements of both types of vm map. Sometimes we sleep while holding the lock on a user map. Thus, a mutex isn't appropriate. On the other hand, both lockmgr()-based and sx locks release Giant when a thread/process blocks during contention for a lock. This could lead to a race condition in a legacy driver (that relies on Giant for synchronization) if it attempts to kmem_malloc() and fails to immediately obtain the lock. Fortunately, we never sleep while holding a system map lock.
|
#
107912 |
|
15-Dec-2002 |
dillon |
Fix a refcount race with the vmspace structure. In order to prevent resource starvation we clean-up as much of the vmspace structure as we can when the last process using it exits. The rest of the structure is cleaned up when it is reaped. But since exit1() decrements the ref count it is possible for a double-free to occur if someone else, such as the process swapout code, references and then dereferences the structure. Additionally, the final cleanup of the structure should not occur until the last process referencing it is reaped.
This commit solves the problem by introducing a secondary reference count, called 'vm_exitingcnt'. The normal reference count is decremented on exit and vm_exitingcnt is incremented. vm_exitingcnt is decremented when the process is reaped. When both vm_exitingcnt and vm_refcnt are 0, the structure is freed for real.
MFC after: 3 weeks
|
#
103777 |
|
22-Sep-2002 |
alc |
o Update some comments.
|
#
100512 |
|
22-Jul-2002 |
alfred |
Change struct vmspace->vm_shm from void * to struct shmmap_state *, this removes the need for casts in several cases.
|
#
100511 |
|
22-Jul-2002 |
alfred |
Remove caddr_t.
|
#
99754 |
|
11-Jul-2002 |
alc |
o Add a "needs wakeup" flag to the vm_map for use by kmem_alloc_wait() and kmem_free_wakeup(). Previously, kmem_free_wakeup() always called wakeup(). In general, no one was sleeping. o Export vm_map_unlock_and_wait() and vm_map_wakeup() from vm_map.c for use in vm_kern.c.
|
#
98818 |
|
25-Jun-2002 |
alc |
o Eliminate vmspace::vm_minsaddr. It's initialized but never used.
o Replace stale comments in vmspace by "const until freed" annotations on some fields.
|
#
98226 |
|
14-Jun-2002 |
alc |
o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and vm_map_user_pageable().
o Remove vm_map_pageable() and vm_map_user_pageable().
o Remove vm_map_clear_recursive() and vm_map_set_recursive(). (They were only used by vm_map_pageable() and vm_map_user_pageable().)
Reviewed by: tegge
|
#
98036 |
|
08-Jun-2002 |
alc |
o Remove an unnecessary call to vm_map_wakeup() from vm_map_unwire().
o Add a stub for vm_map_wire().
Note: the description of the previous commit had an error. The in-transition flag actually blocks the deallocation of a vm_map_entry by vm_map_delete() and vm_map_simplify_entry().
|
#
98022 |
|
07-Jun-2002 |
alc |
o Add vm_map_unwire() for unwiring contiguous regions of either kernel or user vm_maps. In accordance with the standards for munlock(2), and in contrast to vm_map_user_pageable(), this implementation does not allow holes in the specified region. This implementation uses the "in transition" flag described below.
o Introduce a new flag, "in transition," to the vm_map_entry. Eventually, vm_map_delete() and vm_map_simplify_entry() will respect this flag by deallocating in-transition vm_map_entrys, allowing the vm_map lock to be safely released in vm_map_unwire() and (the forthcoming) vm_map_wire().
o Modify vm_map_simplify_entry() to respect the in-transition flag.
In collaboration with: tegge
|
#
97727 |
|
01-Jun-2002 |
alc |
o Remove GIANT_REQUIRED from vm_map_zfini(), vm_map_zinit(), vm_map_create(), and vm_map_submap().
o Make further use of a local variable in vm_map_entry_splay() that caches a reference to one of a vm_map_entry's children. (This reduces code size somewhat.)
o Revert a part of revision 1.66, deinlining vmspace_pmap(). (This function is MPSAFE.)
|
#
97710 |
|
01-Jun-2002 |
alc |
o Revert a part of revision 1.66; contrary to what that commit message says, deinlining vm_map_entry_behavior() and vm_map_entry_set_behavior() actually increases the kernel's size.
o Make vm_map_entry_set_behavior() static and add a comment describing its purpose.
o Remove an unnecessary initialization statement from vm_map_entry_splay().
|
#
97198 |
|
23-May-2002 |
alc |
o Replace the vm_map's hint by the root of a splay tree. By design, the last accessed datum is moved to the root of the splay tree. Therefore, on lookups in which the hint resulted in O(1) access, the splay tree still achieves O(1) access. In contrast, on lookups in which the hint failed miserably, the splay tree achieves amortized logarithmic complexity, resulting in dramatic improvements on vm_maps with a large number of entries. For example, the execution time for replaying an access log from www.cs.rice.edu against the thttpd web server was reduced by 23.5% due to the large number of files simultaneously mmap()ed by this server. (The machine in question has enough memory to cache most of this workload.)
Nothing comes for free: At present, I see a 0.2% slowdown on "buildworld" due to the overhead of maintaining the splay tree. I believe that some or all of this can be eliminated through optimizations to the code.
Developed in collaboration with: Juan E Navarro <jnavarro@cs.rice.edu> Reviewed by: jeff
|
#
96096 |
|
06-May-2002 |
alc |
o Header files shouldn't depend on options: Provide prototypes for uiomoveco(), uioread(), and vm_uiomove() regardless of whether ENABLE_VFS_IOOPT is defined or not.
Submitted by: bde
|
#
96087 |
|
05-May-2002 |
alc |
o Move vm_freeze_copyopts() from vm_map.{c,h} to vm_object.{c,h}. It's plainly an operation on a vm_object and belongs in the latter place.
|
#
96080 |
|
05-May-2002 |
alc |
o Condition the compilation of uiomoveco() and vm_uiomove() on ENABLE_VFS_IOOPT.
o Add a comment to the effect that this code is experimental support for zero-copy I/O.
|
#
95901 |
|
02-May-2002 |
alc |
o Remove dead and lockmgr()-specific debugging code.
|
#
95686 |
|
28-Apr-2002 |
alc |
Pass the caller's file name and line number to the vm_map locking functions.
|
#
95610 |
|
28-Apr-2002 |
alc |
o Introduce and use vm_map_trylock() to replace several direct uses of lockmgr().
o Add missing synchronization to vmspace_swap_count(): Obtain a read lock on the vm_map before traversing it.
|
#
95589 |
|
27-Apr-2002 |
alc |
o Begin documenting the (existing) locking protocol on the vm_map in the same style as sys/proc.h.
o Undo the de-inlining of several trivial, MPSAFE methods on the vm_map. (Contrary to the commit message for vm_map.h revision 1.66 and vm_map.c revision 1.206, de-inlining these methods increased the kernel's size.)
|
#
94912 |
|
17-Apr-2002 |
alc |
Remove an unused option, VM_FAULT_HOLD, to vm_fault().
|
#
92654 |
|
19-Mar-2002 |
jeff |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator.
Reviewed by: arch@
|
#
92588 |
|
18-Mar-2002 |
green |
Back out the modification of vm_map locks from lockmgr to sx locks. The best path forward now is likely to change the lockmgr locks to simple sleep mutexes, then see if any extra contention it generates is greater than removed overhead of managing local locking state information, cost of extra calls into lockmgr, etc.
Additionally, making the vm_map lock a mutex and respecting it properly will put us much closer to not needing Giant magic in vm.
|
#
92246 |
|
13-Mar-2002 |
green |
Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate. While doing this, move it earlier in the sysinit boot process so that the VM system can use it.
After that, the system is now able to use sx locks instead of lockmgr locks in the VM system. To accomplish this, some of the more questionable uses of the locks (such as testing whether they are owned or not, as well as allowing shared+exclusive recursion) are removed, and simpler logic throughout is used so locks should also be easier to understand.
This has been tested on my laptop for months, and has not shown any problems on SMP systems, either, so appears quite safe. One more user of lockmgr down, many more to go :)
|
#
92029 |
|
10-Mar-2002 |
eivind |
- Remove a number of extra newlines that do not belong here according to style(9)
- Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc.
- Add /* SYMBOL */ after a few #endifs.
Reviewed by: alc
|
#
90263 |
|
05-Feb-2002 |
alfred |
Fix a race with free'ing vmspaces at process exit when vmspaces are shared.
Also introduce vm_endcopy instead of using pointer tricks when initializing new vmspaces.
The race occurred because of how the reference was utilized: test vmspace reference, possibly block, decrement reference
When sharing a vmspace between multiple processes it was possible for two processes exiting at the same time to test the reference count, possibly block and neither one free because they wouldn't see the other's update.
Submitted by: green
|
#
85762 |
|
31-Oct-2001 |
dillon |
Don't let pmap_object_init_pt() exhaust all available free pages (allocating pv entries w/ zalloci) when called in a loop due to an madvise(). It is possible to completely exhaust the free page list and cause a system panic when an expected allocation fails.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2. Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry John! (your next MFC will be a doozie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
79248 |
|
04-Jul-2001 |
dillon |
Change inlines back into mainline code in preparation for mutexing. Also, most of these inlines had been bloated in -current far beyond their original intent. Normalize prototypes and function declarations to be ANSI only (half already were). And do some general cleanup.
(kernel size also reduced by 50-100K, but that isn't the prime intent)
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
77948 |
|
09-Jun-2001 |
dillon |
Two fixes to the out-of-swap process termination code. First, start killing processes a little earlier to avoid a deadlock. Second, when calculating the 'largest process' do not just count RSS. Instead count the RSS + SWAP used by the process. Without this the code tended to kill small inconsequential processes like, oh, sshd, rather than one of the many 'eatmem 200MB' I run on a whim :-). This fix has been extensively tested on -stable and somewhat tested on -current and will be MFCd in a few days.
Shamed into fixing this by: ps
|
#
76827 |
|
18-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
Faults cannot be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
#
76244 |
|
03-May-2001 |
markm |
Putting sys/lockmgr.h in here allows us to depollute userland includes a bit. OK'ed by: bde
|
#
75644 |
|
18-Apr-2001 |
alfred |
Fix the botched rev 1.59 where I made it such that without INVARIANTS the map is never locked.
Submitted by: tegge
|
#
75473 |
|
13-Apr-2001 |
alfred |
use %p for pointer printf, include sys/systm.h for printf proto
|
#
75462 |
|
13-Apr-2001 |
alfred |
Use a macro wrapper over printf along with KASSERT to reduce the amount of code here.
|
#
74237 |
|
14-Mar-2001 |
dillon |
Fix a lock reversal problem in the VM subsystem related to threaded programs. There is a case during a fork() which can cause a deadlock.
From Tor: the workaround consists of setting a flag in the vm map that indicates that a fork is in progress, and using that mark in the page fault handling to force a revalidation failure. That change will only affect (pessimize) page fault handling during fork for threaded (linuxthreads style) applications and applications using aio_*().
Submitted by: tegge
|
#
72200 |
|
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
Similarly, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
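The shape of the change at a call site, sketched from the description above (foo_mtx and sched_mtx are hypothetical; initialization elided):

    /* Before: the lock type accompanied every acquire/release. */
    mtx_enter(&foo_mtx, MTX_DEF);
    /* ... critical section ... */
    mtx_exit(&foo_mtx, MTX_DEF);

    /* After: the type is implied by the lock; spin locks get their
       own entry points. */
    mtx_lock(&foo_mtx);
    /* ... critical section ... */
    mtx_unlock(&foo_mtx);

    mtx_lock_spin(&sched_mtx);
    /* ... critical section ... */
    mtx_unlock_spin(&sched_mtx);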
|
#
67046 |
|
12-Oct-2000 |
jasone |
For lockmgr mutex protection, use an array of mutexes that are allocated and initialized during boot. This avoids bloating sizeof(struct lock). As a side effect, it is no longer necessary to enforce the assumption that lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been removed.
Idea taken from: BSD/OS.
|
#
66615 |
|
03-Oct-2000 |
jasone |
Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
|
#
57550 |
|
28-Feb-2000 |
ps |
Add MAP_NOCORE to mmap(2), and MADV_NOCORE and MADV_CORE to madvise(2). This feature allows you to specify whether mmap'd data is included in an application's corefile.
Change the type of eflags in struct vm_map_entry from u_char to vm_eflags_t (an unsigned int).
Reviewed by: dillon,jdp,alfred Approved by: jkh
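A hedged example of excluding a scratch region from core dumps with the new flags:

    #include <sys/mman.h>
    #include <stddef.h>

    void *
    map_no_core(size_t len)
    {
            /* Equivalent after the fact: madvise(p, len, MADV_NOCORE). */
            return (mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_PRIVATE | MAP_NOCORE, -1, 0));
    }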
|
#
55206 |
|
29-Dec-1999 |
peter |
Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSD's who made this change quite some time ago. More commits to come.
|
#
54467 |
|
12-Dec-1999 |
dillon |
Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to madvise().
This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis.
MAP_NOSYNC allows one to use mmap() to share memory between processes without incurring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory.
Reviewed by: julian, alc, dg
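A sketch of the intended use (fd is assumed to be an open file of at least len bytes):

    #include <sys/mman.h>
    #include <stddef.h>

    void *
    map_shared_nosync(int fd, size_t len)
    {
            /* Dirty pages here are not gratuitously flushed by the
               update daemon, yet remain coherent with the file. */
            return (mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_NOSYNC, fd, 0));
    }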
|
#
51493 |
|
21-Sep-1999 |
dillon |
cleanup madvise code, add a few more sanity checks.
Reviewed by: Alan Cox <alc@cs.rice.edu>, dg@root.com
|
#
51342 |
|
17-Sep-1999 |
dillon |
Add 'lastr' field to vm_map_entry in preparation for its removal from the vnode. (The changeover is undergoing final testing and will be committed soon).
Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
50248 |
|
23-Aug-1999 |
alc |
Correct the inconsistent formatting in struct vm_map.
Addendum to rev 1.47: submitted by dillon.
|
#
50247 |
|
23-Aug-1999 |
alc |
struct vm_map: The lock structure cannot be the first element of the vm_map because this can result in livelock between two or more system processes trying to kmem_alloc_wait.
|
#
49998 |
|
18-Aug-1999 |
mjacob |
Fix breakage - an extra brace got inserted where DIAGNOSTIC was defined but MAP_LOCK_DIAGNOSTIC wasn't.
|
#
49900 |
|
16-Aug-1999 |
alc |
vm_map_lock*: Remove semicolons or add "do { } while (0)" as necessary to enable the use of these macros in arbitrary statements. (There are no functional changes.)
Submitted by: dillon
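The general idiom being applied (a generic illustration with hypothetical names, not the actual vm_map macros): wrapping a multi-statement macro in do { } while (0) makes it act as a single statement, so it composes safely with unbraced if/else.

    #define MAP_LOCK_EXAMPLE(map) do {              \
            lock_acquire(&(map)->lock);             \
            (map)->timestamp++;                     \
    } while (0)

    /* Safe even without braces: */
    if (need_lock)
            MAP_LOCK_EXAMPLE(map);
    else
            record_skip();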
|
#
49338 |
|
01-Aug-1999 |
alc |
Move the memory access behavior information provided by madvise from the vm_object to the vm_map.
Submitted by: dillon
|
#
48736 |
|
10-Jul-1999 |
alc |
Remove unused function prototypes.
|
#
48022 |
|
19-Jun-1999 |
alc |
Remove some unused function and variable declarations.
|
#
47258 |
|
16-May-1999 |
alc |
Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert, eliminating the need for the pmap_object_init_pt calls in imgact_* and mmap.
Reviewed by: David Greenman <dg@root.com>
|
#
47243 |
|
16-May-1999 |
alc |
Remove prototypes for functions that don't exist anymore (vm_map.h).
Remove a useless argument from vm_map_madvise's interface (vm_map.c, vm_map.h, and vm_mmap.c).
Remove a redundant test in vm_uiomove (vm_map.c).
Make two changes to vm_object_coalesce:
1. Determine whether the new range of pages actually overlaps the existing object's range of pages before calling vm_object_page_remove. (Prior to this change almost 90% of the calls to vm_object_page_remove were to remove pages that were beyond the end of the object.)
2. Free any swap space allocated to removed pages.
|
#
47207 |
|
14-May-1999 |
alc |
Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option.
It never makes sense to specify MAP_COPY_NEEDED without also specifying MAP_COPY_ON_WRITE, and vice versa. Thus, MAP_COPY_ON_WRITE suffices.
Reviewed by: David Greenman <dg@root.com>
|
#
44513 |
|
06-Mar-1999 |
alc |
Upgrading a map's lock to exclusive status should increment the map's timestamp. In general, whenever an exclusive lock is acquired the timestamp should be incremented.
|
#
44396 |
|
02-Mar-1999 |
alc |
Remove the last of the share map code: struct vm_map::is_main_map.
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
|
#
44146 |
|
19-Feb-1999 |
luoqi |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper.
Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillon <dillon@apollo.backplane.com>
|
#
43748 |
|
07-Feb-1999 |
dillon |
Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.
|
#
43209 |
|
26-Jan-1999 |
julian |
Mostly remove the VM_STACK OPTION. This changes the definitions of a few items so that structures are the same whether or not the option itself is enabled. This allows people to enable and disable the option without recompiling the world.
As the author says:
|I ran into a problem pulling out the VM_STACK option. I was aware of this
|when I first did the work, but then forgot about it. The VM_STACK stuff
|has some code changes in the i386 branch. There need to be corresponding
|changes in the alpha branch before it can come out completely.
what is done:
|1) Pull the VM_STACK option out of the header files it appears in. This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches. The vm_map_entry will then always have one
|extra element (avail_ssize). It just won't be used if the VM_STACK
|option is not turned on.
|I've also pulled the option out of vm_map.c. This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|2) Add what appears to be appropriate code to the alpha branch, still
|protected behind the VM_STACK switch. I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
[..]
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
Submitted by: "Richard Seaman, Jr." <dick@tar.com>
|
#
42360 |
|
06-Jan-1999 |
julian |
Add (but don't activate) code for a special VM option to make downward growing stacks more general. Add (but don't activate) code to use the new stack facility when running threads, (specifically the linux threads support). This allows people to use both linux compiled linuxthreads, and also the native FreeBSD linux-threads port.
The code is conditional on VM_STACK. Not using this will produce the old heavily tested system.
Submitted by: Richard Seaman <dick@tar.com>
|
#
32702 |
|
22-Jan-1998 |
dyson |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous.
2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.)
13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
#
32585 |
|
17-Jan-1998 |
dyson |
Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management.
The system should be much more stable, but all subtle bugs aren't fixed yet.
|
#
32286 |
|
06-Jan-1998 |
dyson |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does.
When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex.
Vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore.
A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes.
Please be careful with your system while running this code, but I would greatly appreciate feedback as soon as reasonably possible.
|
#
31853 |
|
19-Dec-1997 |
dyson |
Some performance improvements, and code cleanups (including changing our expensive OFF_TO_IDX to btoc whenever possible.)
|
#
28345 |
|
18-Aug-1997 |
dyson |
Fix kern_lock so that it will work. Additionally, clean-up some of the VM systems usage of the kernel lock (lockmgr) code. This is a first pass implementation, and is expected to evolve as needed. The API for the lock manager code has not changed, but the underlying implementation has changed significantly. This change should not materially affect our current SMP or UP code without non-standard parameters being used.
|
#
27899 |
|
04-Aug-1997 |
dyson |
Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of a simple, clean zone type allocator. This new allocator will also be used for machine dependent pmap PV entries.
|
#
24691 |
|
07-Apr-1997 |
peter |
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!)
Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit.
Context switch the tss_esp0 pointer in the common tss. This is a lot simpler than switching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset.
The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS).
Simplify the pmap code to manage process contexts; we no longer have to double map the UPAGES, which simplifies and should measurably speed up fork().
The following parts came from John Dyson:
Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out.
Move the upages object (upobj) from the vmspace to the proc structure.
Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
|
#
24666 |
|
06-Apr-1997 |
dyson |
Fix the gdb executable modify problem. Thanks to the detective work by Alan Cox <alc@cs.rice.edu>, and his description of the problem.
The bug was primarily in procfs_mem, but the mistake likely happened due to the lack of vm system support for the operation. I added better support for selective marking of page dirty flags so that vm_map_pageable(wiring) will not cause this problem again.
The code in procfs_mem is now less bogus (but maybe still a little so.)
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22878 |
|
18-Feb-1997 |
bde |
Removed vestiges of Mach lock types.
vm_map.h: Removed #include of <sys/proc.h>. curproc is only used in some macros and users of the macros already include <sys/proc.h>.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
21754 |
|
16-Jan-1997 |
dyson |
Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
20993 |
|
28-Dec-1996 |
dyson |
Eliminate the redundancy due to the similarity between the routines vm_map_simplify and vm_map_simplify_entry. Make vm_map_simplify_entry handle wired maps so that we can get rid of vm_map_simplify. Modify the callers of vm_map_simplify to properly use vm_map_simplify_entry. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
20449 |
|
14-Dec-1996 |
dyson |
Implement closer-to POSIX mlock semantics. The major difference is that we do allow mlock to span unallocated regions (of course, not mlocking them.) We also allow mlocking of RO regions (which the old code couldn't.) The restriction there is that once a RO region is wired (mlocked), it cannot be debugged (or EVER written to.)
Under normal usage, the new mlock code will be a significant improvement over our old stuff.
|
#
20182 |
|
06-Dec-1996 |
dyson |
Make vm_map_insert much more intelligent in the MAP_NOFAULT case so that map entries are coalesced when appropriate. Also, conditionalize some code that is currently not used in vm_map_insert. This mod has been added to eliminate unnecessary map entries in buffer map.
Additionally, there were some cases where map coalescing could be done when it shouldn't. That problem has been resolved.
|
#
20054 |
|
30-Nov-1996 |
dyson |
Implement a new totally dynamic (up to MAXPHYS) buffer kva allocation scheme. Additionally, add the capability for checking for unexpected kernel page faults. The maximum amount of kva space for buffers hasn't been decreased from where it is, but it will now be possible to do so.
This scheme manages the kva space similar to the buffers themselves. If there isn't enough kva space because of usage or fragmentation, buffers will be reclaimed until a buffer allocation is successful. This scheme should be very resistant to fragmentation problems until/if the LFS code is fixed and uses the bogus buffer locking scheme -- but a 'fixed' LFS is not likely to use such a scheme.
Now there should be NO problem allocating buffers up to MAXPHYS.
|
#
17334 |
|
30-Jul-1996 |
dyson |
Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
|
#
17294 |
|
27-Jul-1996 |
dyson |
This commit is meant to solve a couple of VM system problems or performance issues.
1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines.
2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations.
3) Removal of some bogus performance improvements that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt performance on 386 processors (not that people who worry about perf use 386 processors anymore :-)).
4) Changed pv queue manipulations/structures to be TAILQ's.
5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one.
6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation. (A sketch of this change follows the list.)
7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
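The shape of the change in point 6, with illustrative names rather than the exact historical signatures:

    typedef struct vm_page *vm_page_t;

    int  pmap_test_modified(unsigned long pa);   /* old: one pv-list walk */
    void pmap_clear_modified(unsigned long pa);  /* old: a second walk */
    int  pmap_tc_modified(vm_page_t m);          /* new: one combined walk */

    void
    laundry_example(unsigned long pa, vm_page_t m)
    {
            /* Before: test-then-clear walked the pv list twice per page. */
            if (pmap_test_modified(pa))
                    pmap_clear_modified(pa);

            /* After: a single test-and-clear traversal; taking a
             * vm_page_t also avoids a PHYS_TO_VM_PAGE lookup. */
            if (pmap_tc_modified(m)) {
                    /* page was dirty; schedule it for laundering */
            }
    }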
|
#
15819 |
|
19-May-1996 |
dyson |
Initial support for mincore and madvise. Both are almost fully supported, except madvise does not page in with MADV_WILLNEED, and MADV_DONTNEED doesn't force dirty pages out.
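A userland sketch of the two interfaces as they are meant to be used (standard calls; note the caveats above about MADV_WILLNEED and MADV_DONTNEED in this initial version):

    #include <sys/mman.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main(void)
    {
            long pgsz = sysconf(_SC_PAGESIZE);
            size_t len = 16 * pgsz;
            char *p, *vec;

            p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_PRIVATE, -1, 0);
            if (p == MAP_FAILED)
                    return (1);

            madvise(p, len, MADV_WILLNEED); /* hint: needed soon */
            p[0] = 1;                       /* touch one page */

            /* Ask which pages of the range are resident. */
            vec = malloc(len / pgsz);
            if (vec != NULL && mincore(p, len, vec) == 0)
                    printf("first page %s resident\n",
                        (vec[0] & MINCORE_INCORE) ? "is" : "is not");

            free(vec);
            munmap(p, len);
            return (0);
    }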
|
#
13765 |
|
30-Jan-1996 |
mpp |
Fix a bunch of spelling errors in the comment fields of a bunch of system include files.
|
#
13490 |
|
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache.
Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines (see the sketch after this list).
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
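The page queue change trades flag scans for direct indexing; roughly (illustrative, not the actual vm_page layout):

    /* Before: queue membership as mutually exclusive flag bits, so
     * finding a page's queue means testing bits one at a time. */
    #define PG_ACTIVE       0x0001
    #define PG_INACTIVE     0x0002

    /* After: small integer indices that select a queue directly. */
    #define PQ_FREE         0
    #define PQ_INACTIVE     1
    #define PQ_ACTIVE       2
    #define PQ_CACHE        3

    struct vpgqueue {
            int pq_count;                   /* pages on this queue */
            /* ... list head ... */
    };

    struct vpgqueue vm_page_queues[4];

    /* The lookup collapses to one array reference. */
    static struct vpgqueue *
    page_queue(int queue)
    {
            return (&vm_page_queues[queue]);
    }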
|
#
12820 |
|
14-Dec-1995 |
phk |
Another mega commit to staticize things.
|
#
12767 |
|
11-Dec-1995 |
dyson |
Changes to support 1TB file sizes. Pages are now named by an (object, index) pair instead of an (object, offset) pair.
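The arithmetic behind the renaming: a 32-bit byte offset caps a file at 2^32 bytes (4 GB), while a 32-bit page index with 4 KB pages names up to 2^32 * 2^12 = 2^44 bytes (16 TB), comfortably past the 1 TB goal. The conversion is a shift (sketch with assumed constants):

    #define PAGE_SHIFT      12              /* 4 KB pages assumed */

    /* A page is now named (object, pindex) rather than (object, offset);
     * the index is the byte offset scaled down by the page size. */
    unsigned long
    offset_to_pindex(unsigned long long offset)
    {
            return ((unsigned long)(offset >> PAGE_SHIFT));
    }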
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
10344 |
|
26-Aug-1995 |
bde |
Change vm_map_print() to have the correct number and type of args for a ddb command.
|
#
9507 |
|
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, an un_pager structure union was created in the object to contain these items. (A sketch of such a type-indexed pager table appears after this list.)
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes.)
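As promised under point 2, a minimal sketch of a type-indexed pager table (the names mirror the real ones, but the signatures here are simplified illustrations):

    struct vm_object;                       /* opaque for this sketch */
    typedef struct vm_page *vm_page_t;

    /* Per-type operations vector; the real interface also carries
     * init, alloc, dealloc, and sync entry points. */
    struct pagerops {
            int (*pgo_getpages)(struct vm_object *, vm_page_t *, int);
            int (*pgo_putpages)(struct vm_object *, vm_page_t *, int);
            int (*pgo_haspage)(struct vm_object *, unsigned long);
    };

    /* Objects carry a type, and the type indexes the pager table. */
    enum obj_type { OBJT_DEFAULT, OBJT_SWAP, OBJT_VNODE, OBJT_DEVICE };

    extern struct pagerops defaultpagerops, swappagerops,
        vnodepagerops, devicepagerops;

    struct pagerops *pagertab[] = {
            &defaultpagerops,               /* OBJT_DEFAULT */
            &swappagerops,                  /* OBJT_SWAP */
            &vnodepagerops,                 /* OBJT_VNODE */
            &devicepagerops,                /* OBJT_DEVICE */
    };

A getpages request then dispatches as pagertab[obj->type]->pgo_getpages(...), with no separate pager object to allocate or look up.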
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c, vm_map.c Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|