Cross Reference: /freebsd-9.3-release/sys/vm/vm

History log of /freebsd-9.3-release/sys/vm/vm_kern.c
Revision	Date	Author	Comments (<<< Hide modified files) (Show modified files >>>)
# 267654	19-Jun-2014	gjb	Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation /freebsd-9.3-release
# 258870	03-Dec-2013	jhb	MFC 253471,253620,254430,254538: Change mmap() to more optimally use superpages and provide support for tweaking alignment of virtual mappings. - Add a new address space allocation method (VMFS_OPTIMAL_SPACE) for vm_map_find() that will try to alter the alignment of a mapping to match any existing superpage mappings of the object being mapped. If no suitable address range is found with the necessary alignment, vm_map_find() will fall back to using the simple first-fit strategy (VMFS_ANY_SPACE). - Change mmap() without MAP_FIXED, shmat(), shm_map(), and the GEM mapping ioctl to use VMFS_OPTIMAL_SPACE instead of VMFS_ANY_SPACE. - MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n). Requests for n >= number of bits in a pointer or less than the size of a page fail with EINVAL. This matches the API provided by NetBSD. - MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used to optimize the chances of using large pages. By default it will align the mapping on a large page boundary (the system is free to choose any large page size to align to that seems best for the mapping request). However, if the object being mapped is already using large pages, then it will align the virtual mapping to match the existing large pages in the object instead. - Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment. MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE. - mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than explicitly using VMFS_SUPER_SPACE. All device objects are forced to use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively equivalent. PR: ports/184173 (exp-run)
# 251897	18-Jun-2013	scottl	Merge the second part of the unmapped I/O changes. This enables the infrastructure in the block layer and UFS filesystem as well as a few drivers. The list of MFC revisions is long, so I won't quote changelogs. r248508,248510,248511,248512,248514,248515,248516,248517,248518, 248519,248520,248521,248550,248568,248789,248790,249032,250936 Submitted by: kib Approved by: kib Obtained from: Netflix
# 248813	28-Mar-2013	kib	MFC r248277: Remove excessive and inconsistent initializers for the various kernel maps and submaps.
# 237949	02-Jul-2012	alc	MFC r226843 Eliminate vestiges of page coloring in VM_ALLOC_NOOBJ calls to vm_page_alloc(). While I'm here, for the sake of consistency, always specify the allocation class, such as VM_ALLOC_NORMAL, as the first of the flags.
# 235882	24-May-2012	alc	MFC r232288 Simplify kmem_alloc() by eliminating code that existed on account of external pagers in Mach.
# 232066	23-Feb-2012	kmacy	MFC r230623 exclude kmem_alloc'ed ARC data buffers from kernel minidumps on amd64 excluding other allocations including UMA now entails the addition of a single flag to kmem_alloc or uma zone create Reviewed by: alc,avg
# 225736	22-Sep-2011	kensmith	Copy head to stable/9 as part of 9.0-RELEASE release cycle. Approved by: re (implicit)
# 224746	09-Aug-2011	kib	- Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag to VPO_UNMANAGED (and also making the flag protected by the vm object lock, instead of vm page queue lock). - Mark the fake pages with both PG_FICTITIOUS (as it is now) and VPO_UNMANAGED. As a consequence, pmap code now can use use just VPO_UNMANAGED to decide whether the page is unmanaged. Reviewed by: alc Tested by: pho (x86, previous version), marius (sparc64), marcel (arm, ia64, powerpc), ray (mips) Sponsored by: The FreeBSD Foundation Approved by: re (bz)
# 221855	13-May-2011	mdf	Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853
# 221853	13-May-2011	mdf	Usa a globally visible region of zeros for both /dev/zero and the md device. There are likely other kernel uses of "blob of zeros" than can be converted. Reviewed by: alc MFC after: 1 week
# 218701	15-Feb-2011	kib	Since r218070 reenabled the call to vm_map_simplify_entry() from vm_map_insert(), the kmem_back() assumption about newly inserted entry might be broken due to interference of two factors. In the low memory condition, when vm_page_alloc() returns NULL, supplied map is unlocked. If another thread performs kmem_malloc() meantime, and its map entry is placed right next to our thread map entry in the map, both entries wire count is still 0 and entries are coalesced due to vm_map_simplify_entry(). Mark new entry with MAP_ENTRY_IN_TRANSITION to prevent coalesce. Fix some style issues, tighten the assertions to account for MAP_ENTRY_IN_TRANSITION state. Reported and tested by: pho Reviewed by: alc
# 212931	20-Sep-2010	mdf	Replace an XXX comment with the appropriate code. Submitted by: alc
# 211194	11-Aug-2010	mdf	Rework memguard(9) to reserve significantly more KVA to detect use-after-free over a longer time. Also release the backing pages of a guarded allocation at free(9) time to reduce the overhead of using memguard(9). Allow setting and varying the malloc type at run-time. Add knobs to allow: - randomly guarding memory - adding un-backed KVA guard pages to detect underflow and overflow - a lower limit on the size of allocations that are guarded Reviewed by: alc Reviewed by: brueffer, Ulrich Spörlein <uqs spoerlein net> (man page) Silence from: -arch Approved by: zml (mentor) MFC after: 1 month
# 207552	03-May-2010	alc	The pages allocated by kmem_alloc_attr() and kmem_malloc() are unmanaged. Consequently, neither the page lock nor the page queues lock is needed to unwire and free them.
# 207410	29-Apr-2010	kmacy	On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib
# 206819	18-Apr-2010	jmallett	o) Add a VM find-space option, VMFS_TLB_ALIGNED_SPACE, which searches the address space for an address as aligned by the new pmap_align_tlb() function, which is for constraints imposed by the TLB. [1] o) Add a kmem_alloc_nofault_space() function, which acts like kmem_alloc_nofault() but allows the caller to specify which find-space option to use. [1] o) Use kmem_alloc_nofault_space() with VMFS_TLB_ALIGNED_SPACE to allocate the kernel stack address on MIPS. [1] o) Make pmap_align_tlb() on MIPS align addresses so that they do not start on an odd boundary within the TLB, so that they are suitable for insertion as wired entries and do not have to share a TLB entry with another mapping, assuming they are appropriately-sized. o) Eliminate md_realstack now that the kstack will be appropriately-aligned on MIPS. o) Increase the number of guard pages to 2 so that we retain the proper alignment of the kstack address. Reviewed by: [1] alc X-MFC-after: Making sure alc has not come up with a better interface.
# 194766	23-Jun-2009	kib	Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4). Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
# 189015	24-Feb-2009	kib	Revert the addition of the freelist argument for the vm_map_delete() function, done in r188334. Instead, collect the entries that shall be freed, in the deferred_freelist member of the map. Automatically purge the deferred freelist when map is unlocked. Tested by: pho Reviewed by: alc
# 188967	23-Feb-2009	rwatson	Put debug.vm_lowmem sysctl under DIAGNOSTIC. Submitted by: sam MFC after: 3 days
# 188964	23-Feb-2009	rwatson	Add a debugging sysctl, debug.vm_lowmem, that when assigned a value of 1 will trigger a pass through the VM's low-memory handlers, such as protocol and UMA drain routines. This makes it easier to exercise these otherwise rarely-invoked code paths. MFC after: 3 days
# 188334	08-Feb-2009	kib	Do not call vm_object_deallocate() from vm_map_delete(), because we hold the map lock there, and might need the vnode lock for OBJT_VNODE objects. Postpone object deallocation until caller of vm_map_delete() drops the map lock. Link the map entries to be freed into the freelist, that is released by the new helper function vm_map_entry_free_freelist(). Reviewed by: tegge, alc Tested by: pho
# 180598	18-Jul-2008	alc	Eliminate stale comments from kmem_malloc().
# 179923	22-Jun-2008	alc	Make preparations for increasing the size of the kernel virtual address space on the amd64 architecture. The amd64 architecture requires kernel code and global variables to reside in the highest 2GB of the 64-bit virtual address space. Thus, the memory allocated during bootstrap, before the call to kmem_init(), starts at KERNBASE, which is not necessarily the same as VM_MIN_KERNEL_ADDRESS on amd64.
# 178933	10-May-2008	alc	Introduce a new parameter "superpage_align" to kmem_suballoc() that is used to request superpage alignment for the submap. Request superpage alignment for the kmem_map. Pass VMFS_ANY_SPACE instead of TRUE to vm_map_find(). (They are currently equivalent but VMFS_ANY_SPACE is the new preferred spelling.) Remove a stale comment from kmem_malloc().
# 178637	28-Apr-2008	alc	Eliminate pointless casts from kmem_suballoc().
# 177762	30-Mar-2008	alc	Eliminate an unnecessary printf() from kmem_suballoc(). The subsequent panic() can be extended to convey the same information.
# 175210	10-Jan-2008	pjd	When one tries to allocate memory with the M_WAITOK flag and we are short in address space in kmem map call vm_lowmem event in a loop and wait a bit for subsystems to reclaim some memory which in turn will reclaim address space as well. Note, this is a work-around. Reviewed by: alc Approved by: alc MFC after: 3 days
# 175067	03-Jan-2008	alc	Add an access type parameter to pmap_enter(). It will be used to implement superpage promotion. Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is a Boolean.
# 173429	07-Nov-2007	pjd	Change unused 'user_wait' argument to 'timo' argument, which will be used to specify timeout for msleep(9). Discussed with: alc Reviewed by: alc
# 168395	05-Apr-2007	pjd	When KVA is exhausted, try the vm_lowmem event for the last time before panicing. This helps a lot in ZFS stability.
# 166964	25-Feb-2007	alc	Change the way that unmanaged pages are created. Specifically, immediately flag any page that is allocated to a OBJT_PHYS object as unmanaged in vm_page_alloc() rather than waiting for a later call to vm_page_unmanage(). This allows for the elimination of some uses of the page queues lock. Change the type of the kernel and kmem objects from OBJT_DEFAULT to OBJT_PHYS. This allows us to take advantage of the above change to simplify the allocation of unmanaged pages in kmem_alloc() and kmem_malloc(). Remove vm_page_unmanage(). It is no longer used.
# 165854	07-Jan-2007	alc	Declare the map entry created by kmem_init() for the range from VM_MIN_KERNEL_ADDRESS to the end of the kernel's bootstrap data as MAP_NOFAULT.
# 164234	12-Nov-2006	alc	There is no point in setting PG_REFERENCED on kmem_object pages because they are "unmanaged", i.e., non-pageable, pages. Remove a stale comment.
# 164229	12-Nov-2006	alc	Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)
# 163622	23-Oct-2006	alc	The page queues lock is no longer required by vm_page_wakeup().
# 139825	07-Jan-2005	imp	/* -> /*- for license, minor formatting changes
# 136923	24-Oct-2004	alc	Use VM_ALLOC_NOBUSY instead of calling vm_page_wakeup().
# 133435	10-Aug-2004	green	Back out all behavioral chnages.
# 133401	09-Aug-2004	green	Revamp VM map wiring. * Allow no-fault wiring/unwiring to succeed for consistency; however, the wired count remains at zero, so it's a special case. * Fix issues inside vm_map_wire() and vm_map_unwire() where the exact state of user wiring (one or zero) and system wiring (zero or more) could be confused; for example, system unwiring could succeed in removing a user wire, instead of being an error. * Require all mappings to be unwired before they are deleted. When VM space is still wired upon deletion, it will be waited upon for the following unwire. This makes vslock(9) work rather than allowing kernel-locked memory to be deleted out from underneath of its consumer as it would before.
# 132638	25-Jul-2004	alc	For years, kmem_alloc_pageable() has been misused. Now that the last of these misuses has been corrected, remove it before new ones appear, such as arm/arm/pmap.c revision 1.8.
# 129906	31-May-2004	bmilekic	Bring in mbuma to replace mballoc. mbuma is an Mbuf & Cluster allocator built on top of a number of extensions to the UMA framework, all included herein. Extensions to UMA worth noting: - Better layering between slab <-> zone caches; introduce Keg structure which splits off slab cache away from the zone structure and allows multiple zones to be stacked on top of a single Keg (single type of slab cache); perhaps we should look into defining a subset API on top of the Keg for special use by malloc(9), for example. - UMA_ZONE_REFCNT zones can now be added, and reference counters automagically allocated for them within the end of the associated slab structures. uma_find_refcnt() does a kextract to fetch the slab struct reference from the underlying page, and lookup the corresponding refcnt. mbuma things worth noting: - integrates mbuf & cluster allocations with extended UMA and provides caches for commonly-allocated items; defines several zones (two primary, one secondary) and two kegs. - change up certain code paths that always used to do: m_get() + m_clget() to instead just use m_getcl() and try to take advantage of the newly defined secondary Packet zone. - netstat(1) and systat(1) quickly hacked up to do basic stat reporting but additional stats work needs to be done once some other details within UMA have been taken care of and it becomes clearer to how stats will work within the modified framework. From the user perspective, one implication is that the NMBCLUSTERS compile-time option is no longer used. The maximum number of clusters is still capped off according to maxusers, but it can be made unlimited by setting the kern.ipc.nmbclusters boot-time tunable to zero. Work should be done to write an appropriate sysctl handler allowing dynamic tuning of kern.ipc.nmbclusters at runtime. Additional things worth noting/known issues (READ): - One report of 'ips' (ServeRAID) driver acting really slow in conjunction with mbuma. Need more data. Latest report is that ips is equally sucking with and without mbuma. - Giant leak in NFS code sometimes occurs, can't reproduce but currently analyzing; brueffer is able to reproduce but THIS IS NOT an mbuma-specific problem and currently occurs even WITHOUT mbuma. - Issues in network locking: there is at least one code path in the rip code where one or more locks are acquired and we end up in m_prepend() with M_WAITOK, which causes WITNESS to whine from within UMA. Current temporary solution: force all UMA allocations to be M_NOWAIT from within UMA for now to avoid deadlocks unless WITNESS is defined and we can determine with certainty that we're not holding any locks when we're M_WAITOK. - I've seen at least one weird socketbuffer empty-but- mbuf-still-attached panic. I don't believe this to be related to mbuma but please keep your eyes open, turn on debugging, and capture crash dumps. This change removes more code than it adds. A paper is available detailing the change and considering various performance issues, it was presented at BSDCan2004: http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf Please read the paper for Future Work and implementation details, as well as credits. Testing and Debugging: rwatson, brueffer, Ketrien I. Saihr-Kesenchedra, ... Reviewed by: Lots of people (for different parts)
# 128613	24-Apr-2004	alc	Push down the responsibility for zeroing a physical page from the caller to vm_page_grab(). Although this gives VM_ALLOC_ZERO a different meaning for vm_page_grab() than for vm_page_alloc(), I feel such change is necessary to accomplish other goals. Specifically, I want to make the PG_ZERO flag immutable between the time it is allocated by vm_page_alloc() and freed by vm_page_free() or vm_page_free_zero() to avoid locking overheads. Once we gave up on the ability to automatically recognize a zeroed page upon entry to vm_page_free(), the ability to mutate the PG_ZERO flag became useless. Instead, I would like to say that "Once a page becomes valid, its PG_ZERO flag must be ignored."
# 127961	06-Apr-2004	imp	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core
# 125889	16-Feb-2004	des	Back out previous commit due to objections.
# 125882	16-Feb-2004	des	Don't panic if we fail to satisfy an M_WAITOK request; return 0 instead. The calling code will either handle that gracefully or cause a page fault.
# 124321	09-Jan-2004	alc	Unmanage pages allocated by kmem_alloc(). (There is no point in having PV entries for these pages.)
# 124195	06-Jan-2004	alc	Don't bother clearing PG_ZERO in contigmalloc1(), kmem_alloc(), or kmem_malloc(). It serves no purpose.
# 124048	01-Jan-2004	alc	- Increase the scope of the kmem_object's lock in kmem_malloc(). Add a comment explaining why a further increase is not possible.
# 123914	27-Dec-2003	alc	Remove GIANT_REQUIRED from kmem_suballoc().
# 122383	09-Nov-2003	mini	NFC: Update stale comments. Reviewed by: alc
# 120761	04-Oct-2003	alc	Synchronize access to a vm page's valid field using the containing vm object's lock.
# 120050	14-Sep-2003	alc	Call vm_page_unmanage() on pages belonging to the kmem_object. This eliminates the unnecessary overhead of managing "PV" entries for these pages.
# 119658	01-Sep-2003	eivind	Change clean_map from a global to an auto variable
# 118771	11-Aug-2003	bms	Add the mlockall() and munlockall() system calls. - All those diffs to syscalls.master for each architecture are necessary. This needed clarification; the stub code generation for mlockall() was disabled, which would prevent applications from linking to this API (suggested by mux) - Giant has been quoshed. It is no longer held by the code, as the required locking has been pushed down within vm_map.c. - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES to express their intention explicitly. - Inspected at the vmstat, top and vm pager sysctl stats level. Paging-in activity is occurring correctly, using a test harness. - The RES size for a process may appear to be greater than its SIZE. This is believed to be due to mappings of the same shared library page being wired twice. Further exploration is needed. - Believed to back out of allocations and locks correctly (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC). PR: kern/43426, standards/54223 Reviewed by: jake, alc Approved by: jake (mentor) MFC after: 2 weeks
# 118764	11-Aug-2003	silby	More pipe changes: From alc: Move pageable pipe memory to a seperate kernel submap to avoid awkward vm map interlocking issues. (Bad explanation provided by me.) From me: Rework pipespace accounting code to handle this new layout, and adjust our default values to account for the fact that we now have a solid limit on allocations. Also, remove the "maxpipes" limit, as it no longer has a purpose. (The limit on kva usage solves the problem of having two many pipes.)
# 118317	01-Aug-2003	alc	Update the comment at the head of kmem_alloc_nofault() to describe its purpose and use.
# 118096	27-Jul-2003	alc	Remove GIANT_REQUIRED from kmem_alloc().
# 117004	28-Jun-2003	alc	Remove GIANT_REQUIRED from kmem_malloc().
# 116226	11-Jun-2003	obrien	Use __FBSDID().
# 115997	07-Jun-2003	alc	Lock the kernel object in kmem_alloc().
# 113489	14-Apr-2003	alc	Update locking on the kmem_object to use the new macros.
# 113418	12-Apr-2003	alc	Eliminate unnecessary gotos from kmem_malloc().
# 108677	04-Jan-2003	alc	Allow kmem_malloc() without Giant if M_NOWAIT is specified.
# 108426	30-Dec-2002	alc	- Mark the kernel_map as a system map immediately after its creation. - Correct a cast.
# 108351	28-Dec-2002	alc	Two changes to kmem_malloc(): - Use VM_ALLOC_WIRED. - Perform vm_page_wakeup() after pmap_enter(), like we do everywhere else.
# 108262	24-Dec-2002	alc	- Hold the page queues lock around calls to vm_page_flag_clear().
# 108251	24-Dec-2002	alc	- Hold the page queues lock around vm_page_wakeup().
# 108138	20-Dec-2002	alc	Increase the scope of the kmem_object locking in kmem_malloc().
# 107989	17-Dec-2002	alc	Hold the page queues lock when performing vm_page_flag_set().
# 107918	15-Dec-2002	alc	Perform vm_object_lock() and vm_object_unlock() on kmem_object around vm_page_lookup() and vm_page_free().
# 102382	24-Aug-2002	alc	o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since pmap_zero_page() and pmap_zero_page_area() were modified to accept a struct vm_page * instead of a physical address, vm_page_zero_fill() and vm_page_zero_fill_area() have served no purpose.
# 101634	10-Aug-2002	alc	o Remove the setting and clearing of the PG_MAPPED flag. (This flag is obsolete.)
# 100796	28-Jul-2002	alc	o Lock page queue accesses by vm_page_free().
# 99985	14-Jul-2002	alc	o Lock page queue accesses by vm_page_wire().
# 99893	12-Jul-2002	alc	o Assert GIANT_REQUIRED on system maps in _vm_map_lock(), _vm_map_lock_read(), and _vm_map_trylock(). Submitted by: tegge o Remove GIANT_REQUIRED from kmem_alloc_wait() and kmem_free_wakeup(). (This clears the way for exec_map accesses to move outside of Giant. The exec_map is not a system map.) o Remove some premature MPSAFE comments. Reviewed by: tegge
# 99754	11-Jul-2002	alc	o Add a "needs wakeup" flag to the vm_map for use by kmem_alloc_wait() and kmem_free_wakeup(). Previously, kmem_free_wakeup() always called wakeup(). In general, no one was sleeping. o Export vm_map_unlock_and_wait() and vm_map_wakeup() from vm_map.c for use in vm_kern.c.
# 98686	23-Jun-2002	alc	o Remove GIANT_REQUIRED from kmem_alloc_pageable(), kmem_alloc_nofault(), and kmem_free(). (Annotate as MPSAFE.) o Remove incorrect casts from kmem_alloc_pageable() and kmem_alloc_nofault().
# 98455	19-Jun-2002	jeff	- Move the computation of pflags out of the page allocation loop in kmem_malloc() - zero fill pages if PG_ZERO bit is not set after allocation in kmem_malloc() Suggested by: alc, jake
# 98450	19-Jun-2002	jeff	Teach kmem_malloc about M_ZERO.
# 98226	14-Jun-2002	alc	o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_clear_recursive() and vm_map_set_recursive(). (They were only used by vm_map_pageable() and vm_map_user_pageable().) Reviewed by: tegge
# 95710	29-Apr-2002	peter	Tidy up some loose ends. i386/ia64/alpha - catch up to sparc64/ppc: - replace pmap_kernel() with refs to kernel_pmap - change kernel_pmap pointer to (&kernel_pmap_store) (this is a speedup since ld can set these at compile/link time) all platforms (as suggested by jake): - gc unused pmap_reference - gc unused pmap_destroy - gc unused struct pmap.pm_count (we never used pm_count - we track address space sharing at the vmspace)
# 92029	10-Mar-2002	eivind	- Remove a number of extra newlines that do not belong here according to style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs. Reviewed by: alc
# 91946	09-Mar-2002	tegge	Revert change in revision 1.53 and add a small comment to protect the revived code. vm pages newly allocated are marked busy (PG_BUSY), thus calling vm_page_delete before the pages has been freed or unbusied will cause a deadlock since vm_page_object_page_remove will wait for the busy flag to be cleared. This can be triggered by calling malloc with size > PAGE_SIZE and the M_NOWAIT flag on systems low on physical free memory. A kernel module that reproduces the problem, written by Logan Gabriel <logan@mail.2cactus.com>, can be found in the freebsd-hackers mail archive (12 Apr 2001). The problem was recently noticed again by Archie Cobbs <archie@dellroad.org>. Reviewed by: dillon
# 87157	30-Nov-2001	luigi	vm/vm_kern.c: rate limit (to once per second) diagnostic printf when you run out of mbuf address space. kern/subr_mbuf.c: print a warning message when mb_alloc fails, again rate-limited to at most once per second. This covers other cases of mbuf allocation failures. Probably it also overlaps the one handled in vm/vm_kern.c, so maybe the latter should go away. This warning will let us gradually remove the printf that are scattered across most network drivers to report mbuf allocation failures. Those are potentially dangerous, in that they are not rate-limited and can easily cause systems to panic. Unless there is disagreement (which does not seem to be the case judging from the discussion on -net so far), and because this is sort of a safety bugfix, I plan to commit a similar change to STABLE during the weekend (it affects kern/uipc_mbuf.c there). Discussed-with: jlemon, silby and -net
# 81399	10-Aug-2001	jhb	- Remove asleep(), await(), and M_ASLEEP. - Callers of asleep() and await() have been converted to calling tsleep(). The only caller outside of M_ASLEEP was the ata driver, which called both asleep() and await() with spl-raised, so there was no need for the asleep() and await() pair. M_ASLEEP was unused. Reviewed by: jasone, peter
# 79224	04-Jul-2001	dillon	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
# 78592	22-Jun-2001	bmilekic	Introduce numerous SMP friendly changes to the mbuf allocator. Namely, introduce a modified allocation mechanism for mbufs and mbuf clusters; one which can scale under SMP and which offers the possibility of resource reclamation to be implemented in the future. Notable advantages: o Reduce contention for SMP by offering per-CPU pools and locks. o Better use of data cache due to per-CPU pools. o Much less code cache pollution due to excessively large allocation macros. o Framework for `grouping' objects from same page together so as to be able to possibly free wired-down pages back to the system if they are no longer needed by the network stacks. Additional things changed with this addition: - Moved some mbuf specific declarations and initializations from sys/conf/param.c into mbuf-specific code where they belong. - m_getclr() has been renamed to m_get_clrd() because the old name is really confusing. m_getclr() HAS been preserved though and is defined to the new name. No tree sweep has been done "to change the interface," as the old name will continue to be supported and is not depracated. The change was merely done because m_getclr() sounds too much like "m_get a cluster." - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and systat(1) (see TODO below). - Fixed systat(1) to display number of "free mbufs" based on new per-CPU stat structures. - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported per-CPU stat structures. All infos are fetched via sysctl. TODO (in order of priority): - Re-enable mbtypes statistics in both netstat(1) and systat(1) after introducing an SMP friendly way to collect the mbtypes stats under the already introduced per-CPU locks (i.e. hopefully don't use atomic() - it seems too costly for a mere stat update, especially when other locks are already present). - Optionally have systat(1) display not only "total free mbufs" but also "total free mbufs per CPU pool." - Fix minor length-fetching issues in netstat(1) related to recently re-enabled option to read mbuf stats from a core file. - Move reference counters at least for mbuf clusters into an unused portion of the cluster itself, to save space and need to allocate a counter. - Look into introducing resource freeing possibly from a kproc. Reviewed by (in parts): jlemon, jake, silby, terry Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha) Preliminary performance measurements: jlemon (and me, obviously) URL: http://people.freebsd.org/~bmilekic/mb_alloc/
# 76827	18-May-2001	alfred	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
# 76166	01-May-2001	markm	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
# 71571	24-Jan-2001	jhb	Add mtx_assert()'s to verify that kmem_alloc() and kmem_free() are called with Giant held.
# 70480	29-Dec-2000	alfred	fix comment which was outdated 3 years ago remove useless assignment purge entire file of 'register' keyword
# 70478	29-Dec-2000	alfred	clean up kmem_suballoc(): remove useless assignment remove 'register' variables
# 69972	13-Dec-2000	tanimura	- If swap metadata does not fit into the KVM, reduce the number of struct swblock entries by dividing the number of the entries by 2 until the swap metadata fits. - Reject swapon(2) upon failure of swap_zone allocation. This is just a temporary fix. Better solutions include: (suggested by: dillon) o reserving swap in SWAP_META_PAGES chunks, and o swapping the swblock structures themselves. Reviewed by: alfred, dillon
# 60755	21-May-2000	peter	Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or it's shadow pv_table entry. Inspired by: John Dyson's 1998 patches. Also: Eliminate pv_table as a seperate thing and build it into a machine dependent part of vm_page_t. This eliminates having a seperate set of structions that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha). Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits). Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files. Reviewed by: dillon
# 58705	27-Mar-2000	charnier	Revert spelling mistake I made in the previous commit Requested by: Alan and Bruce
# 58634	26-Mar-2000	charnier	Spelling
# 52635	29-Oct-1999	phk	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ\|WRITE} rather than B_{READ\|WRITE} as argument.
# 52617	29-Oct-1999	alc	Remove the last vestiges of "vm_map_t phys_map". It's been unused since i386/i386/machdep.c rev 1.45 (or 1994 :-) ).
# 50477	27-Aug-1999	peter	$Id$ -> $FreeBSD$
# 49858	15-Aug-1999	alc	Remove the declarations for "vm_map_t io_map". It's been unused since i386/i386/machdep rev 1.310, i.e., the demise of BOUNCE_BUFFERS.
# 49852	15-Aug-1999	alc	Remove the declarations for "vm_map_t u_map". It's been unused since i386/i386/pmap rev 1.190. (The alpha never used it.)
# 48409	01-Jul-1999	peter	Fix some int/long printf problems for the Alpha
# 47841	08-Jun-1999	dt	Add a function kmem_alloc_nofault() - same as kmem_alloc_pageable(), but create a nofault entry. It will be used to allocate kmem for upages. (I am not too happy with all this, but it's better than nothing).
# 44793	16-Mar-1999	alc	Correct a problem in kmem_malloc: A kmem_malloc allowing "wait" may block (VM_WAIT) holding the map lock. This is bad. For example, a subsequent kmem_malloc by an interrupt handler on the same map may find the lock held and panic in the lockmgr.
# 44682	12-Mar-1999	alc	Remove vm_page_frees from kmem_malloc that are performed by vm_map_delete/vm_object_page_remove anyway.
# 42969	21-Jan-1999	dillon	Potential bug fix, do not just clear PG_BUSY... call vm_page_wakeup() instead to properly handle any waiters. Added comments, added support for M_ASLEEP. Generally treat M_ flags as flags instead of constants to compare against.
# 42957	21-Jan-1999	dillon	This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
# 38799	04-Sep-1998	dfr	Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.
# 38517	24-Aug-1998	dfr	Change various syscalls to use size_t arguments instead of u_int. Add some overflow checks to read/write (from bde). Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable. Reviewed by: Bruce Evans <bde@zeta.org.au>
# 37094	21-Jun-1998	bde	Removed unused includes.
# 36677	05-Jun-1998	dg	Changed the log() of "Out of mbuf clusters - increase maxusers" to a printf() of "Out of mbuf clusters - adjust NMBCLUSTERS or increase maxusers" so that the message is more informative and so that it will appear in the kernel message buffer.
# 33936	01-Mar-1998	dyson	1) Use a more consistent page wait methodology. 2) Do not unnecessarily force page blocking when paging pages out. 3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.) 4) Enable vfs_ioopt=1 as a default. 5) Fix and enable the page prezeroing in SMP mode. All in all, SMP systems especially should show a significant improvement in "snappyness."
# 33758	23-Feb-1998	dyson	Significantly improve the efficiency of the swap pager, which appears to have declined due to code-rot over time. The swap pager rundown code has been clean-up, and unneeded wakeups removed. Lots of splbio's are changed to splvm's. Also, set the dynamic tunables for the pageout daemon to be more sane for larger systems (thereby decreasing the daemon overheadla.)
# 33134	06-Feb-1998	eivind	Back out DIAGNOSTIC changes.
# 33109	05-Feb-1998	dyson	1) Start using a cleaner and more consistant page allocator instead of the various ad-hoc schemes. 2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup. 3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some processor errata, and to minimize redundant processor updating of page tables. 4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed. 5) Streamline read-only to read-write page mappings. 6) For pmap_copy_page, don't enable write mapping for source page. 7) Correct and clean-up pmap_incore. 8) Cluster initial kern_exec pagin. 9) Removal of some minor lint from kern_malloc. 10) Correct some ioopt code. 11) Remove some dead code from the MI swapout routine. 12) Correct vm_object_deallocate (to remove backing_object ref.) 13) Fix dead object handling, that had problems under heavy memory load. 14) Add minor vm_page_lookup improvements. 15) Some pages are not in objects, and make sure that the vm_page.c can properly support such pages. 16) Add some more page deficit handling. 17) Some minor code readability improvements.
# 33108	04-Feb-1998	eivind	Turn DIAGNOSTIC into a new-style option.
# 32937	31-Jan-1998	dyson	Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtile "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistant scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.
# 32702	22-Jan-1998	dyson	VM level code cleanups. 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneded collpase operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
# 27899	04-Aug-1997	dyson	Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of a simple, clean zone type allocator. This new allocator will also be used for machine dependent pmap PV entries.
# 27845	02-Aug-1997	bde	Removed unused #includes.
# 26811	22-Jun-1997	peter	Kill some stale leftovers from the earlier attempts at SMP per-cpu pages
# 26258	29-May-1997	peter	Update the #include "opt_smpxxx.h" includes - opt_smp.h isn't needed very much in the generic parts of the kernel now.
# 25164	26-Apr-1997	peter	Man the liferafts! Here comes the long awaited SMP -> -current merge! There are various options documented in i386/conf/LINT, there is more to come over the next few days. The kernel should run pretty much "as before" without the options to activate SMP mode. There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment. This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
# 24437	31-Mar-1997	dg	Changed the way that the exec image header is read to be filesystem- centric rather than VM-centric to fix a problem with errors not being detectable when the header is read. Killed exech_map as a result of these changes. There appears to be no performance difference with this change.
# 22975	22-Feb-1997	peter	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
# 22521	10-Feb-1997	dyson	This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
# 21737	15-Jan-1997	dg	Fix bug related to map entry allocations where a sleep might be attempted when allocating memory for network buffers at interrupt time. This is due to inadequate checking for the new mcl_map. Fixed by merging mb_map and mcl_map into a single mb_map. Reviewed by: wollman
# 21673	14-Jan-1997	jkh	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
# 20993	28-Dec-1996	dyson	Eliminate the redundancy due to the similarity between the routines vm_map_simplify and vm_map_simplify_entry. Make vm_map_simplify_entry handle wired maps so that we can get rid of vm_map_simplify. Modify the callers of vm_map_simplify to properly use vm_map_simplify_entry. Submitted by: Alan Cox <alc@cs.rice.edu>
# 19830	17-Nov-1996	dyson	Improve the locality of reference for variables in vm_page and vm_kern by moving them from .bss to .data. With this change, there is a measurable perf improvement in fork/exec.
# 16892	02-Jul-1996	dyson	Properly set the PG_MAPPED and PG_WRITEABLE flags. This fixes some potential problems with vm_map_remove/vm_map_delete.
# 15876	23-May-1996	dyson	Eliminate a vm_page_free, busy panic, in kern_malloc.
# 15809	18-May-1996	dyson	This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me: More usage of the TAILQ macros. Additional minor fix to queue.h. Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm. Significant revamp of the pmap/fork code: 1) PTE's and UPAGES's are NO LONGER in the process's map. 2) PTE's and UPAGES's reside in their own objects. 3) TOTAL elimination of recursive page table pagefaults. 4) The page directory now resides in the PTE object. 5) Implemented pmap_copy, thereby speeding up fork time. 6) Changed the pv entries so that the head is a pointer and not an entire entry. 7) Significant cleanup of pmap_protect, and pmap_remove. 8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module. 9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed. Performance and code cleanups in vm_map: 1) Improved and simplified allocation of map entries. 2) Improved vm_map_copy code. 3) Corrected some minor problems in the simplify code. Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh. Improved the speed of and simplified kmem_malloc. Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existant condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.) Minor reorg of source to perhaps improve locality of ref.
# 15722	10-May-1996	wollman	Allocate mbufs from a separate submap so that NMBCLUSTERS works as expected.
# 15367	24-Apr-1996	dyson	This fixes kmem_malloc/kmem_free (and malloc/free of objects of > 8K). A page index was calculated incorrectly in vm_kern, and vm_object_page_remove removed pages that should not have been.
# 13786	31-Jan-1996	dg	Print a more descriptive message when the mb_map is filled (out of mbuf clusters), and tell the operator what to do about it (increase maxusers).
# 13490	19-Jan-1996	dyson	Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do alot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
# 12767	11-Dec-1995	dyson	Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
# 12726	10-Dec-1995	bde	Restored used includes of <vm/vm_extern.h>.
# 12662	07-Dec-1995	dg	Untangled the vm.h include file spaghetti.
# 12259	13-Nov-1995	dg	Fixed up a comment and removed some #if 0'd code.
# 10548	03-Sep-1995	dyson	Machine independent changes to support pre-zeroed free pages. This significantly improves demand-zero performance.
# 9759	29-Jul-1995	bde	Eliminate sloppy common-style declarations. There should be none left for the LINT configuation.
# 9507	13-Jul-1995	dg	NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!! Much needed overhaul of the VM system. Included in this first round of changes: 1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers". 2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has escentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items. 3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed. 4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug. 5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance. 6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain. 7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance. 8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed. 9) Some almost useless debugging code removed. 10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure escentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology. 11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended. 12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course). 13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non- standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE. 14) (dyson) Sharing maps have been removed. It's marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintain- ability. (As were most all of these changes) TODO: 1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size. 2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness. 3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind. 4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems. 5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
# 8876	30-May-1995	rgrimes	Remove trailing whitespace.
# 7066	15-Mar-1995	dg	Special cased the handling of mb_map in the M_WAITOK case. kmem_malloc() now returns NULL and sets a global 'mb_map_full' when the map is full. m_clalloc() has further been taught to expect this and do the right thing. This should fix the "mb_map full" panics that several people have reported.
# 6585	20-Feb-1995	dg	Deprecated remaining use of vm_deallocate. Deprecated vm_allocate_with_ pager(). Almost completely rewrote vm_mmap(); when John gets done with the bottom half, it will be a complete rewrite. Deprecated most use of vm_object_setpager(). Removed side effect of setting object persist in vm_object_enter and moved this into the pager(s). A few other cosmetic changes.
# 6129	02-Feb-1995	dg	swap_pager.c: Fixed long standing bug in freeing swap space during object collapses. Fixed 'out of space' messages from printing out too often. Modified to use new kmem_malloc() calling convention. Implemented an additional stat in the swap pager struct to count the amount of space allocated to that pager. This may be removed at some point in the future. Minimized unnecessary wakeups. vm_fault.c: Don't try to collect fault stats on 'swapped' processes - there aren't any upages to store the stats in. Changed read-ahead policy (again!). vm_glue.c: Be sure to gain a reference to the process's map before swapping. Be sure to lose it when done. kern_malloc.c: Added the ability to specify if allocations are at interrupt time or are 'safe'; this affects what types of pages can be allocated. vm_map.c: Fixed a variety of map lock problems; there's still a lurking bug that will eventually bite. vm_object.c: Explicitly initialize the object fields rather than bzeroing the struct. Eliminated the 'rcollapse' code and folded it's functionality into the "real" collapse routine. Moved an object_unlock() so that the backing_object is protected in the qcollapse routine. Make sure nobody fools with the backing_object when we're destroying it. Added some diagnostic code which can be called from the debugger that looks through all the internal objects and makes certain that they all belong to someone. vm_page.c: Fixed a rather serious logic bug that would result in random system crashes. Changed pagedaemon wakeup policy (again!). vm_pageout.c: Removed unnecessary page rotations on the inactive queue. Changed the number of pages to explicitly free to just free_reserved level. Submitted by: John Dyson
# 5841	24-Jan-1995	dg	Added ability to detect sequential faults and DTRT. (swap_pager.c) Added hook for pmap_prefault() and use symbolic constant for new third argument to vm_page_alloc() (vm_fault.c, various) Changed the way that upages and page tables are held. (vm_glue.c) Fixed architectural flaw in allocating pages at interrupt time that was introduced with the merged cache changes. (vm_page.c, various) Adjusted some algorithms to acheive better paging performance and to accomodate the fix for the architectural flaw mentioned above. (vm_pageout.c) Fixed pbuf handling problem, changed policy on handling read-behind page. (vnode_pager.c) Submitted by: John Dyson
# 5455	09-Jan-1995	dg	These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D. The majority of the merged VM/cache work is by John Dyson. The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme. vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering. vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff. vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption. vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up. vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme. pmap.c vm_map.c Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of kernel PTs. vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping. proc.h Fixed the problem that the p_lock flag was not being cleared on a fork. swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore. machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme. machdep.c, kern_exec.c, vm_glue.c Implemented a seperate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed. ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers. Submitted by: John Dyson and David Greenman
# 2112	18-Aug-1994	wollman	Fix up some sloppy coding practices: - Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above. NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
# 1896	07-Aug-1994	dg	Made pmap_kenter "TLB safe". ...and then removed all the pmap_updates that are no longer needed because of this.
# 1895	07-Aug-1994	dg	Provide support for upcoming merged VM/buffer cache, and fixed a few bugs that haven't appeared to manifest themselves (yet). Submitted by: John Dyson
# 1887	06-Aug-1994	dg	Incorporated post 1.1.5 work from John Dyson. This includes performance improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/ pmap_kremove. These routine allow fast mapping of pages for those architectures that have "normal" MMUs. Also included is a fix to the pageout daemon to properly check a queue end condition. Submitted by: John Dyson
# 1817	02-Aug-1994	dg	Added $Id$
# 1549	25-May-1994	rgrimes	The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
# 1542	24-May-1994	rgrimes	This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
# 1541	24-May-1994	rgrimes	BSD 4.4 Lite Kernel Sources