History log of /netbsd-current/sys/uvm/uvm_extern.h
Revision Date Author Comments
# 1.233 26-Feb-2023 skrll

nkmempages should be size_t


Revision tags: netbsd-10-base bouyer-sunxi-drm-base thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 thorpej-i2c-spi-conf-base
# 1.232 31-May-2021 riastradh

uvm: Make uvm_extern.h (more) self-contained, needs sys/types.h.


Revision tags: cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.231 14-Aug-2020 chs

branches: 1.231.6; 1.231.8;
centralize calls from UVM to radixtree into a few functions.
in those functions, assert that the object lock is held in
the correct mode.


# 1.230 14-Jun-2020 ad

g/c vm_page_zero_enable


# 1.229 13-Jun-2020 ad

uvm_pagerealloc(): resurrect the insertion case.


# 1.228 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
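
The trade-off described in this commit can be sketched as follows. This is a minimal userland illustration, not the NetBSD implementation: the names sketch_availmem, sketch_percpu_free and sketch_cached_total are hypothetical, and the per-CPU array merely stands in for whatever totals are costly to fetch.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NCPU 4

/* Hypothetical per-CPU free page counts; summing them is the costly path. */
static long sketch_percpu_free[SKETCH_NCPU];
/* Total remembered from the most recent slow-path call. */
static long sketch_cached_total;

/*
 * Return the number of available pages.  If 'cached' is true, a recent
 * (possibly stale) total is good enough; otherwise recompute it.
 */
static long
sketch_availmem(bool cached)
{
	long total = 0;

	if (cached)
		return sketch_cached_total;

	for (int i = 0; i < SKETCH_NCPU; i++) {
		assert(sketch_percpu_free[i] >= 0);
		total += sketch_percpu_free[i];
	}
	sketch_cached_total = total;
	return total;
}
```

A caller on a hot path would pass true and accept a slightly stale answer; a caller making a decision that needs an exact figure would pass false.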


# 1.227 26-May-2020 kamil

Catch up with the usage of struct vmspace::vm_refcnt

Use the dedicated reference counting routines.

Change the type of struct vmspace::vm_refcnt and struct vm_map::ref_count
to volatile.

Remove the unnecessary vm->vm_map.misc_lock locking in process_domem().

Reviewed by <ad>


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
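
The general technique of encoding a type in the low bits of an aligned pointer can be sketched as below. This is an assumption-laden illustration, not the actual struct uvm_voaddr layout: the type names, the two-bit mask and the helper functions are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; they must fit in the low two bits. */
enum sketch_type { SKETCH_INVALID = 0, SKETCH_UOBJ = 1, SKETCH_ANON = 2 };

#define SKETCH_TYPE_MASK ((uintptr_t)0x3)

/* Two pointer-sized words: tagged object pointer plus offset. */
struct sketch_voaddr {
	uintptr_t object;	/* pointer bits | type tag */
	uintptr_t offset;
};

static struct sketch_voaddr
sketch_encode(void *obj, enum sketch_type type, uintptr_t offset)
{
	struct sketch_voaddr v;

	/* The object must be at least 4-byte aligned so the low bits are free. */
	assert(((uintptr_t)obj & SKETCH_TYPE_MASK) == 0);
	assert(((uintptr_t)type & ~SKETCH_TYPE_MASK) == 0);
	v.object = (uintptr_t)obj | (uintptr_t)type;
	v.offset = offset;
	return v;
}

static void *
sketch_object(const struct sketch_voaddr *v)
{
	return (void *)(v->object & ~SKETCH_TYPE_MASK);
}

static enum sketch_type
sketch_type_of(const struct sketch_voaddr *v)
{
	return (enum sketch_type)(v->object & SKETCH_TYPE_MASK);
}
```

The scheme relies on the pointed-to objects being allocated with enough alignment that the tag bits are always zero in a valid pointer.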


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address + address-space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400-fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
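
The deferred, batched state change described above can be sketched as follows. This is a single-threaded illustration with hypothetical names (sketch_page, sketch_cpu_queue and so on); it omits the page interlock and the real per-CPU plumbing, and only shows intent being recorded first and applied later in batches of 128.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BATCH 128	/* queue length taken from the commit message */

enum sketch_state { SKETCH_NONE, SKETCH_ACTIVE, SKETCH_INACTIVE };

struct sketch_page {
	enum sketch_state intended;	/* set immediately by the caller */
	enum sketch_state actual;	/* updated only when the batch is flushed */
};

/* Stand-in for one CPU's private queue of pending state changes. */
struct sketch_cpu_queue {
	struct sketch_page *pages[SKETCH_BATCH];
	size_t count;
	unsigned flushes;
};

/* Apply every queued intent in one pass; this is where the global
 * replacement state would be touched in a real system. */
static void
sketch_flush(struct sketch_cpu_queue *q)
{
	assert(q->count <= SKETCH_BATCH);
	for (size_t i = 0; i < q->count; i++)
		q->pages[i]->actual = q->pages[i]->intended;
	q->count = 0;
	q->flushes++;
}

/* Record the intended state on the page, then queue the page. */
static void
sketch_set_intent(struct sketch_cpu_queue *q, struct sketch_page *pg,
    enum sketch_state state)
{
	pg->intended = state;	/* in the kernel: done under the page interlock */
	if (q->count == SKETCH_BATCH)
		sketch_flush(q);
	q->pages[q->count++] = pg;
}
```

Because the shared state is only touched once per flush, the cost of synchronising on it is amortised over up to 128 page operations.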


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
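
The constraint stated in the last sentence can be sketched as a simple flag check. The flag values and the function name below are hypothetical and chosen only for illustration; they are not the definitions in uvm_extern.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define SKETCH_FLAG_FIXED 0x1u	/* use the caller's hint address as-is */
#define SKETCH_FLAG_UNMAP 0x2u	/* unmap existing entries in the range */

/* UNMAP is only meaningful together with FIXED; FIXED alone is fine. */
static bool
sketch_map_flags_valid(unsigned flags)
{
	assert((flags & ~(SKETCH_FLAG_FIXED | SKETCH_FLAG_UNMAP)) == 0);
	if ((flags & SKETCH_FLAG_UNMAP) != 0 &&
	    (flags & SKETCH_FLAG_FIXED) == 0)
		return false;
	return true;
}
```

Under this split, mremap()-style callers can ask for a fixed address without implicitly destroying whatever is already mapped there.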


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
creating a second range of virtual addresses that references the same
physical pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adopt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable,
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
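
What "color" and "congruent" mean here can be sketched with a little address arithmetic. This is an illustrative sketch only: the 4 KiB page size, the function names and the search strategy are assumptions, not the code from this commit.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Color of an address: its page number reduced by colormask
 * (colormask is the number of colors minus one, a power of two minus one). */
static uintptr_t
sketch_color_of(uintptr_t addr, uintptr_t colormask)
{
	return (addr >> SKETCH_PAGE_SHIFT) & colormask;
}

/* Lowest page-aligned address >= hint whose color equals 'color'. */
static uintptr_t
sketch_next_with_color(uintptr_t hint, uintptr_t color, uintptr_t colormask)
{
	uintptr_t page, delta;

	assert(color <= colormask);
	/* Round the hint up to a page boundary, as a page number. */
	page = (hint + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	/* Pages to skip forward to reach the wanted color (modular distance). */
	delta = (color - (page & colormask)) & colormask;
	return (page + delta) << SKETCH_PAGE_SHIFT;
}
```

If a kernel mapping is placed at an address returned for the color of the first user page, every page in the two mappings shares a color, which is what avoids aliasing in virtually indexed caches.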


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be
allocated so that VA and PA have the same color. On a page fault, choose a
physical page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the
amount of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth to move the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.
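
To illustrate why missing parentheses in a flag-packing macro matter, here is a
minimal sketch in C. The macros PACK_UNSAFE and PACK_SAFE are hypothetical and
are not the real UVM_MAPFLAG definition; they only show how an argument such as
`a | b` interacts with operator precedence inside the macro body.

```c
#include <assert.h>

/* Hypothetical flag-packing macros, NOT the real UVM_MAPFLAG definition. */
#define PACK_UNSAFE(prot, advice)  (prot | advice << 4)
#define PACK_SAFE(prot, advice)    ((prot) | ((advice) << 4))

/* With parenthesized arguments, both fields land where intended. */
static_assert(PACK_SAFE(0x1 | 0x2, 0x1 | 0x2) == 0x33,
    "safe macro packs both fields");

/* Both helpers pass the compound expression (0x1 | 0x2) for each argument. */
unsigned pack_unsafe_demo(void) { return PACK_UNSAFE(0x1 | 0x2, 0x1 | 0x2); }
unsigned pack_safe_demo(void)   { return PACK_SAFE(0x1 | 0x2, 0x1 | 0x2); }
```

In the unsafe form the shift binds only to the trailing `0x2` of the second
argument, so part of that argument leaks into the low bits.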


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
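
The trailing-zero trimming described above can be sketched as follows. This is
an illustration only, assuming a section is available as a byte buffer; the
helper name trimmed_length() is hypothetical and does not come from the
coredump code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the number of leading bytes of buf that must be written so that
 * only trailing zero bytes are omitted. Hypothetical helper for illustration.
 */
size_t trimmed_length(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0 && buf[len - 1] == 0)
        len--;
    return len;
}
```

A section that is entirely zero trims to length 0, which corresponds to the
case of not writing uninstantiated demand-zero pages at all.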


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
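
The "attempt the access and handle the failure" pattern argued for above can be
sketched in userland C. fake_copyin() is a hypothetical stand-in for copyin()
with an injectable error, used only so the example can run outside the kernel;
it is not the kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copyin(): fail with inject_error, else copy. */
static int fake_copyin(const void *src, void *dst, size_t len, int inject_error)
{
    if (inject_error != 0)
        return inject_error;    /* e.g. EFAULT (permissions) or EIO (paging) */
    memcpy(dst, src, len);
    return 0;
}

/*
 * No up-front permission check: just try the copy and propagate any error,
 * so permission failures and i/o errors take the same path.
 */
int read_user_int(const int *uaddr, int *out, int inject_error)
{
    assert(uaddr != NULL && out != NULL);
    return fake_copyin(uaddr, out, sizeof(*out), inject_error);
}
```

The caller sees one error path regardless of why the access failed, which is
the property the commit message relies on.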


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space are advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
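
A round-robin bucket selection of the kind described above might look like the
following sketch. NCOLORS and struct color_state are hypothetical names chosen
for illustration; the constant stands in for the MD-defined bucket count
mentioned in the message.

```c
#include <assert.h>

#define NCOLORS 4   /* hypothetical MD-supplied number of color buckets */

/* Remembers which bucket the next allocation should come from. */
struct color_state {
    unsigned next;
};

/* Return the bucket to use now and advance to the following one, wrapping. */
unsigned next_color(struct color_state *cs)
{
    assert(cs->next < NCOLORS);
    unsigned c = cs->next;
    cs->next = (cs->next + 1) % NCOLORS;
    return c;
}
```

Successive allocations therefore hop through buckets 0, 1, 2, 3 and back to 0,
spreading pages evenly across cache colors.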


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
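
The mapping above can be written as a small translation function. This is a
sketch only: the OLD_KERN_* enumerators are hypothetical names with
illustrative numeric values, and returning -1 for codes without a single errno
equivalent is a convention invented for this example.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the retired codes; values are not authoritative. */
enum old_kern_code {
    OLD_KERN_SUCCESS,
    OLD_KERN_INVALID_ADDRESS,
    OLD_KERN_PROTECTION_FAILURE,
    OLD_KERN_NO_SPACE,
    OLD_KERN_INVALID_ARGUMENT,
    OLD_KERN_FAILURE,
    OLD_KERN_RESOURCE_SHORTAGE
};

/* Success is 0 in both schemes. */
static_assert(OLD_KERN_SUCCESS == 0, "success must map to 0");

/* Translate per the table in the commit message; -1 means "no single errno". */
int kern_to_errno(enum old_kern_code code)
{
    switch (code) {
    case OLD_KERN_SUCCESS:            return 0;
    case OLD_KERN_INVALID_ADDRESS:    return EFAULT;
    case OLD_KERN_PROTECTION_FAILURE: return EACCES;
    case OLD_KERN_NO_SPACE:           return ENOMEM;
    case OLD_KERN_INVALID_ARGUMENT:   return EINVAL;
    case OLD_KERN_RESOURCE_SHORTAGE:  return ENOMEM;
    case OLD_KERN_FAILURE:            break;  /* "various" per the table */
    }
    return -1;
}
```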


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
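
The two rules described above (the 95% cap on the sum of the minimums, and the
refusal to reuse a page when that would drop its type below the minimum) can be
sketched as follows. The function names and the exact comparison are
assumptions made for illustration, not the pagedaemon's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* True if the three minimum percentages leave at least 5% always reusable. */
bool minimums_ok(unsigned anonmin, unsigned vnodemin, unsigned vtextmin)
{
    return anonmin + vnodemin + vtextmin <= 95;
}

/*
 * True if taking one page of a given type still leaves that type at or above
 * min_pct percent of pageable memory. Hypothetical formulation.
 */
bool may_reclaim(unsigned long type_pages, unsigned long pageable_pages,
    unsigned min_pct)
{
    assert(min_pct <= 100);
    if (type_pages == 0)
        return false;
    return (type_pages - 1) * 100 >= pageable_pages * min_pct;
}
```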


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
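
For reference, rounding an address up to a power-of-two alignment is commonly
done as in the sketch below. align_up() is a hypothetical helper written for
this note; it is not the code uvm_map() uses, only the arithmetic such an
align argument implies.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align, which must be a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged, and an alignment of
1 is a no-op.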


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.
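
The stopping rule for the idle-loop zeroing described above can be sketched in
userland C. Everything here is an assumption for illustration: pages are
modelled as small byte arrays, SKETCH_PAGE_SIZE is made up, and the whichqs
parameter is a plain integer standing in for the scheduler's run-queue
indicator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 64   /* hypothetical, tiny "page" for the example */

/*
 * Zero up to `target` pages, stopping early as soon as *whichqs becomes
 * non-zero (meaning a process is ready to run). Returns pages zeroed.
 */
size_t idle_zero(unsigned char pages[][SKETCH_PAGE_SIZE], size_t target,
    const volatile int *whichqs)
{
    size_t done = 0;

    assert(pages != NULL && whichqs != NULL);
    while (done < target && *whichqs == 0) {
        memset(pages[done], 0, SKETCH_PAGE_SIZE);
        done++;
    }
    return done;
}
```

Checking the run-queue indicator before each page keeps the latency added to a
newly runnable process to at most one page's worth of zeroing.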


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
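
The sizing rule described above (one quarter of physical memory, then clamped
by machine-dependent bounds) amounts to the following sketch. The function and
parameter names are hypothetical; min_pages and max_pages stand in for the
NKMEMPAGES_MIN and NKMEMPAGES_MAX bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Take physpages / 4, then clamp the result into [min_pages, max_pages]. */
size_t size_kmem_pages(size_t physpages, size_t min_pages, size_t max_pages)
{
    assert(min_pages <= max_pages);

    size_t n = physpages / 4;
    if (n < min_pages)
        n = min_pages;
    if (n > max_pages)
        n = max_pages;
    return n;
}
```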


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
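
The three strategies above can be sketched in a small, self-contained
model. Everything here is hypothetical (the names, the array of
per-list page counts, and the assumption that a lower list index means
higher priority); it is not the kernel's uvm_pagealloc_strat().

```c
#include <assert.h>

#define SKETCH_NFREELIST 3	/* stands in for VM_NFREELIST */

enum sketch_strat {
	SKETCH_STRAT_NORMAL,	/* walk the lists, high -> low priority */
	SKETCH_STRAT_ONLY,	/* use the named list or fail */
	SKETCH_STRAT_FALLBACK	/* try the named list, then walk */
};

/*
 * free_pages[i] is the number of pages left on free list i.
 * Returns the index of the list a page was taken from, or -1 if no
 * page could be allocated under the requested strategy.
 */
static int
sketch_pagealloc_strat(int free_pages[SKETCH_NFREELIST],
    enum sketch_strat strat, int free_list)
{
	int i;

	if (strat != SKETCH_STRAT_NORMAL) {
		assert(free_list >= 0 && free_list < SKETCH_NFREELIST);
		if (free_pages[free_list] > 0) {
			free_pages[free_list]--;
			return free_list;
		}
		if (strat == SKETCH_STRAT_ONLY)
			return -1;
		/* fallback: continue with the normal walk below */
	}

	/* normal: free_list is ignored; take the first available page */
	for (i = 0; i < SKETCH_NFREELIST; i++) {
		if (free_pages[i] > 0) {
			free_pages[i]--;
			return i;
		}
	}
	return -1;
}
```

With page counts {0, 2, 1}, a normal request is served from list 1, an
`only' request for list 0 fails, and a `fallback' request for list 0
falls through to list 1.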


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pagouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.232 31-May-2021 riastradh

uvm: Make uvm_extern.h (more) self-contained, needs sys/types.h.


Revision tags: cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.231 14-Aug-2020 chs

centralize calls from UVM to radixtree into a few functions.
in those functions, assert that the object lock is held in
the correct mode.


# 1.230 14-Jun-2020 ad

g/c vm_page_zero_enable


# 1.229 13-Jun-2020 ad

uvm_pagerealloc(): resurrect the insertion case.


# 1.228 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.227 26-May-2020 kamil

Catch up with the usage of struct vmspace::vm_refcnt

Use the dedicated reference counting routines.

Change the type of struct vmspace::vm_refcnt and struct vm_map::ref_count
to volatile.

Remove the unnecessary vm->vm_map.misc_lock locking in process_domem().

Reviewed by <ad>


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
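
The packing idea can be sketched as follows. This is an illustration
under assumptions, not the actual struct uvm_voaddr: the type values,
field names and helper names are hypothetical, and the sketch relies
on object pointers being at least 4-byte aligned so that the low two
bits are free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; they must fit in the low two pointer bits. */
enum sketch_voaddr_type {
	SKETCH_TYPE_INVALID = 0,
	SKETCH_TYPE_UOBJ = 1,
	SKETCH_TYPE_ANON = 2
};

#define SKETCH_TYPE_MASK	((uintptr_t)0x3)

struct sketch_voaddr {
	uintptr_t object;	/* object pointer | type in low bits */
	uintptr_t offset;	/* offset within the object */
};

static void
sketch_voaddr_set(struct sketch_voaddr *v, void *obj, uintptr_t off,
    enum sketch_voaddr_type type)
{
	/* The low bits must be clear, or the tag would corrupt the pointer. */
	assert(((uintptr_t)obj & SKETCH_TYPE_MASK) == 0);
	v->object = (uintptr_t)obj | (uintptr_t)type;
	v->offset = off;
}

static enum sketch_voaddr_type
sketch_voaddr_type(const struct sketch_voaddr *v)
{
	return (enum sketch_voaddr_type)(v->object & SKETCH_TYPE_MASK);
}

static void *
sketch_voaddr_object(const struct sketch_voaddr *v)
{
	return (void *)(v->object & ~SKETCH_TYPE_MASK);
}
```

Masking on every access costs a single AND, in exchange for dropping a
separate type field from the structure.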


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
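
The constraint between the two flags can be sketched as a validity
check. The bit values and the function name below are hypothetical
stand-ins, not the real UVM_FLAG_* definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values standing in for UVM_FLAG_FIXED / UVM_FLAG_UNMAP. */
#define SKETCH_FLAG_FIXED	0x1u
#define SKETCH_FLAG_UNMAP	0x2u

static_assert((SKETCH_FLAG_FIXED & SKETCH_FLAG_UNMAP) == 0,
    "flags must be distinct bits");

/*
 * FIXED alone: use the caller's hint address instead of searching.
 * FIXED|UNMAP: additionally unmap whatever already occupies the range.
 * UNMAP without FIXED is rejected.
 */
static bool
sketch_map_flags_valid(unsigned int flags)
{
	if ((flags & SKETCH_FLAG_UNMAP) != 0 &&
	    (flags & SKETCH_FLAG_FIXED) == 0)
		return false;
	return true;
}
```

Under this sketch, mmap()-style MAP_FIXED would correspond to passing
both flags, while a request that must fail on an occupied range would
pass only the FIXED flag.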


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
creating a second range of virtual addresses that references the same
physical pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable,
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits execution of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture, the
MAP_NOSYSCALLS flag is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
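
A rough sketch of what "color matching" means for an address: the
color is taken here to be the page number masked by colormask, and the
allocator picks the first page at or above a hint whose color equals
the requested one. The page shift, names and search strategy are
assumptions for illustration only, not the uvm_map() implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumes 4 KB pages */
#define SKETCH_PAGE_SIZE	((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Color of an address: its page number modulo the number of colors. */
static uintptr_t
sketch_color_of(uintptr_t addr, uintptr_t colormask)
{
	return (addr >> SKETCH_PAGE_SHIFT) & colormask;
}

/*
 * Return the lowest page-aligned address >= hint whose color is
 * `color'. colormask must be (number of colors - 1), with the number
 * of colors a power of two.
 */
static uintptr_t
sketch_colormatch(uintptr_t hint, uintptr_t color, uintptr_t colormask)
{
	uintptr_t page, delta;

	assert((colormask & (colormask + 1)) == 0);
	assert(color <= colormask);

	/* Round the hint up to a page boundary, as a page number. */
	page = (hint + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	/* Pages to skip until the color wraps around to the one wanted. */
	delta = (color - (page & colormask)) & colormask;
	return (page + delta) << SKETCH_PAGE_SHIFT;
}
```

If the requested color is the color of the first user page, the kernel
address returned by such a search shares its color with the user
mapping, which is the congruence the commit message refers to.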


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via
direct-mapped addresses and avoid the problem of having the kernel stack
not in the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be
allocated so that VA and PA have the same color. On a page fault, choose a
physical page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce
the amount of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage.


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
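The allocation preference described in the last item can be modelled roughly as follows. This is a simplified sketch of the policy only, not the kernel's data structures: `struct page`, `page_free()` and `page_alloc()` are hypothetical names, and a flat array stands in for the real per-CPU and global free lists.

```c
#include <assert.h>

/* Hypothetical model: each free page remembers which CPU freed it. */
struct page {
    int is_free;    /* nonzero while the page is on a free list */
    int cpu;        /* CPU that last freed the page */
};

/* Free a page on behalf of the given CPU. */
static void
page_free(struct page *pages, int idx, int cpu)
{
    pages[idx].is_free = 1;
    pages[idx].cpu = cpu;
}

/*
 * Allocate a page for the given CPU. A page freed by the same CPU is
 * preferred; otherwise the first free page found anywhere is used.
 * Returns the page index, or -1 if nothing is free.
 */
static int
page_alloc(struct page *pages, int npages, int cpu)
{
    int fallback = -1;

    for (int i = 0; i < npages; i++) {
        if (!pages[i].is_free)
            continue;
        if (pages[i].cpu == cpu) {
            pages[i].is_free = 0;
            return i;
        }
        if (fallback < 0)
            fallback = i;   /* remember a page from the "global" pool */
    }
    if (fallback >= 0)
        pages[fallback].is_free = 0;
    return fallback;
}
```

The point of the preference is that a page recently freed on a CPU is more likely to still be warm in that CPU's cache.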


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
this giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
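A round-robin bucket selection of the kind named above might look like the sketch below. It is illustrative only: `NCOLORS`, `struct color_bins` and `bin_hop_alloc()` are hypothetical names, and the buckets are reduced to counters of free pages rather than real page queues.

```c
#include <assert.h>

#define NCOLORS 4   /* hypothetical; the log says MD code supplied the real constant */

struct color_bins {
    unsigned free_pages[NCOLORS];   /* free pages available in each color bucket */
    unsigned next_color;            /* bucket to try first on the next allocation */
};

/*
 * Take one page from the next non-empty bucket, starting at next_color
 * and wrapping around. Returns the color used, or -1 if all buckets are
 * empty. After a successful allocation the starting point advances, so
 * consecutive allocations "hop" across colors.
 */
static int
bin_hop_alloc(struct color_bins *b)
{
    for (unsigned i = 0; i < NCOLORS; i++) {
        unsigned c = (b->next_color + i) % NCOLORS;

        if (b->free_pages[c] > 0) {
            b->free_pages[c]--;
            b->next_color = (c + 1) % NCOLORS;
            return (int)c;
        }
    }
    return -1;
}
```

Spreading consecutive allocations over different colors is meant to reduce the chance that pages used together collide in a physically indexed cache.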


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
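
For code that still has to translate old return values, the one-to-one rows of this mapping could be expressed as in the sketch below. The enum and function names are hypothetical, the enum's numeric values are illustrative (the log does not give them), and KERN_FAILURE is left out because it had no single replacement.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are illustrative. */
enum legacy_kern_code {
    LEGACY_KERN_SUCCESS,
    LEGACY_KERN_INVALID_ADDRESS,
    LEGACY_KERN_PROTECTION_FAILURE,
    LEGACY_KERN_NO_SPACE,
    LEGACY_KERN_INVALID_ARGUMENT,
    LEGACY_KERN_RESOURCE_SHORTAGE
};

/* Translate a legacy code to the errno value given in the mapping. */
static int
legacy_kern_to_errno(enum legacy_kern_code code)
{
    switch (code) {
    case LEGACY_KERN_SUCCESS:
        return 0;
    case LEGACY_KERN_INVALID_ADDRESS:
        return EFAULT;
    case LEGACY_KERN_PROTECTION_FAILURE:
        return EACCES;
    case LEGACY_KERN_NO_SPACE:
    case LEGACY_KERN_RESOURCE_SHORTAGE:
        return ENOMEM;
    case LEGACY_KERN_INVALID_ARGUMENT:
        return EINVAL;
    }
    /* Out-of-range input: EINVAL is this sketch's choice, not the commit's. */
    return EINVAL;
}
```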


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
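
The 95% rule could be checked along the lines of the sketch below. The struct and function names are hypothetical, and the sketch assumes each field holds a percentage of pageable memory as described above; how the kernel validates a sysctl write is not shown in this log.

```c
#include <assert.h>

#define MIN_SUM_LIMIT 95    /* the sum of the minimums may not exceed 95% */

/* Hypothetical container for the three tunables, each a percentage. */
struct ubc_mins {
    unsigned anonmin;   /* vm.anonmin */
    unsigned vnodemin;  /* vm.vnodemin */
    unsigned vtextmin;  /* vm.vtextmin */
};

/*
 * Return nonzero if the proposed minimums leave at least 5% of pageable
 * memory unreserved, zero if the combination should be rejected.
 */
static int
ubc_mins_acceptable(const struct ubc_mins *m)
{
    return m->anonmin + m->vnodemin + m->vtextmin <= MIN_SUM_LIMIT;
}
```

Rejecting a combination above the limit keeps some memory outside every reservation, so the pagedaemon always has pages it is allowed to reuse.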


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
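
A minimum power-of-two alignment is usually applied with the mask arithmetic sketched below. This is a generic illustration, not the code inside uvm_map(); `is_pow2()` and `align_up()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if align is a nonzero power of two. */
static int
is_pow2(uintptr_t align)
{
    return align != 0 && (align & (align - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align. The mask trick is only
 * valid when align is a power of two, hence the assertion.
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
    assert(is_pow2(align));
    return (addr + align - 1) & ~(align - 1);
}
```

For example, rounding 0x1001 up to a 0x1000 boundary yields 0x2000, while an address already on the boundary is returned unchanged.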


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
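
The sizing rule described above (physical memory divided by 4, then
clamped to machine-dependent bounds) can be sketched as follows. The
function name size_kmem_pages() and its parameters are hypothetical
stand-ins for illustration; they are not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: take a quarter of physical memory (in pages)
 * and clamp the result to [min_pages, max_pages], in the way the
 * NKMEMPAGES_MIN and NKMEMPAGES_MAX options are described as bounds.
 */
static size_t
size_kmem_pages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);

	n = physpages / 4;
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```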


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
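
The three allocation strategies described in this commit (normal, only,
fallback) can be sketched as a selection over per-list page counts. This
is a standalone model, not the uvm_pagealloc_strat() implementation: the
names NFREELIST, enum strat and pick_freelist() are hypothetical, and
treating index 0 as the highest-priority list is an assumption.

```c
#include <assert.h>

#define NFREELIST 3	/* hypothetical; stands in for VM_NFREELIST */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * freecount[i] is the number of free pages on list i; index 0 is
 * assumed to be the highest priority. Takes one page and returns the
 * index of the list it came from, or -1 if no page could be taken.
 */
static int
pick_freelist(int freecount[NFREELIST], enum strat strat, int free_list)
{
	int i;

	if (strat == STRAT_ONLY || strat == STRAT_FALLBACK) {
		assert(free_list >= 0 && free_list < NFREELIST);
		if (freecount[free_list] > 0) {
			freecount[free_list]--;
			return free_list;
		}
		/* `only' fails here; `fallback' continues as `normal'. */
		if (strat == STRAT_ONLY)
			return -1;
	}

	/* `normal': walk from high to low priority, take the first hit. */
	for (i = 0; i < NFREELIST; i++) {
		if (freecount[i] > 0) {
			freecount[i]--;
			return i;
		}
	}
	return -1;
}
```

As in the description above, the free_list argument is ignored for the
`normal' case.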


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.231 14-Aug-2020 chs

centralize calls from UVM to radixtree into a few functions.
in those functions, assert that the object lock is held in
the correct mode.


# 1.230 14-Jun-2020 ad

g/c vm_page_zero_enable


# 1.229 13-Jun-2020 ad

uvm_pagerealloc(): resurrect the insertion case.


# 1.228 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.227 26-May-2020 kamil

Catch up with the usage of struct vmspace::vm_refcnt

Use the dedicated reference counting routines.

Change the type of struct vmspace::vm_refcnt and struct vm_map::ref_count
to volatile.

Remove the unnecessary vm->vm_map.misc_lock locking in process_domem().

Reviewed by <ad>


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
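
The general technique of encoding a type in the low bits of an aligned
pointer can be sketched as below. This is a generic illustration, not
the actual uvm_voaddr layout: the mask width, the type values and the
function names are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK	((uintptr_t)0x3)	/* hypothetical: two low bits */

enum obj_type { TYPE_UOBJ = 1, TYPE_ANON = 2 };	/* hypothetical values */

/* Pack a type into the low bits of a suitably aligned object pointer. */
static uintptr_t
tag_pack(void *obj, enum obj_type type)
{
	uintptr_t p = (uintptr_t)obj;

	/* The low bits must be free, i.e. at least 4-byte alignment. */
	assert((p & TAG_MASK) == 0);
	return p | (uintptr_t)type;
}

/* Recover the original pointer by masking off the tag bits. */
static void *
tag_object(uintptr_t v)
{
	return (void *)(v & ~TAG_MASK);
}

/* Recover the type stored in the tag bits. */
static enum obj_type
tag_type(uintptr_t v)
{
	return (enum obj_type)(v & TAG_MASK);
}
```

The saving comes from not needing a separate type field next to the
pointer; the cost is that every use of the pointer must mask first.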


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
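
The constraint that the unmap flag is only valid together with the fixed
flag can be expressed as a small validity check. The flag names and bit
values below are hypothetical placeholders chosen for this sketch; they
are not the real UVM_FLAG_FIXED and UVM_FLAG_UNMAP definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define SKETCH_FLAG_FIXED	0x1u	/* use the caller's hint address */
#define SKETCH_FLAG_UNMAP	0x2u	/* unmap existing entries in range */

/*
 * Returns true if the flag combination is allowed: UNMAP may only be
 * requested when FIXED is also set.
 */
static bool
map_flags_valid(unsigned flags)
{
	if ((flags & SKETCH_FLAG_UNMAP) != 0 &&
	    (flags & SKETCH_FLAG_FIXED) == 0)
		return false;
	return true;
}
```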


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses that references the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable,
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
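
As a rough illustration of what "color" and "congruent" mean here, the
sketch below computes the color of a virtual address and finds the next
page-aligned address with a wanted color. The 4 KB page size, the
SKETCH_ names and both functions are assumptions made for this example;
they are not taken from the UVM sources.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumed 4 KB pages */
#define SKETCH_PAGE_SIZE	((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Color of an address: its page number masked to 0..colormask. */
static unsigned
va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> SKETCH_PAGE_SHIFT) & colormask;
}

/*
 * Return the first page-aligned address at or above hint whose color
 * equals want. colormask is assumed to be of the form 2^n - 1.
 */
static uintptr_t
color_match(uintptr_t hint, unsigned want, unsigned colormask)
{
	uintptr_t va;

	assert(want <= colormask);	/* guarantees the loop terminates */

	/* Round the hint up to a page boundary. */
	va = (hint + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	while (va_color(va, colormask) != want)
		va += SKETCH_PAGE_SIZE;
	return va;
}
```

Two mappings of the same page are congruent in this sense when
va_color() returns the same value for both addresses.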


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.
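
The color-matching rule described above can be illustrated with a small userland sketch. It assumes the common definition of a page's color (page-frame number modulo a power-of-two number of colors) and 4 KiB pages; the names sketch_color_of() and sketch_pick_page() are hypothetical and are not the kernel's own functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant for illustration; the real value is MD-specific. */
#define SKETCH_PAGE_SHIFT 12u	/* assume 4 KiB pages */

/* Color of an address: page-frame number modulo the number of colors. */
static unsigned
sketch_color_of(uintptr_t addr, unsigned ncolors)
{
	/* This sketch assumes ncolors is a non-zero power of two. */
	assert(ncolors != 0 && (ncolors & (ncolors - 1)) == 0);
	return (unsigned)(addr >> SKETCH_PAGE_SHIFT) & (ncolors - 1);
}

/*
 * Pick the first free physical page whose color matches the faulting VA.
 * Returns its index in free_pa[], or -1 if no page of that color is free.
 */
static int
sketch_pick_page(const uintptr_t *free_pa, size_t nfree,
    uintptr_t va, unsigned ncolors)
{
	unsigned want = sketch_color_of(va, ncolors);

	for (size_t i = 0; i < nfree; i++) {
		if (sketch_color_of(free_pa[i], ncolors) == want)
			return (int)i;
	}
	return -1;
}
```

A real allocator would fall back to another color or another free list rather than fail; the -1 return here only marks the point where that fallback would start.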


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
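
A rough userland model of the per-CPU preference described above, under simplifying assumptions: pages live in a fixed array instead of linked lists, and each free page records the CPU that freed it, which stands in for membership of that CPU's local list. All names (sketch_page, sketch_free, sketch_alloc) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NPAGES 8

struct sketch_page {
	bool free;	/* page is on the global free list */
	int  cpu;	/* CPU whose local list also holds it while free */
};

static struct sketch_page sketch_pages[SKETCH_NPAGES];

/* Free: the page becomes visible on the freeing CPU's list and globally. */
static void
sketch_free(size_t pg, int cpu)
{
	assert(pg < SKETCH_NPAGES && !sketch_pages[pg].free);
	sketch_pages[pg].free = true;
	sketch_pages[pg].cpu = cpu;
}

/*
 * Allocate: prefer a page freed on this CPU; if there is none, fall back
 * to any free page. Returns the page index, or -1 if nothing is free.
 */
static int
sketch_alloc(int cpu)
{
	int fallback = -1;

	for (size_t i = 0; i < SKETCH_NPAGES; i++) {
		if (!sketch_pages[i].free)
			continue;
		if (sketch_pages[i].cpu == cpu) {
			sketch_pages[i].free = false;
			return (int)i;
		}
		if (fallback < 0)
			fallback = (int)i;
	}
	if (fallback >= 0)
		sketch_pages[fallback].free = false;
	return fallback;
}
```

The model omits locking and the actual list manipulation; it only shows the order in which the two sources are consulted.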


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
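
A minimal sketch of round-robin bucket selection as described above, not the
committed UVM code. The names sketch_freelist, sketch_pick_color and
SKETCH_NCOLORS are hypothetical, and the bucket count is a made-up constant
standing in for the MD-defined one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the MD-defined number of color buckets. */
#define SKETCH_NCOLORS 4

struct sketch_freelist {
	size_t npages[SKETCH_NCOLORS];	/* free pages available per color */
	unsigned next_color;		/* bucket where the next search starts */
};

/*
 * Take one page, starting the search at next_color and skipping empty
 * buckets.  The starting point advances past the bucket that was used,
 * so successive allocations rotate through the colors.
 * Returns the color used, or -1 if every bucket is empty.
 */
static int
sketch_pick_color(struct sketch_freelist *fl)
{
	assert(fl != NULL);

	for (unsigned i = 0; i < SKETCH_NCOLORS; i++) {
		unsigned c = (fl->next_color + i) % SKETCH_NCOLORS;

		if (fl->npages[c] > 0) {
			fl->npages[c]--;
			fl->next_color = (c + 1) % SKETCH_NCOLORS;
			return (int)c;
		}
	}
	return -1;
}
```

Advancing the start point after every allocation is what spreads consecutive
pages across cache colors without needing to know anything about the caller.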


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
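
The mapping above can be sketched as a translation function. This is purely
illustrative: the SK_KERN_* enumerators are hypothetical stand-ins whose
numeric values are assumptions, and KERN_FAILURE and the unused codes are
left out because the commit gives them no single errno.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are assumptions. */
enum sketch_kern {
	SK_KERN_SUCCESS,
	SK_KERN_INVALID_ADDRESS,
	SK_KERN_PROTECTION_FAILURE,
	SK_KERN_NO_SPACE,
	SK_KERN_INVALID_ARGUMENT,
	SK_KERN_RESOURCE_SHORTAGE
};

/* Translate one retired code into the traditional errno listed above. */
static int
sketch_kern_to_errno(enum sketch_kern k)
{
	switch (k) {
	case SK_KERN_SUCCESS:
		return 0;
	case SK_KERN_INVALID_ADDRESS:
		return EFAULT;
	case SK_KERN_PROTECTION_FAILURE:
		return EACCES;
	case SK_KERN_NO_SPACE:
	case SK_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case SK_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Not reached for valid enumerators. */
	assert(!"unmapped code");
	return EINVAL;
}
```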


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
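
A rough sketch of the two checks described above, assuming percentages are
plain integers. The function names are hypothetical and the real sysctl
handlers and pagedaemon logic may be structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Validate a proposed set of minimums (percent of pageable memory).
 * The sum may not exceed 95 so that some memory is always reusable.
 */
static int
sketch_check_mins(int anonmin, int vnodemin, int vtextmin)
{
	if (anonmin < 0 || vnodemin < 0 || vtextmin < 0)
		return EINVAL;
	if (anonmin + vnodemin + vtextmin > 95)
		return EINVAL;
	return 0;
}

/*
 * May the pagedaemon reuse one page of a given type?  Only if doing so
 * leaves that type's page count at or above its reserved minimum.
 */
static bool
sketch_may_reuse(unsigned long typepages, unsigned long pageable, int minpct)
{
	assert(minpct >= 0 && minpct <= 100);

	unsigned long reserved = pageable * (unsigned long)minpct / 100;

	/* typepages > reserved is the same as typepages - 1 >= reserved. */
	return typepages > reserved;
}
```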


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. This is
beneficial for the MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that zeroing of the page was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
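
For illustration only, rounding a hint address up to a power-of-two
alignment can be sketched as below. sketch_align_up is a hypothetical helper
and says nothing about how uvm_map() itself applies the align argument; the
treatment of 0 as "no constraint" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align, where align must be a
 * power of two.  An align of 0 is treated here as "no constraint".
 */
static uintptr_t
sketch_align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;

	/* A power of two has exactly one bit set. */
	assert((align & (align - 1)) == 0);

	return (addr + align - 1) & ~(align - 1);
}
```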


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
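
The sizing rule above amounts to a clamp. A minimal sketch, assuming the
bounds arrive as plain page counts; sketch_nkmempages and its parameters are
hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Size the kmem map in pages: one quarter of physical memory, clamped
 * to the [lo, hi] range supplied by machine-dependent code or by the
 * kernel config file.
 */
static size_t
sketch_nkmempages(size_t physpages, size_t lo, size_t hi)
{
	assert(lo <= hi);

	size_t n = physpages / 4;

	if (n < lo)
		n = lo;
	if (n > hi)
		n = hi;
	return n;
}
```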


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
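
The three strategies can be sketched over an array of per-list free counts.
This is an illustration under assumptions: the SKETCH_* names are
hypothetical, a page is modeled as a counter decrement, and index 0 is
taken to be the highest-priority list.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NFREELIST 3	/* hypothetical stand-in for VM_NFREELIST */

enum sketch_strat {
	SKETCH_STRAT_NORMAL,
	SKETCH_STRAT_ONLY,
	SKETCH_STRAT_FALLBACK
};

/*
 * Take one page according to strat.  Returns the index of the free list
 * the page came from, or -1 if the allocation fails.  free_list is
 * ignored for the normal strategy.
 */
static int
sketch_pagealloc_strat(size_t nfree[SKETCH_NFREELIST],
    enum sketch_strat strat, int free_list)
{
	if (strat != SKETCH_STRAT_NORMAL) {
		assert(free_list >= 0 && free_list < SKETCH_NFREELIST);

		/* only/fallback: try the requested list first. */
		if (nfree[free_list] > 0) {
			nfree[free_list]--;
			return free_list;
		}
		if (strat == SKETCH_STRAT_ONLY)
			return -1;
		/* fallback: continue with the normal walk below. */
	}

	/* normal: walk from high to low priority, take the first hit. */
	for (int i = 0; i < SKETCH_NFREELIST; i++) {
		if (nfree[i] > 0) {
			nfree[i]--;
			return i;
		}
	}
	return -1;
}
```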

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.230 14-Jun-2020 ad

g/c vm_page_zero_enable


# 1.229 13-Jun-2020 ad

uvm_pagerealloc(): resurrect the insertion case.


# 1.228 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.227 26-May-2020 kamil

Catch up with the usage of struct vmspace::vm_refcnt

Use the dedicated reference counting routines.

Change the type of struct vmspace::vm_refcnt and struct vm_map::ref_count
to volatile.

Remove the unnecessary vm->vm_map.misc_lock locking in process_domem().

Reviewed by <ad>


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
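
A generic sketch of the pointer-tagging technique described above, assuming
the referenced objects are at least 4-byte aligned so the two low bits are
free. The struct layout, type values and function names here are
hypothetical and are not the actual uvm_voaddr definition.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TYPE_MASK	((uintptr_t)0x3)

enum sketch_type {
	SKETCH_TYPE_UOBJ = 1,
	SKETCH_TYPE_ANON = 2
};

/* Two pointer-sized words: a tagged object pointer and an offset. */
struct sketch_voaddr {
	uintptr_t object;	/* object pointer with type in low bits */
	uintptr_t offset;
};

static void
sketch_voaddr_set(struct sketch_voaddr *va, void *obj,
    enum sketch_type type, uintptr_t offset)
{
	uintptr_t p = (uintptr_t)obj;

	/* The low bits must be free, and the type must fit in them. */
	assert((p & SKETCH_TYPE_MASK) == 0);
	assert(((uintptr_t)type & ~SKETCH_TYPE_MASK) == 0);

	va->object = p | (uintptr_t)type;
	va->offset = offset;
}

static enum sketch_type
sketch_voaddr_type(const struct sketch_voaddr *va)
{
	return (enum sketch_type)(va->object & SKETCH_TYPE_MASK);
}

static void *
sketch_voaddr_ptr(const struct sketch_voaddr *va)
{
	return (void *)(va->object & ~SKETCH_TYPE_MASK);
}
```

The saving comes from not storing the type in a separate field, which would
otherwise be padded out to a third word.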


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
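
The flag constraint in the last sentence can be sketched as a simple check.
The SKETCH_FLAG_* bit values and the function are hypothetical, and
returning EINVAL is an assumption of this sketch; the kernel may well
enforce the rule with an assertion instead.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bit values, for illustration only. */
#define SKETCH_FLAG_FIXED	0x1u
#define SKETCH_FLAG_UNMAP	0x2u

/*
 * UNMAP (replace any existing entries in the range) only makes sense
 * when the caller also pins the address with FIXED.
 */
static int
sketch_check_map_flags(unsigned flags)
{
	/* The two flags must be distinct bits for the test to mean anything. */
	assert((SKETCH_FLAG_FIXED & SKETCH_FLAG_UNMAP) == 0);

	if ((flags & SKETCH_FLAG_UNMAP) != 0 &&
	    (flags & SKETCH_FLAG_FIXED) == 0)
		return EINVAL;
	return 0;
}
```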


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This is in conformance of its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
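
The color rule described above can be sketched in a few lines of C. This is
only an illustration, assuming 4 KiB pages and a power-of-two number of
colors; SKETCH_PAGE_SHIFT, page_color() and next_va_with_color() are
hypothetical names, not UVM interfaces.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/* Color of the page containing addr: the page number masked by colormask. */
static inline uintptr_t
page_color(uintptr_t addr, uintptr_t colormask)
{
	return (addr >> SKETCH_PAGE_SHIFT) & colormask;
}

/*
 * Lowest page-aligned address >= hint whose page has the requested color.
 * colormask must be (number of colors - 1), with a power-of-two color count.
 */
static inline uintptr_t
next_va_with_color(uintptr_t hint, uintptr_t color, uintptr_t colormask)
{
	uintptr_t pagesz = (uintptr_t)1 << SKETCH_PAGE_SHIFT;
	uintptr_t va = (hint + pagesz - 1) & ~(pagesz - 1);	/* round up */
	/* Number of pages to skip until the colors agree (modulo colors). */
	uintptr_t delta = (color - page_color(va, colormask)) & colormask;

	return va + (delta << SKETCH_PAGE_SHIFT);
}
```

If a kernel mapping is placed at an address chosen this way, using the color
of the first user page, each kernel page ends up with the same color as the
corresponding user page.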


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
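
The free/allocate policy described above can be sketched as follows. This is
a simplified illustration, not the UVM code: it keeps two separate link
fields instead of a union, ignores locking and page colors, and all
sketch_* names are hypothetical. It relies only on the LIST macros from
<sys/queue.h>.

```c
#include <stddef.h>
#include <sys/queue.h>

#define SKETCH_NCPU 2

struct sketch_page {
	LIST_ENTRY(sketch_page) cpu_link;	/* on one CPU's free list */
	LIST_ENTRY(sketch_page) global_link;	/* on the global free list */
};

LIST_HEAD(sketch_pglist, sketch_page);

struct sketch_freelists {
	struct sketch_pglist percpu[SKETCH_NCPU];
	struct sketch_pglist global;
};

static inline void
sketch_init(struct sketch_freelists *fl)
{
	for (int i = 0; i < SKETCH_NCPU; i++)
		LIST_INIT(&fl->percpu[i]);
	LIST_INIT(&fl->global);
}

/* Freeing: the page goes onto the local CPU's list and the global list. */
static inline void
sketch_free(struct sketch_freelists *fl, int cpu, struct sketch_page *pg)
{
	LIST_INSERT_HEAD(&fl->percpu[cpu], pg, cpu_link);
	LIST_INSERT_HEAD(&fl->global, pg, global_link);
}

/* Allocating: prefer the local CPU's list, else fall back to the global one. */
static inline struct sketch_page *
sketch_alloc(struct sketch_freelists *fl, int cpu)
{
	struct sketch_page *pg = fl->percpu[cpu].lh_first;

	if (pg == NULL)
		pg = fl->global.lh_first;
	if (pg != NULL) {
		/* LIST_REMOVE needs no list head, so both unlinks are O(1). */
		LIST_REMOVE(pg, cpu_link);
		LIST_REMOVE(pg, global_link);
	}
	return pg;
}
```

Because a LIST element can be unlinked without knowing which head it hangs
from, a page taken through the global fallback also disappears from the
list of whichever CPU freed it.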


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
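
As a rough illustration of what "N bits" of randomization means, the sketch
below turns a random value into a page-granular offset. It assumes 4 KiB
pages and assumes that the offset is applied in whole pages; neither is
stated in this log, and aslr_delta() is a hypothetical helper, not the
PAX_ASLR implementation.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/*
 * Keep the low `bits` bits of entropy and scale them to whole pages.
 * With bits = 12 this yields one of 4096 page-aligned offsets.
 * bits = 24 only makes sense where uintptr_t is 64 bits wide.
 */
static inline uintptr_t
aslr_delta(uint32_t entropy, unsigned bits)
{
	uintptr_t mask = ((uintptr_t)1 << bits) - 1;

	return ((uintptr_t)entropy & mask) << SKETCH_PAGE_SHIFT;
}
```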


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.
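
The two-pass scheme with an opaque cookie can be sketched like this. The
sketch_* names, the buffer-backed cookie and the 8-byte "register" blob are
all hypothetical stand-ins chosen for illustration; they are not the kernel's
coredump_write() or cpu_coredump() interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the opaque i/o cookie: appends to a buffer. */
struct sketch_iostate {
	unsigned char	buf[64];
	size_t		off;		/* only ever grows: stream-like */
};

/* Hypothetical stand-in for the write routine: sequential append only. */
static inline int
sketch_write(struct sketch_iostate *io, const void *data, size_t len)
{
	if (len > sizeof(io->buf) - io->off)
		return -1;
	memcpy(io->buf + io->off, data, len);
	io->off += len;
	return 0;
}

/*
 * Hypothetical MD hook.  With io == NULL it only accounts for the size
 * of the MD part; with io != NULL it writes that part out.
 */
static inline int
sketch_md_coredump(struct sketch_iostate *io, size_t *md_size)
{
	static const unsigned char regs[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

	if (io == NULL) {
		*md_size = sizeof(regs);
		return 0;
	}
	return sketch_write(io, regs, sizeof(regs));
}
```

The MD hook never seeks or inspects the cookie itself, so the same hook would
work whether the cookie stands for a file, a socket or any other stream.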


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
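
Trimming trailing zeroes from a section amounts to finding the last non-zero
byte. A minimal sketch, with trimmed_length() as a hypothetical helper name
rather than a function from the coredump code:

```c
#include <stddef.h>

/*
 * Return how many leading bytes of buf[0..len) must be written so that
 * everything left out is zero.
 */
static inline size_t
trimmed_length(const unsigned char *buf, size_t len)
{
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

Zero bytes in the middle of the data are kept; only the zero tail is
dropped, and an all-zero buffer trims to length 0.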


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.
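
The pattern of one general routine plus thin macro wrappers can be sketched
as below. Everything here is hypothetical: the sketch_* names, the flag
values, the argument order and the bump allocator standing in for a kernel
map are illustrative assumptions, not the uvm_km_valloc1() interface.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WAIT	0x1			/* hypothetical "may sleep" flag */
#define SKETCH_NOALIGN	0			/* hypothetical "no alignment" */
#define SKETCH_NOPREFER	((uintptr_t)-1)		/* hypothetical "no preference" */

/* Hypothetical bump allocator standing in for a kernel map. */
struct sketch_map {
	uintptr_t next;
};

/* The single general routine; align must be 0 or a power of two. */
static inline uintptr_t
sketch_valloc1(struct sketch_map *map, size_t size, size_t align,
    uintptr_t prefer, int flags)
{
	uintptr_t va = map->next;

	(void)prefer;	/* placement hints are ignored in this sketch */
	(void)flags;	/* a bump allocator never needs to sleep */
	if (align != SKETCH_NOALIGN)
		va = (va + align - 1) & ~((uintptr_t)align - 1);
	map->next = va + size;
	return va;
}

/* The variants differ only in which defaults they pass along. */
#define sketch_valloc(m, sz) \
	sketch_valloc1((m), (sz), SKETCH_NOALIGN, SKETCH_NOPREFER, 0)
#define sketch_valloc_wait(m, sz) \
	sketch_valloc1((m), (sz), SKETCH_NOALIGN, SKETCH_NOPREFER, SKETCH_WAIT)
#define sketch_valloc_align(m, sz, al) \
	sketch_valloc1((m), (sz), (al), SKETCH_NOPREFER, 0)
```

One trade-off of this style is that macros leave no symbol behind for
separately compiled code to link against.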


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
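
The "attempt the access and handle the failure" style argued for above looks
roughly like this. sketch_copyin() is a hypothetical user-space stand-in that
fails only for a NULL source; the real copyin() can fail for other reasons,
such as a page-in i/o error, which is exactly why a permission pre-check is
not enough.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copyin(): fails with EFAULT for a NULL source. */
static inline int
sketch_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return EFAULT;
	memcpy(kaddr, uaddr, len);
	return 0;
}

/*
 * No pre-check of the address: attempt the copy and propagate whatever
 * error comes back, so all failure causes share one code path.
 */
static inline int
sketch_fetch_int(const void *uaddr, int *out)
{
	return sketch_copyin(uaddr, out, sizeof(*out));
}
```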


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
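
A minimal sketch of round-robin bucket selection as described above. It
is not the UVM implementation: struct color_state and color_pick() are
hypothetical names, and the bucket count is passed in rather than taken
from an MD constant.

```c
#include <assert.h>

/* Hypothetical state: number of color buckets and the next one to use. */
struct color_state {
	unsigned ncolors;
	unsigned next;
};

/*
 * Return the bucket to allocate from, then advance to the following
 * bucket so that successive allocations hop across all colors.
 */
static unsigned
color_pick(struct color_state *cs)
{
	unsigned c;

	assert(cs->ncolors > 0);
	c = cs->next;
	cs->next = (c + 1) % cs->ncolors;
	return c;
}
```

With three buckets the sequence of picks is 0, 1, 2, 0, and so on.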


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
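
The mapping above can be sketched as a translation function. This is an
illustration only: the enum below is a hypothetical stand-in whose
numeric values are not taken from the NetBSD sources, and KERN_FAILURE
and the unused codes are left out because the table gives no single
errno for them.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the old codes; values are illustrative. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

/* Translate one old-style code to the errno listed in the table. */
static int
kern_to_errno(enum kern_code code)
{
	switch (code) {
	case KERN_SUCCESS:		return 0;
	case KERN_INVALID_ADDRESS:	return EFAULT;
	case KERN_PROTECTION_FAILURE:	return EACCES;
	case KERN_NO_SPACE:		return ENOMEM;
	case KERN_INVALID_ARGUMENT:	return EINVAL;
	case KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	}
	assert(0);	/* every enumerator is handled above */
	return EINVAL;
}
```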


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
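
A rough sketch of the 95% rule described above, assuming the three
minimums are plain integer percentages. struct usage_min and
usage_min_valid() are hypothetical names; the real sysctl handlers may
validate differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the three tunables, in percent. */
struct usage_min {
	int anonmin;
	int vnodemin;
	int vtextmin;
};

#define USAGE_MIN_SUM_MAX 95	/* cap quoted in the commit message */

/*
 * Accept a proposed setting only if each value is a percentage and
 * the three together leave some memory that can always be reused.
 */
static bool
usage_min_valid(const struct usage_min *m)
{
	assert(m != NULL);
	if (m->anonmin < 0 || m->vnodemin < 0 || m->vtextmin < 0)
		return false;
	return m->anonmin + m->vnodemin + m->vtextmin <= USAGE_MIN_SUM_MAX;
}
```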


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for the MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
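
For illustration, rounding an address up to a power-of-two alignment
can be sketched as below. align_up() is a hypothetical helper, not a
UVM function, and treating an alignment of 0 as "no constraint" is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.  align must be a
 * power of two; 0 is taken to mean no alignment was requested.
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;
	assert((align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```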


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
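
The sizing rule described above amounts to a divide followed by a
clamp. The sketch below assumes all quantities are page counts;
kmem_autosize() and its parameter names are hypothetical and do not
reproduce the kernel's own code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Take a quarter of physical memory (in pages), then apply the
 * lower and upper bounds supplied by machine-dependent code or
 * the kernel config file.
 */
static size_t
kmem_autosize(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);
	n = physpages / 4;
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```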


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
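
The three strategies can be sketched over an array of per-list free
page counts. This is an illustration only: NFREELIST, enum strat and
pick_freelist() are hypothetical names, and treating index 0 as the
highest-priority list is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NFREELIST 3	/* hypothetical; ports define VM_NFREELIST */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/* Take one page from the given list; return the list index or -1. */
static int
take_from(size_t npages[], int list)
{
	if (npages[list] == 0)
		return -1;
	npages[list]--;
	return list;
}

/*
 * npages[i] is the number of free pages on list i.  The list
 * argument is ignored for STRAT_NORMAL.
 */
static int
pick_freelist(size_t npages[NFREELIST], enum strat strat, int list)
{
	int i, got;

	if (strat != STRAT_NORMAL) {
		assert(list >= 0 && list < NFREELIST);
		got = take_from(npages, list);
		if (got != -1 || strat == STRAT_ONLY)
			return got;
	}
	/* normal walk, also used as the fallback path */
	for (i = 0; i < NFREELIST; i++)
		if ((got = take_from(npages, i)) != -1)
			return got;
	return -1;
}
```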

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.229 13-Jun-2020 ad

uvm_pagerealloc(): resurrect the insertion case.


# 1.228 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
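
A sketch of the trade-off described above, assuming the total is kept
as a set of per-CPU counters plus one cached sum. All names here are
hypothetical and this is not how uvm_availmem() itself is written.

```c
#include <stdbool.h>

#define NCPU_SKETCH 4			/* hypothetical CPU count */

static long percpu_free[NCPU_SKETCH];	/* hypothetical per-CPU counters */
static long cached_total;		/* last total that was summed */

/*
 * With cached == true, return the last computed total without
 * touching the per-CPU counters.  Otherwise sum them again, which
 * is the expensive path that disturbs other CPUs.
 */
static long
availmem_sketch(bool cached)
{
	if (!cached) {
		long sum = 0;
		int i;

		for (i = 0; i < NCPU_SKETCH; i++)
			sum += percpu_free[i];
		cached_total = sum;
	}
	return cached_total;
}
```

A caller that only needs a rough figure passes true; one that must act
on the exact count passes false.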


# 1.227 26-May-2020 kamil

Catch up with the usage of struct vmspace::vm_refcnt

Use the dedicated reference counting routines.

Change the type of struct vmspace::vm_refcnt and struct vm_map::ref_count
to volatile.

Remove the unnecessary vm->vm_map.misc_lock locking in process_domem().

Reviewed by <ad>


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
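
Encoding a small type tag in the low bits of an aligned pointer can be
sketched as follows. struct voaddr_sketch, the tag values and the
helper names are hypothetical and are not the real uvm_voaddr layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; they must fit in the masked low bits. */
enum vo_type { VO_TYPE_OBJECT = 0, VO_TYPE_ANON = 1 };

#define VO_TYPE_MASK	((uintptr_t)0x3)

/* Two pointer-sized words: tagged object pointer plus offset. */
struct voaddr_sketch {
	uintptr_t object;	/* pointer with the type in its low bits */
	uintptr_t offset;
};

static void
vo_set(struct voaddr_sketch *v, void *obj, enum vo_type t, uintptr_t off)
{
	/* The object must be at least 4-byte aligned for the tag to fit. */
	assert(((uintptr_t)obj & VO_TYPE_MASK) == 0);
	v->object = (uintptr_t)obj | (uintptr_t)t;
	v->offset = off;
}

/* Recover the original pointer by masking the tag off. */
static void *
vo_object(const struct voaddr_sketch *v)
{
	return (void *)(v->object & ~VO_TYPE_MASK);
}

/* Recover the tag from the low bits. */
static enum vo_type
vo_type(const struct voaddr_sketch *v)
{
	return (enum vo_type)(v->object & VO_TYPE_MASK);
}
```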


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
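
The constraint in the last sentence can be sketched as a flag check.
The SK_FLAG_* bit values and map_flags_valid() are hypothetical and do
not correspond to the real UVM_FLAG_* definitions.

```c
#include <stdbool.h>

/* Hypothetical bit assignments, for illustration only. */
#define SK_FLAG_FIXED	0x1u	/* use the caller's hint address as-is */
#define SK_FLAG_UNMAP	0x2u	/* unmap existing entries in the range */

/*
 * Reject UNMAP without FIXED: unmapping only makes sense when the
 * caller has pinned down the exact range to be replaced.
 */
static bool
map_flags_valid(unsigned flags)
{
	if ((flags & SK_FLAG_UNMAP) != 0 && (flags & SK_FLAG_FIXED) == 0)
		return false;
	return true;
}
```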


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adopt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.
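
The cookie pattern described above can be sketched as follows. This is a minimal userland illustration, not the kernel code: every name prefixed with sketch_ is hypothetical, and the real uvm_coredump_walkmap() prototype is not reproduced here. The walker knows nothing about processes; a caller that needs a process identity stores it in its own cookie structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one mapped segment. */
struct sketch_seg {
	unsigned long start;
	unsigned long end;
};

/* Callback type: the walker hands back only the segment and the cookie. */
typedef int (*sketch_walk_fn)(const struct sketch_seg *, void *);

/* Walk all segments; stop at the first callback that returns an error. */
static int
sketch_walkmap(const struct sketch_seg *segs, size_t nsegs,
    sketch_walk_fn fn, void *cookie)
{
	size_t i;
	int error;

	assert(fn != NULL);
	for (i = 0; i < nsegs; i++) {
		error = fn(&segs[i], cookie);
		if (error != 0)
			return error;
	}
	return 0;
}

/* Caller-private state; the process identity travels inside the cookie. */
struct sketch_count_state {
	int pid;		/* assumed process identifier */
	size_t nsegs;
	unsigned long bytes;
};

/* Example callback: count segments and total bytes for the process. */
static int
sketch_count_cb(const struct sketch_seg *seg, void *cookie)
{
	struct sketch_count_state *st = cookie;

	st->nsegs++;
	st->bytes += seg->end - seg->start;
	return 0;
}
```

A caller would fill in a struct sketch_count_state, pass its address as the cookie, and read the totals back after the walk returns.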


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
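
As an illustration of what "color" means here, the sketch below computes the color of an address and finds the next page-aligned address with a requested color. It is a userland approximation under stated assumptions: 4 KiB pages, a power-of-two number of colors, and hypothetical sketch_ names. It is not the allocator used by uvm_map().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/* Color of an address: its page index masked by colormask (ncolors - 1). */
static unsigned
sketch_page_color(uintptr_t addr, unsigned colormask)
{
	return (unsigned)(addr >> SKETCH_PAGE_SHIFT) & colormask;
}

/* Lowest page-aligned address >= hint whose color equals `color`. */
static uintptr_t
sketch_colormatch(uintptr_t hint, unsigned color, unsigned colormask)
{
	const uintptr_t pagesz = (uintptr_t)1 << SKETCH_PAGE_SHIFT;
	uintptr_t va;
	unsigned delta;

	assert(color <= colormask);
	assert((colormask & (colormask + 1)) == 0);	/* power-of-two colors */

	va = (hint + pagesz - 1) & ~(pagesz - 1);	/* round up to a page */
	/* Number of pages to skip until the colors line up (modulo ncolors). */
	delta = (color - sketch_page_color(va, colormask)) & colormask;
	return va + (uintptr_t)delta * pagesz;
}
```

Passing the color of the first user page as `color` yields a kernel address that is congruent with the user mapping, which is the property the commit message describes.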


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in the UVM external API, uvm_extern.h, because most users care only about
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
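
The allocation preference described in the last item can be sketched as below. This is a simplified single-threaded model with hypothetical sketch_ names: it omits locking, page colors and free list selection, and it is not the uvm_pagealloc() implementation. Each free page sits on both the global list and one CPU's list, so removing it from either side is a constant-time unlink.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NCPU 2	/* assumption: two CPUs for the illustration */

/* Circular doubly linked list link; list heads are plain links too. */
struct sketch_link {
	struct sketch_link *prev, *next;
};

/* A page carries one link per list it is on. */
struct sketch_page {
	struct sketch_link global;
	struct sketch_link local;
};

struct sketch_freelists {
	struct sketch_link global;
	struct sketch_link percpu[SKETCH_NCPU];
};

static void
sketch_list_insert_head(struct sketch_link *head, struct sketch_link *l)
{
	l->next = head->next;
	l->prev = head;
	head->next->prev = l;
	head->next = l;
}

static void
sketch_list_remove(struct sketch_link *l)
{
	l->prev->next = l->next;
	l->next->prev = l->prev;
}

static void
sketch_init(struct sketch_freelists *fl)
{
	unsigned i;

	fl->global.prev = fl->global.next = &fl->global;
	for (i = 0; i < SKETCH_NCPU; i++)
		fl->percpu[i].prev = fl->percpu[i].next = &fl->percpu[i];
}

/* Freeing: put the page on the freeing CPU's list and on the global list. */
static void
sketch_free(struct sketch_freelists *fl, unsigned cpu, struct sketch_page *pg)
{
	assert(cpu < SKETCH_NCPU);
	sketch_list_insert_head(&fl->global, &pg->global);
	sketch_list_insert_head(&fl->percpu[cpu], &pg->local);
}

/* Allocating: prefer the local CPU's list, then fall back to the global one. */
static struct sketch_page *
sketch_alloc(struct sketch_freelists *fl, unsigned cpu)
{
	struct sketch_page *pg;

	assert(cpu < SKETCH_NCPU);
	if (fl->percpu[cpu].next != &fl->percpu[cpu])
		pg = (struct sketch_page *)((char *)fl->percpu[cpu].next -
		    offsetof(struct sketch_page, local));
	else if (fl->global.next != &fl->global)
		pg = (struct sketch_page *)((char *)fl->global.next -
		    offsetof(struct sketch_page, global));
	else
		return NULL;

	/* The page leaves both lists, whichever one it was found on. */
	sketch_list_remove(&pg->global);
	sketch_list_remove(&pg->local);
	return pg;
}
```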


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
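
To make the bit counts concrete, the sketch below turns a random value into an address offset with a given number of random bits. It is an assumption-laden illustration: the sketch_ names are hypothetical, and applying the bits at page granularity (a shift of 12) is a guess for the example, not something this log states.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: offsets are whole 4 KiB pages */

/*
 * Keep the low `len` bits of `rnd` and place them starting at bit `lsb`.
 * With len = 12 and lsb = 12 the offset ranges over 4096 page positions.
 */
static uintptr_t
sketch_aslr_delta(uint32_t rnd, unsigned lsb, unsigned len)
{
	assert(len > 0 && len < 32);
	assert(lsb + len <= sizeof(uintptr_t) * 8);	/* must fit the address */
	return ((uintptr_t)rnd & (((uintptr_t)1 << len) - 1)) << lsb;
}
```

The second assertion shows why a wider mmap range is tied to __LP64__: 24 random bits shifted by 12 need 36 address bits, which a 32-bit address cannot hold.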


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
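
The trailing-zero trimming can be illustrated with a small helper. This is a byte-wise sketch with a hypothetical name; the kernel code may well scan in larger units, and nothing here is taken from the actual ELF coredump writer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of `seg` once trailing zero bytes are dropped.
 * Only the first `return value` bytes need to be written to the file;
 * the rest can be described as zero-filled by the section header.
 */
static size_t
sketch_trim_trailing_zeroes(const unsigned char *seg, size_t len)
{
	assert(seg != NULL || len == 0);
	while (len > 0 && seg[len - 1] == 0)
		len--;
	return len;
}
```

An all-zero section trims to length 0, which is the case where the saving is largest.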


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now a private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages() return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
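
The round-robin bucket selection described in 1.60 can be pictured with the minimal C sketch below. It is only an illustration under stated assumptions: NCOLORS_SKETCH, bucket_free, next_color and color_alloc_sketch() are hypothetical names, and the real UVM free-list code is organized differently.

```c
#include <assert.h>

#define NCOLORS_SKETCH 4	/* hypothetical MD-defined bucket count */

static int bucket_free[NCOLORS_SKETCH];	/* free pages per color bucket */
static int next_color;			/* round-robin cursor */

/*
 * Pick a page color round-robin, skipping empty buckets.
 * Returns the color used, or -1 if every bucket is empty.
 */
static int
color_alloc_sketch(void)
{
	int i, color;

	assert(next_color >= 0 && next_color < NCOLORS_SKETCH);

	for (i = 0; i < NCOLORS_SKETCH; i++) {
		color = (next_color + i) % NCOLORS_SKETCH;
		if (bucket_free[color] > 0) {
			bucket_free[color]--;
			/* The next allocation starts at the following bucket. */
			next_color = (color + 1) % NCOLORS_SKETCH;
			return color;
		}
	}
	return -1;
}
```

Advancing the cursor past the bucket that was just used spreads consecutive allocations across colors, which is the point of bin hopping.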


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
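
The one-to-one rows of the table in 1.58 could be expressed as a small translation function, sketched here in C. The enum and function names are hypothetical stand-ins, not identifiers from the tree; KERN_FAILURE and the unused codes are left out because the commit gives them no single errno.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired Mach-style codes. */
enum kern_code_sketch {
	K_SUCCESS,
	K_INVALID_ADDRESS,
	K_PROTECTION_FAILURE,
	K_NO_SPACE,
	K_INVALID_ARGUMENT,
	K_RESOURCE_SHORTAGE
};

/* Translate an old-style code into the errno listed in the table. */
static int
kern_to_errno_sketch(enum kern_code_sketch code)
{
	switch (code) {
	case K_SUCCESS:
		return 0;
	case K_INVALID_ADDRESS:
		return EFAULT;
	case K_PROTECTION_FAILURE:
		return EACCES;
	case K_NO_SPACE:
	case K_RESOURCE_SHORTAGE:
		return ENOMEM;
	case K_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Not reached for the codes listed above. */
	assert(0);
	return EINVAL;
}
```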


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
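
A rough C sketch of the two rules stated in 1.57 follows: the per-type minimums may not sum to more than 95%, and a page may be reused only if its type stays at or above its minimum. All names (balance_sketch, set_min_sketch(), may_reuse_sketch()) are hypothetical, and the real pagedaemon logic is more involved.

```c
#include <assert.h>
#include <stdbool.h>

enum usage_sketch { USE_ANON, USE_VNODE, USE_VTEXT, USE_NTYPES };

struct balance_sketch {
	unsigned minpct[USE_NTYPES];		/* reserved percentage per type */
	unsigned long pages[USE_NTYPES];	/* pages currently in use per type */
	unsigned long pageable;			/* total pageable pages */
};

/* Accept a new minimum only if the sum of all minimums stays <= 95%. */
static bool
set_min_sketch(struct balance_sketch *b, enum usage_sketch t, unsigned pct)
{
	unsigned sum = pct;
	int i;

	assert(t < USE_NTYPES);
	for (i = 0; i < USE_NTYPES; i++) {
		if (i != (int)t)
			sum += b->minpct[i];
	}
	if (sum > 95)
		return false;
	b->minpct[t] = pct;
	return true;
}

/* Would taking one page of type t keep that type at or above its minimum? */
static bool
may_reuse_sketch(const struct balance_sketch *b, enum usage_sketch t)
{
	unsigned long floor = b->pageable * b->minpct[t] / 100;

	assert(t < USE_NTYPES);
	return b->pages[t] > floor;
}
```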


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
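
For readers unfamiliar with power-of-two alignment, the usual rounding idiom is sketched below in C. This is a generic illustration, not the code in uvm_map(); align_up_sketch() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t
align_up_sketch(uintptr_t addr, uintptr_t align)
{
	/* align must be a non-zero power of two for the mask trick to work */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

Because the alignment is a power of two, the rounding needs only an add and a mask rather than a division.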


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
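
The sizing rule in 1.37 (a quarter of physical memory, clamped by lower and upper bounds) amounts to the following C sketch. The function name and parameter names are hypothetical; the real code takes its bounds from NKMEMPAGES_MIN and NKMEMPAGES_MAX and from machine-dependent code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: start from one quarter of physical memory and
 * clamp the result to the supplied bounds (all values in pages).
 */
static size_t
nkmempages_autosize(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t npages;

	assert(min_pages <= max_pages);

	npages = physpages / 4;
	if (npages < min_pages)
		npages = min_pages;
	if (npages > max_pages)
		npages = max_pages;
	return npages;
}
```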


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
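
The three strategies listed in 1.15 can be illustrated with the small C sketch below, which models each free list as a page counter. It assumes that index 0 is the highest-priority list; the names (freecount, take_from(), pagealloc_sketch(), STRAT_*) are hypothetical and are not the uvm_pagealloc_strat() implementation.

```c
#include <assert.h>

#define NFREELIST_SKETCH 3	/* stand-in for VM_NFREELIST */

enum strat_sketch { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/* Free page count per free list; index 0 is assumed highest priority. */
static int freecount[NFREELIST_SKETCH];

/* Take one page from list fl; return 1 on success, 0 if it is empty. */
static int
take_from(int fl)
{
	if (freecount[fl] == 0)
		return 0;
	freecount[fl]--;
	return 1;
}

/*
 * Return the index of the free list a page was taken from,
 * or -1 if the allocation failed.
 */
static int
pagealloc_sketch(enum strat_sketch strat, int free_list)
{
	int fl;

	assert(free_list >= 0 && free_list < NFREELIST_SKETCH);

	if (strat == STRAT_ONLY || strat == STRAT_FALLBACK) {
		if (take_from(free_list))
			return free_list;
		if (strat == STRAT_ONLY)
			return -1;
	}

	/* Normal walk (also the fallback path); free_list is ignored here. */
	for (fl = 0; fl < NFREELIST_SKETCH; fl++) {
		if (take_from(fl))
			return fl;
	}
	return -1;
}
```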


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.228 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.227 26-May-2020 kamil

Catch up with the usage of struct vmspace::vm_refcnt

Use the dedicated reference counting routines.

Change the type of struct vmspace::vm_refcnt and struct vm_map::ref_count
to volatile.

Remove the unnecessary vm->vm_map.misc_lock locking in process_domem().

Reviewed by <ad>


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
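
The technique named in 1.226, storing a small type tag in the otherwise-unused low bits of an aligned pointer, looks roughly like the C sketch below. The structure layout, tag values and function names are hypothetical and are not the actual struct uvm_voaddr definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; the real UVM encoding may differ. */
enum vo_type { VO_TYPE_NONE = 0, VO_TYPE_UOBJ = 1, VO_TYPE_ANON = 2 };

#define VO_TYPE_MASK ((uintptr_t)0x3)

/* Two pointer-sized words: a tagged object pointer plus an offset. */
struct voaddr_sketch {
	uintptr_t object;	/* pointer with the type in the low 2 bits */
	uintptr_t offset;
};

static void
voaddr_set(struct voaddr_sketch *v, void *obj, enum vo_type type,
    uintptr_t off)
{
	/* The object must be at least 4-byte aligned so the low bits are free. */
	assert(((uintptr_t)obj & VO_TYPE_MASK) == 0);
	v->object = (uintptr_t)obj | (uintptr_t)type;
	v->offset = off;
}

static enum vo_type
voaddr_type(const struct voaddr_sketch *v)
{
	return (enum vo_type)(v->object & VO_TYPE_MASK);
}

static void *
voaddr_object(const struct voaddr_sketch *v)
{
	return (void *)(v->object & ~VO_TYPE_MASK);
}
```

The saving comes from dropping a separate type field, at the cost of masking on every access.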


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
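
The deferred-update scheme from the second item of 1.218 can be sketched in C as follows: callers record an intended state on the page and queue it, and the queue is applied in one batch when it fills. Locking is deliberately omitted, and every name here (page_sketch, cpu_queue_sketch, page_set_intent(), queue_flush()) is hypothetical; only the queue length of 128 comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 128	/* queue length taken from the commit message */

enum pgstate_sketch { PG_S_NONE, PG_S_ACTIVE, PG_S_INACTIVE };

struct page_sketch {
	enum pgstate_sketch intent;	/* set cheaply by the caller */
	enum pgstate_sketch state;	/* "global" state, updated in batch */
};

struct cpu_queue_sketch {
	struct page_sketch *pg[BATCH_MAX];
	size_t n;		/* entries currently queued */
	unsigned flushes;	/* number of batch updates performed */
};

/* Make all queued intents real in one pass, then empty the queue. */
static void
queue_flush(struct cpu_queue_sketch *q)
{
	size_t i;

	for (i = 0; i < q->n; i++)
		q->pg[i]->state = q->pg[i]->intent;
	q->n = 0;
	q->flushes++;
}

/* Record the intended state and queue the page for a later batch update. */
static void
page_set_intent(struct cpu_queue_sketch *q, struct page_sketch *pg,
    enum pgstate_sketch intent)
{
	assert(intent != PG_S_NONE);
	pg->intent = intent;
	if (q->n == BATCH_MAX)
		queue_flush(q);
	q->pg[q->n++] = pg;
}
```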


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
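
The constraint in the last sentence can be sketched as a flag check. This is
an illustration only: the DEMO_ constants and demo_flags_valid() are
hypothetical, and the bit values are not the real UVM_FLAG_* values.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real UVM_FLAG_* bit values are not assumed. */
#define DEMO_FLAG_FIXED	0x1u	/* use the caller's hint address as-is */
#define DEMO_FLAG_UNMAP	0x2u	/* unmap existing entries in the range */

/* UNMAP is only meaningful together with FIXED. */
static bool
demo_flags_valid(unsigned flags)
{
	if ((flags & DEMO_FLAG_UNMAP) != 0 && (flags & DEMO_FLAG_FIXED) == 0)
		return false;
	return true;
}
```

Under this check, FIXED alone and FIXED|UNMAP are accepted, and UNMAP alone
is rejected.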


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance of its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adopt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits execution of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
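
A rough sketch of what "color of the starting address" means, assuming the
color is the page number masked with colormask and that colormask has the
form 2^n - 1. The names, the 4 KB page size and the search strategy are all
assumptions for illustration; this is not the uvm_map implementation.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12	/* assumed 4 KB pages */
#define DEMO_PAGE_SIZE	((uintptr_t)1 << DEMO_PAGE_SHIFT)

/* Color of an address: its page number modulo the number of colors. */
static unsigned
demo_color(uintptr_t addr, unsigned colormask)
{
	return (unsigned)(addr >> DEMO_PAGE_SHIFT) & colormask;
}

/* First page-aligned address at or above hint that has the wanted color. */
static uintptr_t
demo_colormatch(uintptr_t hint, unsigned color, unsigned colormask)
{
	uintptr_t va = (hint + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
	unsigned delta = (color - demo_color(va, colormask)) & colormask;

	return va + (uintptr_t)delta * DEMO_PAGE_SIZE;
}
```

With four colors (colormask 3), a hint of 0x1000 and a wanted color of 3
give 0x3000. A kernel mapping placed this way has the same color as a user
page of color 3, which is the congruence the message refers to.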


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)
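
For reference, a small sketch of how a userland program could read the new
limit with getrlimit(2). demo_get_as_limit() is a hypothetical helper name;
RLIMIT_AS and getrlimit() are the standard interface.

```c
#include <sys/resource.h>

/* Fetch the current (soft) address space limit; returns 0 on success. */
static int
demo_get_as_limit(rlim_t *cur)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_AS, &rl) != 0)
		return -1;
	*cur = rl.rlim_cur;	/* RLIM_INFINITY when unlimited */
	return 0;
}
```

With the unlimited default mentioned above, the value returned in *cur
would be RLIM_INFINITY.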


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
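
The mapping above can be sketched as a small translation function. This is only an illustration: the enum names and values below are hypothetical stand-ins for the retired codes, not the historical definitions, and KERN_FAILURE is left out because the log says it was handled case by case (mostly by turning into KASSERTs).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired KERN_* codes (values illustrative). */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the commit. */
static int
kern_code_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Not reached for the codes above; catches an out-of-range value. */
	assert(0 && "unmapped code");
	return EINVAL;
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation is not reversible.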


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
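
The 95% rule on the three minimums can be sketched as a simple validity check. The function name and the rejection of negative values are assumptions made for illustration; the log only states that the sum of the minimums may not exceed 95%.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper limit on the sum of the three minimum percentages (from the log). */
#define MIN_SUM_LIMIT 95

/*
 * Hypothetical helper: return true if the proposed anonmin/vnodemin/vtextmin
 * percentages leave at least some memory that can always be reused.
 */
static bool
minimums_acceptable(int anonmin, int vnodemin, int vtextmin)
{
	assert(MIN_SUM_LIMIT <= 100);

	/* Assumption: negative percentages are never meaningful. */
	if (anonmin < 0 || vnodemin < 0 || vtextmin < 0)
		return false;
	return anonmin + vnodemin + vtextmin <= MIN_SUM_LIMIT;
}
```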


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
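
The sizing rule described above (a quarter of physical memory, clamped between machine-dependent bounds) can be sketched as follows. The function and parameter names are hypothetical, and the real kernel code may differ in units and in the order the bounds are applied.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: derive the kmem_map size in pages from the number of
 * physical pages, then clamp it to [min_pages, max_pages], in the spirit of
 * NKMEMPAGES_MIN and NKMEMPAGES_MAX.
 */
static size_t
autosize_nkmempages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);

	n = physmem_pages / 4;		/* a quarter of physical memory */
	if (n < min_pages)
		n = min_pages;		/* enforce the lower bound */
	if (n > max_pages)
		n = max_pages;		/* enforce the upper bound */
	return n;
}
```

With bounds of 100 and 2000 pages, a machine with 4000 physical pages would get 1000 pages, while very small or very large machines would be pinned to the respective bound.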


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
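
The three strategies can be sketched with free lists modelled as simple page counters. Everything here is an illustrative assumption: the names, the fixed list count, and the convention that index 0 is the highest-priority list are not taken from the UVM source.

```c
#include <assert.h>

#define NFREELIST 3	/* hypothetical number of free lists */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Take one page according to the strategy. Returns the index of the free
 * list the page came from, or -1 if the allocation fails. The free_list
 * argument is ignored for STRAT_NORMAL.
 */
static int
pick_freelist(int freecount[NFREELIST], enum strat strat, int free_list)
{
	int i;

	assert(strat == STRAT_NORMAL ||
	    (free_list >= 0 && free_list < NFREELIST));

	if (strat != STRAT_NORMAL) {
		/* `only' and `fallback' both try the requested list first. */
		if (freecount[free_list] > 0) {
			freecount[free_list]--;
			return free_list;
		}
		if (strat == STRAT_ONLY)
			return -1;	/* `only' never looks elsewhere */
	}

	/* `normal' walk: highest priority (index 0) to lowest. */
	for (i = 0; i < NFREELIST; i++) {
		if (freecount[i] > 0) {
			freecount[i]--;
			return i;
		}
	}
	return -1;
}
```

The point of the sketch is that `fallback' is just `only' followed by `normal', which matches the description in the log.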


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.227 26-May-2020 kamil

Catch up with the usage of struct vmspace::vm_refcnt

Use the dedicated reference counting routines.

Change the type of struct vmspace::vm_refcnt and struct vm_map::ref_count
to volatile.

Remove the unnecessary vm->vm_map.misc_lock locking in process_domem().

Reviewed by <ad>


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
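
A minimal sketch of the general technique (storing a small type tag in the low bits of an aligned pointer) is shown below. The mask, the enum values, and the helper names are hypothetical; they are not the actual uvm_voaddr definitions, and the sketch assumes objects are aligned to at least 4 bytes so the two low bits are free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: the two low bits of the pointer carry the type. */
#define VOADDR_TYPE_MASK	((uintptr_t)0x3)

enum voaddr_type { VOADDR_TYPE_UOBJ = 1, VOADDR_TYPE_ANON = 2 };

/* Two pointer-sized words: tagged object pointer plus offset. */
struct voaddr_sketch {
	uintptr_t object;	/* object pointer with type in low bits */
	uintptr_t offset;	/* offset within the object */
};

/* Combine an aligned object pointer and a type into one word. */
static uintptr_t
voaddr_pack(const void *obj, enum voaddr_type type)
{
	uintptr_t p = (uintptr_t)obj;

	/* The low bits must be free, i.e. the object is suitably aligned. */
	assert((p & VOADDR_TYPE_MASK) == 0);
	return p | (uintptr_t)type;
}

/* Recover the type tag from a packed word. */
static enum voaddr_type
voaddr_get_type(uintptr_t packed)
{
	return (enum voaddr_type)(packed & VOADDR_TYPE_MASK);
}

/* Recover the original object pointer from a packed word. */
static void *
voaddr_get_object(uintptr_t packed)
{
	return (void *)(packed & ~VOADDR_TYPE_MASK);
}
```

Compared with a separate type field, this keeps the structure at two pointer-sized words at the cost of a mask operation on each access.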


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
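
The deferred state update in the second item can be illustrated roughly as follows. This is a single-threaded sketch only: the page interlock and the per-CPU binding are omitted, and every name (sk_page, sk_cpu_queue, sk_queue_flush, sk_page_set_intent) is hypothetical rather than taken from UVM. Only the queue length of 128 comes from the commit message.

```c
#include <stddef.h>

#define SK_BATCH 128	/* queue length named in the commit message */

enum sk_state { SK_INACTIVE, SK_ACTIVE };

struct sk_page {
	enum sk_state intent;	/* set immediately by the caller */
	enum sk_state state;	/* made real only when the batch is flushed */
};

struct sk_cpu_queue {
	struct sk_page *pages[SK_BATCH];
	size_t count;
};

/* Apply every queued intent in one pass, then empty the queue. */
static inline void
sk_queue_flush(struct sk_cpu_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->pages[i]->state = q->pages[i]->intent;
	q->count = 0;
}

/* Record the intended state now; defer the real update to a flush. */
static inline void
sk_page_set_intent(struct sk_cpu_queue *q, struct sk_page *pg,
    enum sk_state intent)
{
	pg->intent = intent;
	if (q->count == SK_BATCH)
		sk_queue_flush(q);	/* make room by applying the batch */
	q->pages[q->count++] = pg;
}
```

The point of the design is that the shared state is touched once per batch instead of once per page, which is where the reduction in contention would come from.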


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability issues and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
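
The constraint between the two flags can be sketched as a simple validity check. The bit values and the names SK_FLAG_FIXED, SK_FLAG_UNMAP and sk_map_flags_valid below are placeholders chosen for illustration; they are not the real UVM_FLAG_* definitions.

```c
#include <stdbool.h>

/* Hypothetical bit values; the real UVM_FLAG_* constants differ. */
#define SK_FLAG_FIXED	0x1u	/* use the caller's hint address as-is */
#define SK_FLAG_UNMAP	0x2u	/* unmap existing entries in the range */

/* Return true if the flag combination obeys the rule described above. */
static inline bool
sk_map_flags_valid(unsigned flags)
{
	/* UNMAP is only meaningful together with FIXED. */
	if ((flags & SK_FLAG_UNMAP) != 0 && (flags & SK_FLAG_FIXED) == 0)
		return false;
	return true;
}
```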


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses that references the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.
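
The W&X rule described above amounts to rejecting any single request that asks for both write and execute permission. A minimal sketch, using the standard PROT_* constants from <sys/mman.h>; the function name sk_prot_allowed is hypothetical and the real PAX code is more involved than this check.

```c
#include <stdbool.h>
#include <sys/mman.h>

/*
 * Return false when a single mapping would be both writable and
 * executable; W alone or X alone is accepted.
 */
static inline bool
sk_prot_allowed(int prot)
{
	const int wx = PROT_WRITE | PROT_EXEC;

	return (prot & wx) != wx;
}
```

Under this rule a JIT would keep one writable mapping and one executable mapping of the same pages, which is the use case the duplicated mappings are meant to serve.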


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.
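
The benefit of the typed cookie can be sketched as follows. The structure contents and the function sk_coredump_write are hypothetical; only the type name struct coredump_iostate comes from the commit message, and in most of the kernel it stays opaque.

```c
#include <stddef.h>

/* Hypothetical contents; in most of the kernel this stays opaque. */
struct coredump_iostate {
	size_t io_written;	/* bytes written so far (illustrative) */
};

/*
 * Hypothetical writer.  Because the parameter is a
 * struct coredump_iostate * rather than a void *, passing some
 * other pointer type draws a compiler diagnostic.
 */
static inline int
sk_coredump_write(struct coredump_iostate *cookie, const void *data,
    size_t len)
{
	(void)data;		/* a real writer would emit the bytes */
	cookie->io_written += len;
	return 0;
}
```

With a void * parameter any object pointer converts silently, so a mistaken argument would only show up at run time.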


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits execution of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
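
The notion of an address "color" used above can be sketched like this. The sketch assumes 4 KiB pages and a colormask of the form 2^n - 1; SK_PAGE_SHIFT, sk_va_color and sk_color_match are hypothetical names, and the real allocator does not search linearly as this illustration does.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT	12	/* assumed 4 KiB pages */

/* The color of an address is its page number modulo the color count. */
static inline unsigned
sk_va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)((va >> SK_PAGE_SHIFT) & colormask);
}

/* Round 'hint' up to the next page-aligned address with the wanted color. */
static inline uintptr_t
sk_color_match(uintptr_t hint, unsigned color, unsigned colormask)
{
	const uintptr_t pagesz = (uintptr_t)1 << SK_PAGE_SHIFT;
	uintptr_t va = (hint + pagesz - 1) & ~(pagesz - 1);

	color &= colormask;	/* keep the request within 0..colormask */
	while (sk_va_color(va, colormask) != color)
		va += pagesz;
	return va;
}
```

Two mappings of the same page are congruent in this sense when sk_va_color() gives the same value for both virtual addresses.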


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)
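
From userland, the RLIMIT_AS resource introduced in the first item above is read through the standard getrlimit(2) interface. A minimal sketch; the wrapper name sk_get_as_limit is hypothetical.

```c
#include <sys/resource.h>

/*
 * Fetch the current (soft) address-space limit of the calling
 * process.  Returns 0 on success and -1 on failure; an unlimited
 * setting is reported as RLIM_INFINITY.
 */
static inline int
sk_get_as_limit(rlim_t *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_AS, &rl) != 0)
		return -1;
	*out = rl.rlim_cur;
	return 0;
}
```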


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
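
The trailing-zero trimming described above can be illustrated with a small sketch. This is not the NetBSD coredump code; the helper name trimmed_length() and its interface are hypothetical and only show the idea of writing a section up to its last non-zero byte.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the NetBSD routine): return how many
 * leading bytes of buf must be written so that everything after
 * them is zero.  A section that is entirely zero yields 0.
 */
size_t
trimmed_length(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	/* Walk backwards past the run of trailing zero bytes. */
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

Zero bytes in the middle of a section are kept; only the tail is dropped, which is why the reader of the dump can still treat the missing tail as zero-filled.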


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.
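
The two rules stated above (default to one bucket, and treat a recolor request that does not grow the count as a no-op) can be sketched as follows. The struct and function names are hypothetical stand-ins, and treating a count of zero as "never initialized by MD code" is an assumption of this sketch, not something the log entry states.

```c
#include <assert.h>

/* Hypothetical stand-in for the color count kept in uvmexp. */
struct color_cfg {
	unsigned ncolors;
};

/* Init-time default: assume 0 means MD code never set the count. */
void
color_init(struct color_cfg *cfg)
{
	if (cfg->ncolors == 0)
		cfg->ncolors = 1;
}

/* Returns 1 if the color count changed, 0 if the call was a no-op. */
int
color_recolor(struct color_cfg *cfg, unsigned newncolors)
{
	assert(cfg->ncolors >= 1);

	/* Smaller or equal: nothing to do, per the log entry. */
	if (newncolors <= cfg->ncolors)
		return 0;

	/* A real kernel would also redistribute free pages here. */
	cfg->ncolors = newncolors;
	return 1;
}
```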


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
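
A minimal sketch of round-robin bucket selection, assuming a simple cursor that advances on every allocation. The names color_state and color_pick() are hypothetical; the real implementation lives in the UVM page allocator and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical state: number of color buckets and the next one to use. */
struct color_state {
	unsigned ncolors;	/* must be >= 1 */
	unsigned next;		/* bucket handed out by the next call */
};

/* Pick a bucket round-robin and advance the cursor. */
unsigned
color_pick(struct color_state *cs)
{
	unsigned c;

	assert(cs->ncolors >= 1);
	c = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;
	return c;
}
```

Successive calls walk the buckets in order and wrap around, so consecutive allocations land in different color bins.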


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
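
The mapping above could be written as a lookup function, shown here only as an illustration. The enum numbering and the function name kern_to_errno() are hypothetical; only the KERN_* names and their E* equivalents come from the log entry. KERN_FAILURE and the unused codes are left out because the entry gives them no single errno.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical numbering; only the names come from the log entry. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

int
kern_to_errno(enum kern_code kc)
{
	switch (kc) {
	case KERN_SUCCESS:		return 0;
	case KERN_INVALID_ADDRESS:	return EFAULT;
	case KERN_PROTECTION_FAILURE:	return EACCES;
	case KERN_NO_SPACE:		return ENOMEM;
	case KERN_INVALID_ARGUMENT:	return EINVAL;
	case KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	}
	/* Anything else has no single errno in the table above. */
	assert(0 && "unmapped code");
	return EINVAL;
}
```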


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
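
The two checks described above can be sketched in a few lines. Both function names are hypothetical, and the integer arithmetic is one possible reading of "percentage of pageable memory"; the actual pagedaemon code may compute the thresholds differently.

```c
#include <assert.h>

/* From the log entry: the three minimums may sum to at most 95%. */
#define MIN_SUM_LIMIT	95

/* Return 1 if the three minimum percentages are acceptable. */
int
minimums_ok(unsigned anonmin, unsigned vnodemin, unsigned vtextmin)
{
	return anonmin + vnodemin + vtextmin <= MIN_SUM_LIMIT;
}

/*
 * Return 1 if one page of a usage type may be reused, given that the
 * type currently holds `inuse` of `pageable` pages and is guaranteed
 * `minpct` percent of pageable memory.
 */
int
may_reclaim(unsigned long inuse, unsigned long pageable, unsigned minpct)
{
	assert(minpct <= 100);
	if (inuse == 0)
		return 0;
	/* After the reuse, the count must still be >= minpct% of pageable. */
	return (inuse - 1) * 100 >= pageable * minpct;
}
```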


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
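
As an illustration of what a minimum power-of-two alignment means for a candidate address, here is a small rounding helper. The name align_up() is hypothetical, and treating an alignment of 0 as "no alignment requested" is an assumption of this sketch rather than a statement about the uvm_map() interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.  align must be a
 * power of two; 0 is treated here as "no alignment requested".
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;

	/* A power of two has exactly one bit set. */
	assert((align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```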


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
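
The sizing rule above (a quarter of physical memory, clamped by machine-dependent bounds) amounts to the following sketch. The function name kmem_autosize() and its parameters are hypothetical; in the kernel the bounds come from NKMEMPAGES_MIN and NKMEMPAGES_MAX.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the number of kmem pages: physical pages divided by 4,
 * clamped to the range [min_pages, max_pages].
 */
size_t
kmem_autosize(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);
	n = physpages / 4;
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```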


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
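
The three allocation strategies described in this entry can be sketched as follows. This is a simplified model and not the kernel implementation: the names (struct freelists, pagealloc_strat, STRAT_*) are hypothetical, each free list is reduced to a page counter, and index 0 is assumed to be the highest-priority list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the kernel's definitions. */
enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

#define NFREELIST 3

/* Free page count per list; index 0 is assumed to be highest priority. */
struct freelists {
	int npages[NFREELIST];
};

/* Take one page from the given list; return the list index or -1. */
static int
take_from(struct freelists *fl, int list)
{
	if (list < 0 || list >= NFREELIST || fl->npages[list] == 0)
		return -1;
	fl->npages[list]--;
	return list;
}

/* Walk the lists from high to low priority; take the first page found. */
static int
take_normal(struct freelists *fl)
{
	for (int i = 0; i < NFREELIST; i++) {
		if (take_from(fl, i) == i)
			return i;
	}
	return -1;
}

/*
 * Return the index of the list the page came from, or -1 on failure.
 * The free list argument is ignored for STRAT_NORMAL.
 */
static int
pagealloc_strat(struct freelists *fl, enum strat strat, int list)
{
	int got;

	switch (strat) {
	case STRAT_NORMAL:
		return take_normal(fl);
	case STRAT_ONLY:
		return take_from(fl, list);
	case STRAT_FALLBACK:
		got = take_from(fl, list);
		return got != -1 ? got : take_normal(fl);
	}
	return -1;
}
```

With list 0 empty, an `only' request for list 0 fails, while a `fallback' request for the same list is satisfied from the next list that has a page.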


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.226 09-May-2020 thorpej

Make the uvm_voaddr structure more compact, only occupying 2 pointers
worth of space, by encoding the type in the lower bits of the object
pointer.
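
The general technique of storing a type tag in the low bits of an aligned pointer can be sketched as below. The structure layout, tag names, and tag values here are assumptions made for illustration; the actual uvm_voaddr definition may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags; the real header may use other names and values. */
#define X_VOADDR_TYPE_MASK	((uintptr_t)0x3)
#define X_VOADDR_TYPE_UOBJ	((uintptr_t)0x1)
#define X_VOADDR_TYPE_ANON	((uintptr_t)0x2)

struct x_voaddr {
	uintptr_t object;	/* object pointer, type in the low 2 bits */
	uintptr_t offset;	/* offset within the object */
};

static void
x_voaddr_set(struct x_voaddr *va, void *obj, uintptr_t type,
    uintptr_t offset)
{
	/* The low bits are only free if the object is 4-byte aligned. */
	assert(((uintptr_t)obj & X_VOADDR_TYPE_MASK) == 0);
	assert((type & ~X_VOADDR_TYPE_MASK) == 0);
	va->object = (uintptr_t)obj | type;
	va->offset = offset;
}

/* Extract the type tag from the low bits. */
static uintptr_t
x_voaddr_type(const struct x_voaddr *va)
{
	return va->object & X_VOADDR_TYPE_MASK;
}

/* Recover the original pointer by masking the tag off. */
static void *
x_voaddr_object(const struct x_voaddr *va)
{
	return (void *)(va->object & ~X_VOADDR_TYPE_MASK);
}
```

Because the tag shares storage with the pointer, the structure needs only two pointer-sized words rather than three.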


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
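
The batching scheme described in the second item can be modelled roughly as follows. This is a single-threaded sketch with hypothetical names; locking, the real page states, and the actual per-CPU data structures are left out. Only the queue length of 128 comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define X_BATCH_MAX 128	/* queue length taken from the commit message */

enum x_pgstate { X_PG_INACTIVE, X_PG_ACTIVE };

struct x_page {
	enum x_pgstate intended;	/* set under the page interlock */
	enum x_pgstate actual;		/* global state, updated in batch */
};

/* One of these would exist per CPU. */
struct x_cpu_batch {
	struct x_page *pg[X_BATCH_MAX];
	size_t n;
};

/* Make all queued state changes real, then empty the queue. */
static void
x_batch_flush(struct x_cpu_batch *b)
{
	for (size_t i = 0; i < b->n; i++)
		b->pg[i]->actual = b->pg[i]->intended;
	b->n = 0;
}

/* Record the intended state and queue the page for a later flush. */
static void
x_page_set_intent(struct x_cpu_batch *b, struct x_page *pg,
    enum x_pgstate st)
{
	pg->intended = st;
	if (b->n == X_BATCH_MAX)
		x_batch_flush(b);	/* queue full: apply the batch */
	b->pg[b->n++] = pg;
}
```

The point of the design is that the global state is touched once per batch instead of once per page, which is where the reduction in contention would come from.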


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
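
The constraint between the two flags can be expressed as a simple validity check. The bit values and names below are placeholders chosen for illustration and are not the values defined in uvm_extern.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values; not the real UVM_FLAG_* definitions. */
#define X_FLAG_FIXED	0x1u
#define X_FLAG_UNMAP	0x2u

/* UNMAP is only meaningful when FIXED is also requested. */
static bool
x_map_flags_valid(unsigned int flags)
{
	if ((flags & X_FLAG_UNMAP) != 0 && (flags & X_FLAG_FIXED) == 0)
		return false;
	return true;
}
```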


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL-only API and a non-_KERNEL-only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel-only datatype, viz.
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This conforms to its documentation in uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
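
The notion of an address "color" in the range 0..colormask can be sketched as below. The page shift of 12 and the helper names are assumptions for illustration; the sketch also assumes colormask is one less than a power-of-two number of colors.

```c
#include <assert.h>
#include <stdint.h>

#define X_PAGE_SHIFT	12	/* assumed 4 KiB pages */
#define X_PAGE_SIZE	((uintptr_t)1 << X_PAGE_SHIFT)

/* Color of an address: its page number modulo the number of colors. */
static uintptr_t
x_va_color(uintptr_t va, uintptr_t colormask)
{
	return (va >> X_PAGE_SHIFT) & colormask;
}

/*
 * Round hint up to the next page-aligned address whose color is
 * `color'.  colormask must be (number of colors - 1), a power of two
 * minus one.
 */
static uintptr_t
x_va_colormatch(uintptr_t hint, uintptr_t color, uintptr_t colormask)
{
	uintptr_t va = (hint + X_PAGE_SIZE - 1) & ~(X_PAGE_SIZE - 1);
	uintptr_t delta = (color - x_va_color(va, colormask)) & colormask;

	return va + (delta << X_PAGE_SHIFT);
}
```

Choosing a kernel address with the same color as the user page keeps the two mappings congruent in a virtually indexed cache.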


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be
allocated so that VA and PA have the same color. On a page fault, choose a
physical page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce
the amount of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth to move the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward declarations of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
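
The trailing-zero trimming described above can be sketched as follows.
This is a minimal userland illustration, not the kernel's coredump code;
the helper name trimmed_length() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of buf once trailing zero bytes are dropped.
 * Hypothetical helper for illustration only.
 */
static size_t
trimmed_length(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	/* Walk back from the end while the last byte is zero. */
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

A section that is entirely zero trims to length 0, so under this sketch
nothing would need to be written for it.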


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
this giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
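
The table above can be read as a translation function. The sketch below
is illustrative only: the OLD_KERN_* enumerators stand in for the retired
codes (their numeric values here are assumptions), and returning -1 for
codes with no single errno equivalent is a choice made for this example.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the retired codes; numeric values are assumptions. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE,
	OLD_KERN_NOT_RECEIVER,
	OLD_KERN_NO_ACCESS,
	OLD_KERN_PAGES_LOCKED
};

/* Map an old code to its E* replacement, or -1 if there is no single one. */
static int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	case OLD_KERN_FAILURE:		/* various, mostly became KASSERTs */
	case OLD_KERN_NOT_RECEIVER:	/* unused */
	case OLD_KERN_NO_ACCESS:	/* unused */
	case OLD_KERN_PAGES_LOCKED:	/* unused */
		return -1;
	}
	assert(0 && "not a known code");
	return -1;
}
```

Note that two distinct old codes collapse into ENOMEM, so the mapping is
not reversible.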


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
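
The sizing rule described above (physical memory divided by 4, then bounded)
can be sketched as follows. This is an illustration only, not the kernel's
code: the function name and the choice to pass the bounds as parameters are
assumptions made for this example; in the kernel the bounds would come from
NKMEMPAGES_MIN and NKMEMPAGES_MAX.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: take a quarter of physical memory (counted in
 * pages) and clamp the result to the range [min_pages, max_pages].
 */
static size_t
sk_nkmempages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
	size_t npages;

	assert(min_pages <= max_pages);
	npages = physmem_pages / 4;
	if (npages < min_pages)
		npages = min_pages;
	if (npages > max_pages)
		npages = max_pages;
	return npages;
}
```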


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
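
The three allocation strategies described in this commit can be sketched as
below. This is a toy model, not uvm_pagealloc_strat() itself: the sk_ names
are invented, each free list is reduced to a page counter, and index 0 is
assumed to be the highest-priority list. The function returns the index of
the free list a page was taken from, or -1 if none was available.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NFREELIST	3	/* hypothetical stand-in for VM_NFREELIST */

enum sk_strat { SK_STRAT_NORMAL, SK_STRAT_ONLY, SK_STRAT_FALLBACK };

/* Take one page from free list lst; return lst on success, -1 if empty. */
static int
sk_take(size_t free_pages[SK_NFREELIST], int lst)
{
	if (free_pages[lst] == 0)
		return -1;
	free_pages[lst]--;
	return lst;
}

static int
sk_pagealloc_strat(size_t free_pages[SK_NFREELIST], enum sk_strat strat,
    int free_list)
{
	int lst;

	if (strat != SK_STRAT_NORMAL) {
		/* only / fallback: try the specified free list first. */
		assert(free_list >= 0 && free_list < SK_NFREELIST);
		if (sk_take(free_pages, free_list) == free_list)
			return free_list;
		if (strat == SK_STRAT_ONLY)
			return -1;	/* only: fail, no walk */
	}
	/* normal (and fallback after `only' failed): high -> low walk. */
	for (lst = 0; lst < SK_NFREELIST; lst++) {
		if (sk_take(free_pages, lst) == lst)
			return lst;
	}
	return -1;
}
```

As in the commit text, the free_list argument is ignored for the `normal'
case in this sketch.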


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.225 27-Apr-2020 rin

Add missing \ to fix build for PMAP_CACHE_VIVT, i.e., ARMv4 and prior.


Revision tags: bouyer-xenpvh-base2
# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
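
The batching idea described above can be sketched as follows. This is a
single-threaded toy model under stated assumptions, not the UVM code: all
sk_ names are invented, the page interlock is omitted, and "making the
state change real" is modelled as copying the intended state into an
applied field when the 128-entry queue is flushed.

```c
#include <assert.h>
#include <stddef.h>

#define SK_BATCH	128	/* queue length taken from the commit text */

enum sk_intent { SK_NONE, SK_ACTIVATE, SK_DEACTIVATE, SK_ENQUEUE, SK_DEQUEUE };

struct sk_page {
	enum sk_intent intent;	/* intended state, set by the caller */
	enum sk_intent applied;	/* state last made real by a flush */
};

struct sk_cpu_queue {
	struct sk_page *pages[SK_BATCH];
	size_t count;
	size_t flushes;		/* number of batch updates performed */
};

/* Apply all pending intents in one batch (global state update). */
static void
sk_flush(struct sk_cpu_queue *q)
{
	size_t i;

	for (i = 0; i < q->count; i++) {
		struct sk_page *pg = q->pages[i];

		if (pg->intent != SK_NONE) {	/* skip already-applied dups */
			pg->applied = pg->intent;
			pg->intent = SK_NONE;
		}
	}
	q->count = 0;
	q->flushes++;
}

/* Record the intended state and queue the page on the per-CPU queue. */
static void
sk_set_intent(struct sk_cpu_queue *q, struct sk_page *pg, enum sk_intent in)
{
	assert(in != SK_NONE);
	pg->intent = in;	/* real code holds the page interlock here */
	if (q->count == SK_BATCH)
		sk_flush(q);
	q->pages[q->count++] = pg;
}
```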


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
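
The constraint stated above (UVM_FLAG_UNMAP is only valid together with
UVM_FLAG_FIXED) can be sketched as a simple check. The SK_ flag names and
bit values below are placeholders invented for this example and are not the
values used in uvm_extern.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for UVM_FLAG_FIXED and UVM_FLAG_UNMAP. */
#define SK_FLAG_FIXED	0x1u
#define SK_FLAG_UNMAP	0x2u

static_assert((SK_FLAG_FIXED & SK_FLAG_UNMAP) == 0, "flags must not overlap");

/* Reject UNMAP unless FIXED is also specified. */
static bool
sk_map_flags_valid(unsigned int flags)
{
	if ((flags & SK_FLAG_UNMAP) != 0 && (flags & SK_FLAG_FIXED) == 0)
		return false;
	return true;
}
```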


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
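
The notion of address color used above can be sketched as follows. This is
an illustration under assumptions, not the UVM implementation: the sk_
names are invented, a 4 KiB page size is assumed, colormask is assumed to
be of the form 2^n - 1, and address overflow near the top of the address
space is not handled.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT	12	/* assumed 4 KiB pages */
#define SK_PAGE_SIZE	((uintptr_t)1 << SK_PAGE_SHIFT)

/* Color of an address: its page number masked by colormask. */
static unsigned int
sk_color(uintptr_t addr, unsigned int colormask)
{
	return (unsigned int)(addr >> SK_PAGE_SHIFT) & colormask;
}

/*
 * Return the lowest page-aligned address at or above hint whose color
 * equals the requested color (0..colormask).
 */
static uintptr_t
sk_colormatch(uintptr_t hint, unsigned int color, unsigned int colormask)
{
	uintptr_t va;
	unsigned int delta;

	assert(color <= colormask);
	va = (hint + SK_PAGE_SIZE - 1) & ~(SK_PAGE_SIZE - 1);
	/* Number of pages to skip; unsigned wraparound is intended. */
	delta = (color - sk_color(va, colormask)) & colormask;
	return va + ((uintptr_t)delta << SK_PAGE_SHIFT);
}
```

Passing the color of the starting user page as the requested color yields
a kernel address with the same color, which is the congruence the commit
message refers to.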


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth to move the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
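
The trailing-zero trimming mentioned above can be illustrated with a small
userland sketch. This is not the code from the commit; trimmed_length() is
a hypothetical helper name, and the real coredump path operates on ELF
sections rather than plain buffers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return how many leading bytes of a section
 * must be written so that only trailing zero bytes are omitted.
 */
size_t
trimmed_length(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

Zero bytes in the middle of the buffer are kept; only the run at the end
is dropped, so a reader can recover the original by zero-filling up to
the recorded section size.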


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
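
The "just attempt the access" pattern argued for above might look like the
following userland sketch. fake_copyin() and fake_fault are stand-ins
invented for this illustration so the example can run outside the kernel;
only the shape of the error handling is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the kernel's copyin(): returns 0 on success or an
 * errno value on failure.  The failure is injected through
 * fake_fault so the sketch can run in userland.
 */
int fake_fault;	/* 0 = succeed, otherwise the errno to return */

int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (fake_fault != 0)
		return fake_fault;
	memcpy(kaddr, uaddr, len);
	return 0;
}

/*
 * Fetch an int from "user space".  There is no up-front permission
 * check: a bad address and an i/o error during page-in are both
 * reported through the same return path.
 */
int
fetch_int(const void *uaddr, int *out)
{
	assert(out != NULL);
	return fake_copyin(uaddr, out, sizeof(*out));
}
```

A caller simply propagates a nonzero return value, which covers
permission failures and i/o errors alike.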


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check whether all requested pages were found.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.
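
The walker-plus-callback split described above can be sketched as follows.
All names here (struct region, walk_regions(), sum_size()) are hypothetical
and chosen for illustration; they are not the uvm_coredump_walkmap()
interface itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one mapped region. */
struct region {
	unsigned long start;
	unsigned long end;
};

/* Callback type: return 0 to continue, nonzero to stop the walk. */
typedef int (*region_cb)(const struct region *, void *cookie);

/*
 * Walk every region and hand it to the callback.  The walker knows
 * how to iterate; the callback decides what to do with each region.
 */
int
walk_regions(const struct region *r, size_t n, region_cb cb, void *cookie)
{
	size_t i;
	int error;

	assert(cb != NULL);
	for (i = 0; i < n; i++) {
		error = cb(&r[i], cookie);
		if (error != 0)
			return error;
	}
	return 0;
}

/* Example callback: accumulate the total size of all regions. */
int
sum_size(const struct region *r, void *cookie)
{
	unsigned long *total = cookie;

	*total += r->end - r->start;
	return 0;
}
```

Different core formats can then supply different callbacks (count
sections, write headers, write data) without duplicating the map walk.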


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.
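
The "no-op unless the count grows" rule above can be sketched as below.
This is an assumption-laden illustration: page_recolor() and the global
ncolors are hypothetical stand-ins, and the redistribution of free pages
that a real recoloring would need is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the color count kept in uvmexp. */
unsigned int ncolors = 1;	/* one bucket unless MD code says otherwise */

/*
 * Returns true if the color count was raised, false if the call
 * was a no-op because newncolors does not exceed the current count.
 * A real implementation would also redistribute the free pages.
 */
bool
page_recolor(unsigned int newncolors)
{
	assert(newncolors > 0);
	if (newncolors <= ncolors)
		return false;
	ncolors = newncolors;
	return true;
}
```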


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
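
A round-robin bucket selection of the kind named above might be sketched
like this. The bucket count, the per-bucket counters, and
alloc_page_color() are all invented for the illustration; the sketch
tracks only how many pages each bucket holds, not the pages themselves.

```c
#include <assert.h>

#define NCOLORS 4	/* illustrative constant; MD code would define it */

unsigned int free_count[NCOLORS];	/* free pages per color bucket */
unsigned int next_color;		/* where the next search starts */

/*
 * Take a page from the next non-empty bucket, starting at next_color
 * and advancing round-robin.  Returns the color used, or -1 if every
 * bucket is empty.
 */
int
alloc_page_color(void)
{
	unsigned int i, c;

	assert(next_color < NCOLORS);
	for (i = 0; i < NCOLORS; i++) {
		c = (next_color + i) % NCOLORS;
		if (free_count[c] > 0) {
			free_count[c]--;
			next_color = (c + 1) % NCOLORS;
			return (int)c;
		}
	}
	return -1;
}
```

Successive allocations therefore hop from one color to the next, skipping
empty buckets, which spreads pages across cache bins.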


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS                0
KERN_INVALID_ADDRESS        EFAULT
KERN_PROTECTION_FAILURE     EACCES
KERN_NO_SPACE               ENOMEM
KERN_INVALID_ARGUMENT       EINVAL
KERN_FAILURE                various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE      ENOMEM
KERN_NOT_RECEIVER           <unused>
KERN_NO_ACCESS              <unused>
KERN_PAGES_LOCKED           <unused>
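
For readers who want the mapping in executable form, here is a minimal
sketch. The OLD_KERN_* enumerators and old_kern_to_errno() are hypothetical
names, and their numeric values are not the historical ones. KERN_FAILURE
is left out because the commit gives it no single errno equivalent, and the
unused codes are omitted as well.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative names only; numeric values are not the historical ones. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	assert(!"code has no direct errno equivalent");
	return -1;
}
```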


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
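
The 95% cap on the sum of the minimums can be illustrated with a short
sketch. struct usage_min and set_min() are hypothetical; the field names
simply mirror the three sysctl tunables named above, and the range check
on individual values is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN_TOTAL_LIMIT 95	/* percent, from the commit message */

struct usage_min {
	int anonmin;	/* vm.anonmin, percent */
	int vnodemin;	/* vm.vnodemin, percent */
	int vtextmin;	/* vm.vtextmin, percent */
};

/*
 * Try to set one minimum.  The update is rejected if the value is
 * out of range or if the three minimums would sum to more than 95%.
 */
bool
set_min(struct usage_min *m, int *field, int newval)
{
	int total;

	assert(field == &m->anonmin || field == &m->vnodemin ||
	    field == &m->vtextmin);
	if (newval < 0 || newval > 100)
		return false;
	total = m->anonmin + m->vnodemin + m->vtextmin - *field + newval;
	if (total > MIN_TOTAL_LIMIT)
		return false;
	*field = newval;
	return true;
}
```

Rejecting the update rather than clamping it keeps the three values
exactly as the administrator last set them.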


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
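
The sizing rule described above (a quarter of physical memory, then clamped
by machine-dependent bounds) can be sketched as follows. This is a minimal
sketch under stated assumptions: the function name `autosize_nkmempages` and
its parameters are hypothetical, and it does not model an explicit
NKMEMPAGES override.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick the kmem_map size in pages.
 *
 * physmem_pages: amount of physical memory, in pages.
 * min_pages, max_pages: lower and upper bounds, standing in for the
 * NKMEMPAGES_MIN / NKMEMPAGES_MAX values mentioned in the commit.
 */
size_t
autosize_nkmempages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
	size_t npages;

	assert(min_pages <= max_pages);

	npages = physmem_pages / 4;	/* a quarter of physical memory */
	if (npages < min_pages)
		npages = min_pages;	/* apply the lower bound */
	if (npages > max_pages)
		npages = max_pages;	/* apply the upper bound */
	return npages;
}
```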


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
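
The three allocation strategies can be illustrated with a toy model in which
each free list is only a page counter. This is a sketch, not the UVM
implementation: the names `freelists`, `take_page`, `pagealloc_strat` and the
`STRAT_*` constants are hypothetical, and it assumes that list index 0 has
the highest priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NFREELIST	3	/* hypothetical stand-in for VM_NFREELIST */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/* Toy model: each free list is just a count of available pages. */
struct freelists {
	size_t npages[NFREELIST];
};

/* Take one page from the given list; return false if that list is empty. */
bool
take_page(struct freelists *fl, int list)
{
	if (fl->npages[list] == 0)
		return false;
	fl->npages[list]--;
	return true;
}

/*
 * Return the index of the free list a page was taken from, or -1 if no
 * page could be allocated.  The list argument is ignored for STRAT_NORMAL.
 */
int
pagealloc_strat(struct freelists *fl, enum strat strat, int list)
{
	if (strat == STRAT_ONLY || strat == STRAT_FALLBACK) {
		assert(list >= 0 && list < NFREELIST);
		if (take_page(fl, list))
			return list;
		if (strat == STRAT_ONLY)
			return -1;	/* only: fail, do not look elsewhere */
		/* fallback: continue with the normal walk below */
	}
	/* normal: walk the lists from highest to lowest priority */
	for (int i = 0; i < NFREELIST; i++)
		if (take_page(fl, i))
			return i;
	return -1;
}
```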


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.224 23-Apr-2020 ad

PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

branches: 1.222.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
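
The batching idea in the second item above can be sketched with a toy
per-CPU queue: an intended state is recorded on the page right away, and the
change is only made real when the queue is drained. This is a rough sketch
under stated assumptions, not the kernel code: all names (`pdq`, `pdq_defer`,
`pdq_flush`, `toy_page`) are hypothetical, locking is omitted, and only the
128-entry batch size is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define PDQ_BATCH	128	/* batch size given in the commit message */

enum pgstate { PG_NONE, PG_ACTIVE, PG_INACTIVE, PG_ENQUEUED, PG_DEQUEUED };

/* Toy page: the state in effect plus the state requested for it. */
struct toy_page {
	enum pgstate state;	/* state visible to the global queues */
	enum pgstate intent;	/* state requested, not yet applied */
};

/* Toy per-CPU queue of pages with pending state changes. */
struct pdq {
	struct toy_page *pg[PDQ_BATCH];
	size_t n;		/* number of queued pages */
	size_t flushes;		/* how many batches have been applied */
};

/* Apply every pending intent in one batch, then empty the queue. */
void
pdq_flush(struct pdq *q)
{
	for (size_t i = 0; i < q->n; i++) {
		q->pg[i]->state = q->pg[i]->intent;
		q->pg[i]->intent = PG_NONE;
	}
	q->n = 0;
	q->flushes++;
}

/* Record an intended state on the page and queue it for later. */
void
pdq_defer(struct pdq *q, struct toy_page *pg, enum pgstate intent)
{
	assert(q->n <= PDQ_BATCH);

	pg->intent = intent;
	if (q->n == PDQ_BATCH)
		pdq_flush(q);	/* queue is full: drain it first */
	q->pg[q->n++] = pg;
}
```

In this model the shared state is touched once per batch rather than once
per page, which is the source of the reduced contention the commit describes.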


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
creating a second range of virtual addresses that references the same
physical pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
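
The notion of an address "color" in the range 0..colormask can be sketched
as below. This is an illustrative sketch only, not the uvm_map() code: the
names `va_color`, `colormatch_addr` and `SK_PAGE_SHIFT` are hypothetical, and
4 KiB pages and a power-of-two number of colors are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT	12	/* assumption: 4 KiB pages */

/* Color of an address: its page number masked by colormask (ncolors - 1). */
unsigned
va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> SK_PAGE_SHIFT) & colormask;
}

/*
 * Return the lowest page-aligned address at or above hint whose color
 * equals the requested color.
 */
uintptr_t
colormatch_addr(uintptr_t hint, unsigned color, unsigned colormask)
{
	const uintptr_t pagesz = (uintptr_t)1 << SK_PAGE_SHIFT;
	uintptr_t va;
	unsigned skip;

	assert(((colormask + 1) & colormask) == 0);	/* power-of-two colors */
	assert(color <= colormask);

	va = (hint + pagesz - 1) & ~(pagesz - 1);	/* round up to a page */
	/* number of pages to skip, modulo the number of colors */
	skip = (color - va_color(va, colormask)) & colormask;
	return va + ((uintptr_t)skip << SK_PAGE_SHIFT);
}
```

Passing the color of the first user page as the requested color gives a
kernel address whose color matches that page, which is the congruence the
commit message refers to.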


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be
allocated so that VA and PA have the same color. On a page fault, choose a
physical page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the
amount of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)
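
A minimal userland sketch of how a program might notice the new limit,
using the standard getrlimit(2) call with RLIMIT_AS. The helper names
get_as_limit() and print_as_limit() are hypothetical, not part of this
change.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Fetch the address-space limit; returns 0 on success, -1 on error. */
static int
get_as_limit(struct rlimit *rl)
{
	assert(rl != NULL);
	return getrlimit(RLIMIT_AS, rl);
}

/* Report the soft limit, treating RLIM_INFINITY as "unlimited". */
static void
print_as_limit(const struct rlimit *rl)
{
	if (rl->rlim_cur == RLIM_INFINITY)
		printf("address space: unlimited\n");
	else
		printf("address space: %llu bytes\n",
		    (unsigned long long)rl->rlim_cur);
}
```

With the unlimited default described above, print_as_limit() would be
expected to take the RLIM_INFINITY branch unless the limit was lowered.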


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
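
The allocation preference above can be sketched with plain counters,
assuming each free page is accounted both on its CPU's list and on the
global list. NCPU, struct freelists and both function names are
hypothetical; the real code manages page queues, not counters.

```c
#include <assert.h>

#define NCPU	2	/* assumed CPU count, for illustration only */

struct freelists {
	unsigned global;	/* free pages on the global list */
	unsigned percpu[NCPU];	/* of those, pages also on each CPU's list */
};

/* Freeing puts the page on the local CPU's list and the global list. */
static void
page_free(struct freelists *fl, unsigned cpu)
{
	assert(cpu < NCPU);
	fl->percpu[cpu]++;
	fl->global++;
}

/*
 * Allocating prefers the local CPU's list and falls back to any page on
 * the global list.  Returns the CPU whose list supplied the page, or -1
 * if no page is free.
 */
static int
page_alloc(struct freelists *fl, unsigned cpu)
{
	unsigned src = cpu;

	assert(cpu < NCPU);
	if (fl->global == 0)
		return -1;
	if (fl->percpu[cpu] == 0) {
		/* Local list is empty: take a page that another CPU freed. */
		for (src = 0; src < NCPU; src++)
			if (fl->percpu[src] > 0)
				break;
	}
	assert(src < NCPU);
	fl->percpu[src]--;
	fl->global--;
	return (int)src;
}
```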


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.
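
The log does not say which parentheses were missing. As a general
illustration only, the sketch below shows why a flag-packing macro needs
them; PACK_BAD and PACK_GOOD are hypothetical macros, not the real
UVM_MAPFLAG definition.

```c
#include <assert.h>

/* Unparenthesized: arguments and result are exposed to precedence. */
#define PACK_BAD(prot, inh)	(inh << 4) | prot
/* Each argument and the whole expansion are parenthesized. */
#define PACK_GOOD(prot, inh)	(((inh) << 4) | (prot))

static void
demo(void)
{
	/* ((0x2 | 0x4) << 4) | 0x1 == 0x61, as intended. */
	assert(PACK_GOOD(0x1, 0x2 | 0x4) == 0x61);
	/*
	 * Expands to (0x2 | 0x4 << 4) | 0x1: only 0x4 is shifted, giving
	 * 0x43.  The extra parentheses around the macro call are needed
	 * because the expansion itself is not parenthesized.
	 */
	assert((PACK_BAD(0x1, 0x2 | 0x4)) == 0x43);
}
```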


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.
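
As a rough illustration of the no-op rule described above, here is a minimal sketch. It is not the NetBSD implementation: ncolors_sketch stands in for uvmexp.ncolors, page_recolor_sketch() is a hypothetical name, and the real routine would also redistribute free pages into the new bins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for uvmexp.ncolors; starts at the one-bucket default. */
unsigned int ncolors_sketch = 1;

/*
 * Sketch of the rule in the log message: recoloring only ever grows the
 * number of color bins. Returns true if the bin count changed.
 */
bool
page_recolor_sketch(unsigned int newncolors)
{
    assert(newncolors > 0);
    if (newncolors <= ncolors_sketch)
        return false;           /* smaller or equal: no-op */
    ncolors_sketch = newncolors; /* real code would also rebucket free pages */
    return true;
}
```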


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
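
The round-robin bucket selection mentioned above can be pictured with the following sketch. The names and the bucket count are assumptions made for illustration; the actual allocator also has to cope with empty buckets, which this omits.

```c
#include <assert.h>

#define NCOLORS_SKETCH 4    /* hypothetical MD-defined number of buckets */

/* Bucket to try on the next allocation. */
static unsigned int next_color_sketch;

/* Return the bucket for this allocation and advance round-robin. */
unsigned int
pick_color_sketch(void)
{
    unsigned int color = next_color_sketch;

    next_color_sketch = (next_color_sketch + 1) % NCOLORS_SKETCH;
    return color;
}
```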


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
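
For readers porting older code, the mapping can be written as a lookup. This is only a reading aid under stated assumptions: the commit replaced the codes at their call sites rather than translating at run time, and the OLD_KERN_* names and their numeric values here are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for the retired codes; the values are illustrative. */
enum old_kern_sketch {
    OLD_KERN_SUCCESS,
    OLD_KERN_INVALID_ADDRESS,
    OLD_KERN_PROTECTION_FAILURE,
    OLD_KERN_NO_SPACE,
    OLD_KERN_INVALID_ARGUMENT,
    OLD_KERN_FAILURE,
    OLD_KERN_RESOURCE_SHORTAGE
};

/* Returns the errno replacement, or -1 where the log gives no single value. */
int
kern_to_errno_sketch(enum old_kern_sketch code)
{
    switch (code) {
    case OLD_KERN_SUCCESS:            return 0;
    case OLD_KERN_INVALID_ADDRESS:    return EFAULT;
    case OLD_KERN_PROTECTION_FAILURE: return EACCES;
    case OLD_KERN_NO_SPACE:           return ENOMEM;
    case OLD_KERN_INVALID_ARGUMENT:   return EINVAL;
    case OLD_KERN_RESOURCE_SHORTAGE:  return ENOMEM;
    case OLD_KERN_FAILURE:            return -1; /* "various" in the log */
    }
    return -1;
}
```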


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-setable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
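
The two rules above (the 95% cap on the sum of the minimums, and the refusal to reclaim a page when that would drop its type below the minimum) might look roughly like this. All names are hypothetical and the arithmetic is a simplification, not the pagedaemon's actual accounting.

```c
#include <assert.h>
#include <stdbool.h>

#define MINSUM_LIMIT_SKETCH 95  /* percent, as stated in the log message */

/* Hypothetical holder for vm.anonmin, vm.vnodemin and vm.vtextmin (percent). */
struct mins_sketch {
    unsigned int anonmin, vnodemin, vtextmin;
};

/* Would these minimums keep their sum within the 95% cap? */
bool
mins_valid_sketch(const struct mins_sketch *m)
{
    return m->anonmin + m->vnodemin + m->vtextmin <= MINSUM_LIMIT_SKETCH;
}

/*
 * May one page of a given type be reused? Only if the count left
 * afterwards is still at least minpct percent of pageable memory.
 */
bool
can_reclaim_sketch(unsigned long typepages, unsigned long pageable,
    unsigned int minpct)
{
    if (typepages == 0)
        return false;
    return (typepages - 1) * 100 >= (unsigned long)minpct * pageable;
}
```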


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses a ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
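
A minimum power-of-two alignment is typically honoured by rounding the candidate address up, as in the sketch below. The function name is hypothetical and this is not uvm_map()'s code, which also has to search for free space.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
uintptr_t
align_up_sketch(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```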


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.
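
The idle-loop behaviour described above can be sketched as a loop that moves pages from the unknown-contents queue to the known-zero queue until nothing is left or something becomes runnable. Every name below is a hypothetical stand-in (whichqs_sketch for whichqs, a tiny array for the page queues); the real code uses the pmap hook to zero pages.

```c
#include <assert.h>
#include <string.h>

#define PGSZ_SKETCH 16  /* tiny "page" size, for illustration only */
#define NPG_SKETCH  4

/* Hypothetical free list with the two queues the log describes. */
struct freelist_sketch {
    unsigned char unknown[NPG_SKETCH][PGSZ_SKETCH]; /* contents not known */
    int nunknown;   /* pages waiting on the unknown-contents queue */
    int nzero;      /* pages on the known-zero queue */
};

/* Stand-in for whichqs: non-zero means a process is ready to run. */
volatile int whichqs_sketch;

/* Zero queued pages until none remain or a process becomes runnable. */
int
pageidlezero_sketch(struct freelist_sketch *fl)
{
    int done = 0;

    while (fl->nunknown > 0 && whichqs_sketch == 0) {
        memset(fl->unknown[fl->nunknown - 1], 0, PGSZ_SKETCH);
        fl->nunknown--; /* leaves the unknown-contents queue ... */
        fl->nzero++;    /* ... and joins the known-zero queue */
        done++;
    }
    return done;
}
```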


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
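
The sizing rule reads as "a quarter of physical memory, clamped to bounds". A minimal sketch, assuming the bounds are passed in as page counts (the function and parameter names are hypothetical, standing in for NKMEMPAGES_MIN and NKMEMPAGES_MAX):

```c
#include <assert.h>
#include <stddef.h>

/* physpages / 4, clamped to [minpages, maxpages]. */
size_t
nkmempages_sketch(size_t physpages, size_t minpages, size_t maxpages)
{
    size_t n = physpages / 4;

    assert(minpages <= maxpages);
    if (n > maxpages)
        n = maxpages;
    if (n < minpages)
        n = minpages;
    return n;
}
```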


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change now these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
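
The three strategies can be summarised in a small model where each free list is just a page counter. This is a sketch under assumptions: the names are hypothetical, index 0 is taken to be the highest-priority (default) list, and the real uvm_pagealloc_strat() returns a struct vm_page rather than a list index.

```c
#include <assert.h>

#define NFREELIST_SKETCH 3  /* hypothetical VM_NFREELIST */

enum strat_sketch { STRAT_NORMAL_S, STRAT_ONLY_S, STRAT_FALLBACK_S };

/* Free page count per list; index 0 is the highest priority. */
int freecount_sketch[NFREELIST_SKETCH];

/* Take one page from list fl; return fl on success, -1 if it is empty. */
static int
take_from_sketch(int fl)
{
    if (freecount_sketch[fl] == 0)
        return -1;
    freecount_sketch[fl]--;
    return fl;
}

/* Returns the list the page came from, or -1 if allocation failed. */
int
pagealloc_strat_sketch(enum strat_sketch strat, int free_list)
{
    int i, fl;

    if (strat != STRAT_NORMAL_S) {
        assert(free_list >= 0 && free_list < NFREELIST_SKETCH);
        fl = take_from_sketch(free_list);
        if (fl != -1 || strat == STRAT_ONLY_S)
            return fl;  /* `only' never looks elsewhere */
    }
    /* `normal', or `fallback' after `only' failed: walk high -> low. */
    for (i = 0; i < NFREELIST_SKETCH; i++)
        if ((fl = take_from_sketch(i)) != -1)
            return fl;
    return -1;
}
```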


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.223 18-Apr-2020 thorpej

Add an API to get a reference on the identity of an individual byte of
virtual memory, a "virtual object address". This is not a reference to
a physical byte of memory, per se, but a reference to a byte residing
in a page, owned by a unique UVM object (either a uobj or an anon). Two
separate address+address-space tuples that reference the same byte in
an object (such as a location in a shared memory segment) will resolve
to equivalent virtual object addresses. Even if the residency status
of the page changes, the virtual object address remains unchanged.

struct uvm_voaddr -- a structure that encapsulates this address reference.

uvm_voaddr_acquire() -- a function to acquire this address reference,
given a vm_map and a vaddr_t.

uvm_voaddr_release() -- a function to release this address reference.

uvm_voaddr_compare() -- a function to compare two such address references.

uvm_voaddr_acquire() resolves the COW status of the object address before
acquiring.

In collaboration with riastradh@ and chs@.
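
Conceptually, a virtual object address identifies "which owner, which byte within it", so two mappings of the same shared byte compare equal. The sketch below models that idea only; the struct layout, names and the three-way return convention are assumptions, not the real struct uvm_voaddr or uvm_voaddr_compare().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the owning object (uobj or anon) plus a byte offset. */
struct voaddr_sketch {
    const void *owner;
    uint64_t offset;
};

/* Three-way comparison: negative, zero or positive. */
int
voaddr_compare_sketch(const struct voaddr_sketch *a,
    const struct voaddr_sketch *b)
{
    uintptr_t oa = (uintptr_t)a->owner;
    uintptr_t ob = (uintptr_t)b->owner;

    if (oa != ob)
        return oa < ob ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}
```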


Revision tags: phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.222 22-Mar-2020 ad

Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
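
The batching idea (record an intended state per page, queue the page on a 128-entry per-CPU queue, and apply the changes to global state in one pass) can be sketched as follows. All types and names are hypothetical and locking is omitted entirely; the real code sets the intent while holding the page interlock.

```c
#include <assert.h>
#include <stddef.h>

#define PDQ_BATCH_SKETCH 128    /* queue length quoted in the log */

enum pgstate_sketch { PG_INACTIVE_S, PG_ACTIVE_S };

struct page_sketch {
    enum pgstate_sketch intent; /* what the page should become */
    enum pgstate_sketch state;  /* global state, updated in batch */
};

struct cpuq_sketch {
    struct page_sketch *pg[PDQ_BATCH_SKETCH];
    size_t n;
    unsigned int flushes;       /* times the global state was touched */
};

/* Make all queued intents real in one pass. */
void
cpuq_flush_sketch(struct cpuq_sketch *q)
{
    size_t i;

    for (i = 0; i < q->n; i++)
        q->pg[i]->state = q->pg[i]->intent;
    q->n = 0;
    q->flushes++;
}

/* Record the intent now; defer the global update until the queue fills. */
void
page_set_intent_sketch(struct cpuq_sketch *q, struct page_sketch *pg,
    enum pgstate_sketch intent)
{
    pg->intent = intent;
    if (q->n == PDQ_BATCH_SKETCH)
        cpuq_flush_sketch(q);
    q->pg[q->n++] = pg;
}
```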


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

branches: 1.213.2;
allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
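
The constraint in the last sentence is a simple flag dependency. A minimal sketch, with hypothetical bit values standing in for UVM_FLAG_FIXED and UVM_FLAG_UNMAP (the real values are defined in uvm_extern.h and are not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define FLAG_FIXED_SKETCH 0x1u
#define FLAG_UNMAP_SKETCH 0x2u

/* UNMAP is only meaningful together with FIXED. */
bool
map_flags_valid_sketch(unsigned int flags)
{
    if ((flags & FLAG_UNMAP_SKETCH) == 0)
        return true;
    return (flags & FLAG_FIXED_SKETCH) != 0;
}
```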


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses that references the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.
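
The "fail rather than silently drop X" rule amounts to rejecting any request that asks for write and execute at once. A minimal sketch using the standard PROT_* constants; the function name is hypothetical and the real PAX logic has more inputs (per-process flags, the separately requested maximum protections).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Reject effective protections that are both writable and executable. */
bool
pax_prot_allowed_sketch(int prot)
{
    const int wx = PROT_WRITE | PROT_EXEC;

    return (prot & wx) != wx;
}
```

With duplicated mappings, one range can be checked with PROT_READ | PROT_WRITE and the other with PROT_READ | PROT_EXEC; each passes on its own.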


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This conforms to its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adopt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable,
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
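
A minimal sketch of the color arithmetic described above, assuming a
power-of-two number of colors so that colormask is (ncolors - 1). The helper
names va_color and va_colormatch and the page_shift parameter are
hypothetical; they are not UVM functions.

```c
#include <assert.h>
#include <stdint.h>

/* Color of an address: the low bits of its page number. */
uintptr_t
va_color(uintptr_t va, unsigned page_shift, uintptr_t colormask)
{
	return (va >> page_shift) & colormask;
}

/*
 * Round va up to the next page-aligned address whose color equals
 * `color`. Assumes colormask + 1 is a power of two.
 */
uintptr_t
va_colormatch(uintptr_t va, uintptr_t color, unsigned page_shift,
    uintptr_t colormask)
{
	uintptr_t page_size = (uintptr_t)1 << page_shift;
	uintptr_t skip;

	assert(color <= colormask);
	/* Align up to a page boundary first. */
	va = (va + page_size - 1) & ~(page_size - 1);
	/* Pages to skip to reach the requested color (modular distance). */
	skip = (color - va_color(va, page_shift, colormask)) & colormask;
	return va + (skip << page_shift);
}
```

With 4 KB pages and eight colors, va_colormatch(0x10000, 3, 12, 7) gives
0x13000, an address whose color is 3. A kernel mapping chosen this way has the
same color as a user page of color 3.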


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans of the LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
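
From userland, the new limit is reached through the standard
getrlimit(2)/setrlimit(2) interface. A minimal sketch, assuming a POSIX system
that defines RLIMIT_AS; the helper name limit_address_space is hypothetical.

```c
#include <sys/resource.h>

/*
 * Set the soft address-space limit to `bytes`, leaving the hard limit
 * alone. Returns 0 on success, -1 on failure (errno is set by
 * getrlimit or setrlimit).
 */
int
limit_address_space(rlim_t bytes)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_AS, &rl) == -1)
		return -1;
	/* The soft limit may not exceed the hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
		bytes = rl.rlim_max;
	rl.rlim_cur = bytes;
	return setrlimit(RLIMIT_AS, &rl);
}
```

Once the soft limit is lowered, mappings that would push the total address
space past it are expected to fail, typically with ENOMEM.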


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
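
The free/allocate policy above can be modeled roughly as follows. This is a
simplified, single-threaded sketch built on the <sys/queue.h> LIST macros; the
struct and function names are hypothetical, and locking, page coloring and the
idle-zero state are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define NCPU	2	/* illustrative CPU count */

struct page {
	LIST_ENTRY(page) global_link;	/* linkage on the global free list */
	LIST_ENTRY(page) cpu_link;	/* linkage on one CPU's free list */
};
LIST_HEAD(pagelist, page);

struct freelists {
	struct pagelist global;
	struct pagelist percpu[NCPU];
};

void
freelists_init(struct freelists *fl)
{
	LIST_INIT(&fl->global);
	for (int i = 0; i < NCPU; i++)
		LIST_INIT(&fl->percpu[i]);
}

/* Freeing: the page goes onto the local CPU's list and the global list. */
void
page_free(struct freelists *fl, struct page *pg, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	LIST_INSERT_HEAD(&fl->global, pg, global_link);
	LIST_INSERT_HEAD(&fl->percpu[cpu], pg, cpu_link);
}

/* Allocating: prefer the local CPU's list, else fall back to the global one. */
struct page *
page_alloc(struct freelists *fl, int cpu)
{
	struct page *pg;

	assert(cpu >= 0 && cpu < NCPU);
	pg = LIST_FIRST(&fl->percpu[cpu]);
	if (pg == NULL)
		pg = LIST_FIRST(&fl->global);
	if (pg == NULL)
		return NULL;
	/* A free page sits on both lists, so unlink it from both. */
	LIST_REMOVE(pg, global_link);
	LIST_REMOVE(pg, cpu_link);
	return pg;
}
```

Because LIST entries can be unlinked without knowing the list head, a page
taken from the global list can also be dropped from whichever CPU list holds it.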


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.
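
The commit does not say which parentheses were missing. The general hazard
with an unparenthesized macro argument can be illustrated with hypothetical
macros; PACK_BAD and PACK_GOOD below are not the UVM_MAPFLAG definition.

```c
/* Arguments are pasted in textually, so operator precedence applies
 * to the expanded expression, not to each argument as a unit. */
#define PACK_BAD(prot, maxprot)		(maxprot << 8 | prot)
#define PACK_GOOD(prot, maxprot)	(((maxprot) << 8) | (prot))

/* Expands to (0x1u | 0x2u << 8 | 0x1u): only 0x2u is shifted. */
unsigned
pack_bad(void)
{
	return PACK_BAD(0x1u, 0x1u | 0x2u);
}

/* Expands to (((0x1u | 0x2u) << 8) | (0x1u)): the whole argument is shifted. */
unsigned
pack_good(void)
{
	return PACK_GOOD(0x1u, 0x1u | 0x2u);
}
```

Here pack_bad() yields 0x201 while pack_good() yields the intended 0x301.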


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.
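
As a rough illustration of why the width matters, the time to wrap is 2^bits
divided by the event rate. The helper name seconds_to_wrap is hypothetical,
and the rate used in the note below is an arbitrary assumption, not a measured
figure.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Whole seconds until an unsigned counter of `bits` width wraps when it
 * is incremented `rate` times per second.
 */
uint64_t
seconds_to_wrap(unsigned bits, uint64_t rate)
{
	assert(bits >= 1 && bits <= 63 && rate > 0);
	return ((uint64_t)1 << bits) / rate;
}
```

At an assumed 100,000 events per second, a 32-bit counter wraps after 42949
seconds, a little under 12 hours.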


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.
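
The two-pass, opaque-cookie shape described above can be sketched as follows.
Everything here is a hypothetical stand-in: sink_write plays the role of
coredump_write, md_coredump the role of cpu_coredump, and the in-memory sink
and header layout are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sequential sink standing in for the real i/o method. */
struct mem_sink {
	char	buf[256];
	size_t	len;
};

/* Append-only write; the caller never seeks. */
int
sink_write(void *cookie, const void *data, size_t len)
{
	struct mem_sink *s = cookie;

	assert(cookie != NULL);
	if (len > sizeof(s->buf) - s->len)
		return -1;
	memcpy(s->buf + s->len, data, len);
	s->len += len;
	return 0;
}

struct core_hdr {
	size_t	md_size;	/* size of the machine-dependent part */
};

/*
 * MD stand-in: with a NULL cookie only fill in the header, otherwise
 * write the machine-dependent data through the cookie.
 */
int
md_coredump(void *cookie, struct core_hdr *hdr)
{
	static const char regs[] = "REGS";

	if (cookie == NULL) {
		hdr->md_size = sizeof(regs);
		return 0;
	}
	return sink_write(cookie, regs, sizeof(regs));
}

int
dump_core(void *cookie)
{
	struct core_hdr hdr = { .md_size = 0 };

	if (md_coredump(NULL, &hdr) != 0)	/* pass 1: fill the header */
		return -1;
	if (sink_write(cookie, &hdr, sizeof(hdr)) != 0)
		return -1;
	return md_coredump(cookie, &hdr);	/* pass 2: write the MD part */
}
```

Because the header is complete before anything is written, the output can be
produced strictly front to back, which is what makes it usable over a stream.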


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
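
Trimming trailing zeroes amounts to finding the length of a buffer up to its
last non-zero byte. A minimal sketch with a hypothetical helper name; the
actual ELF core dump code is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Length of buf once trailing zero bytes are dropped. */
size_t
trimmed_length(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

A section that is entirely zero trims to length 0, so nothing needs to be
written for it; zeroes in the middle of the data are kept.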


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)
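
Replacing a queue element in place can be sketched with the <sys/queue.h>
TAILQ macros. The struct layout and the helper name page_replace are
hypothetical; this only illustrates the ordering property the commit relies
on.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct page {
	int id;
	TAILQ_ENTRY(page) listq;
};
TAILQ_HEAD(pagequeue, page);

/* Put newpg exactly where oldpg was, then unlink oldpg. */
void
page_replace(struct pagequeue *q, struct page *oldpg, struct page *newpg)
{
	assert(oldpg != newpg);
	TAILQ_INSERT_AFTER(q, oldpg, newpg, listq);
	TAILQ_REMOVE(q, oldpg, listq);
}
```

A traversal that expects pages in a particular order sees the replacement in
the same slot, instead of at the head or tail of the queue.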


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
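
The same principle applies in userland: rather than asking beforehand whether
an access would be permitted, attempt it and handle whatever error comes back.
A minimal sketch with a hypothetical helper, read_first_byte; it is an
analogy, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read the first byte of a file with no prior permission check.
 * Returns 0 on success or an errno value on failure.
 */
int
read_first_byte(const char *path, char *out)
{
	int fd, error = 0;
	ssize_t n;

	assert(path != NULL && out != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return errno;	/* permission problems and everything else */
	n = read(fd, out, 1);
	if (n == -1)
		error = errno;	/* e.g. an i/o error no pre-check predicts */
	else if (n == 0)
		error = EINVAL;	/* empty file; illustrative choice of code */
	(void)close(fd);
	return error;
}
```

One error path covers every cause of failure, and the common successful case
pays for no extra check.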


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.
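
The commit does not give the flag values involved. One way such a bug can
arise is a flag whose bit overlaps the protection bits in the same word; the
sketch below shows that failure mode with hypothetical names and values, not
the real UVM constants.

```c
/* Hypothetical layout: protection in the low bits, flags higher up. */
#define X_PROT_R		0x0000001u
#define X_FLAG_NOWAIT		0x1000000u	/* a bit of its own */

#define X_KMF_NOWAIT_OLD	0x0000001u	/* collides with X_PROT_R */
#define X_KMF_NOWAIT		X_FLAG_NOWAIT	/* defined in terms of the map flag */

/* Old test: any readable mapping looks like a "nowait" request. */
int
is_nowait_old(unsigned flags)
{
	return (flags & X_KMF_NOWAIT_OLD) != 0;
}

/* New test: only an explicit nowait request matches. */
int
is_nowait_new(unsigned flags)
{
	return (flags & X_KMF_NOWAIT) != 0;
}
```

Defining one flag in terms of the other keeps a single bit assignment, so the
two names cannot drift apart again.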


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.
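
The calling convention can be sketched like this. The function find_pages and
its slot array are hypothetical and much simpler than uvn_findpages; the point
is only that a returned count lets the caller compare against the request.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Collect the non-NULL entries of slots[] into found[] and return how
 * many there were. found[] must have room for nslots entries.
 */
size_t
find_pages(void *const *slots, size_t nslots, void **found)
{
	size_t n = 0;

	assert(found != NULL);
	for (size_t i = 0; i < nslots; i++) {
		if (slots[i] != NULL)
			found[n++] = slots[i];
	}
	return n;
}
```

A caller that needs every page simply tests whether the return value equals
the number requested.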


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.
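
A callback-driven walk of this kind can be sketched as follows. The region
structure, the callback type and the function names are hypothetical; the real
uvm_coredump_walkmap() interface is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

struct region {
	unsigned long start;
	unsigned long end;
};

typedef int (*region_cb)(const struct region *, void *cookie);

/* Call cb on every region; stop at the first non-zero return. */
int
walk_regions(const struct region *map, size_t n, region_cb cb, void *cookie)
{
	assert(cb != NULL);
	for (size_t i = 0; i < n; i++) {
		int error = cb(&map[i], cookie);

		if (error != 0)
			return error;
	}
	return 0;
}

/* Example callback: add up region sizes in *cookie. */
int
sum_sizes(const struct region *r, void *cookie)
{
	*(unsigned long *)cookie += r->end - r->start;
	return 0;
}
```

The walker owns the iteration while each dump format supplies its own
callback, so one format might count sections on a first walk and write them on
a second.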


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
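
A minimal sketch of round-robin bucket selection, to illustrate the idea described above. The names color_state and pick_color() and the free_count array are hypothetical and are not taken from the UVM sources; the real allocator works on page free lists rather than bare counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for the sketch; not the UVM data structures. */
struct color_state {
	unsigned int ncolors;	/* number of color buckets */
	unsigned int next;	/* bucket to try first on the next allocation */
};

/*
 * Choose a bucket round-robin.  free_count[i] is the number of free
 * pages in bucket i.  Returns the chosen bucket index, or -1 if every
 * bucket is empty.  The caller is expected to take a page from the
 * returned bucket and update free_count itself.
 */
static int
pick_color(struct color_state *cs, const unsigned int *free_count)
{
	assert(cs->ncolors > 0);

	for (unsigned int i = 0; i < cs->ncolors; i++) {
		unsigned int c = (cs->next + i) % cs->ncolors;

		if (free_count[c] > 0) {
			/* Start the next search one bucket further on. */
			cs->next = (c + 1) % cs->ncolors;
			return (int)c;
		}
	}
	return -1;
}
```

Starting each search one bucket past the previous hit is what spreads consecutive allocations across colors; an empty bucket is simply skipped.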


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-setable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
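
The two rules in this entry can be sketched as follows. This is an illustration only: minimums_valid() and may_reuse_page() are hypothetical helpers, and the page counts are passed in explicitly rather than read from uvmexp.

```c
#include <assert.h>
#include <stdbool.h>

/* The entry states that the minimums may not sum to more than 95%. */
#define MAX_MIN_SUM_PCT	95

/* Check a proposed set of vm.anonmin/vm.vnodemin/vm.vtextmin values. */
static bool
minimums_valid(unsigned int anonmin, unsigned int vnodemin,
    unsigned int vtextmin)
{
	return anonmin + vnodemin + vtextmin <= MAX_MIN_SUM_PCT;
}

/*
 * Decide whether a page of some usage type may be reused.
 * type_pages:     pages currently in use for that type
 * pageable_pages: total pageable memory, in pages
 * min_pct:        reserved percentage for that type
 * Reuse is refused if taking one page away would leave the type
 * below its reserved share.
 */
static bool
may_reuse_page(unsigned long long type_pages,
    unsigned long long pageable_pages, unsigned int min_pct)
{
	assert(min_pct <= 100);

	if (type_pages == 0)
		return false;
	return (type_pages - 1) * 100 >= pageable_pages * min_pct;
}
```

The comparison is done in integer arithmetic (multiplying by 100 instead of dividing) so that small page counts are not lost to rounding.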


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
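
For illustration, rounding a candidate address up to a power-of-two alignment is commonly done with a mask. The helper name align_up() is hypothetical; this is a sketch of the arithmetic, not the code in uvm_map().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.
 * align must be a non-zero power of two.  The sketch does not guard
 * against wrap-around near the top of the address space.
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged, and an alignment of 1 leaves every address as it is.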


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
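
The sizing rule above amounts to a divide followed by a clamp. A minimal sketch, assuming the bounds are passed in as plain arguments (in the kernel they would come from NKMEMPAGES_MIN and NKMEMPAGES_MAX or their machine-dependent defaults); compute_nkmempages() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Take a quarter of physical memory (in pages) and clamp the result
 * to the [min_pages, max_pages] range.
 */
static size_t
compute_nkmempages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);

	n = physpages / 4;
	if (n > max_pages)
		n = max_pages;
	if (n < min_pages)
		n = min_pages;
	return n;
}
```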


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
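
The three strategies can be sketched as a choice of free list index. This is an illustration under simplifying assumptions: free lists are modelled as an array of free-page counters ordered from highest to lowest priority, and the names pga_strat, first_nonempty() and choose_freelist() are hypothetical rather than the UVM identifiers.

```c
#include <assert.h>

enum pga_strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/* Walk the lists from highest to lowest priority; -1 if all are empty. */
static int
first_nonempty(const unsigned int *free_count, int nlists)
{
	for (int i = 0; i < nlists; i++)
		if (free_count[i] > 0)
			return i;
	return -1;
}

/*
 * Return the index of the free list to allocate from, or -1 on failure.
 * 'which' is ignored for STRAT_NORMAL, as in the description above.
 */
static int
choose_freelist(const unsigned int *free_count, int nlists,
    enum pga_strat strat, int which)
{
	assert(nlists > 0);

	switch (strat) {
	case STRAT_NORMAL:
		return first_nonempty(free_count, nlists);
	case STRAT_ONLY:
		assert(which >= 0 && which < nlists);
		return free_count[which] > 0 ? which : -1;
	case STRAT_FALLBACK:
		assert(which >= 0 && which < nlists);
		if (free_count[which] > 0)
			return which;
		return first_nonempty(free_count, nlists);
	}
	return -1;
}
```

With lists { 0, 3, 0 }, "only" on list 2 fails, while "fallback" on list 2 ends up on list 1, the same answer "normal" gives.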


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.222 22-Mar-2020 ad

Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.


Revision tags: ad-namecache-base3
# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
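
A simplified sketch of the deferred state change described above: an intent is recorded on the page, the page is put on a fixed-size batch, and the batch is applied in one go when it fills. All names here (sketch_page, percpu_batch, batch_add, batch_flush) are hypothetical, and the locking that the commit describes (the page interlock, per-CPU ownership of the queue) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE	128	/* queue length mentioned in the entry */

enum pg_state { ST_DEQUEUED, ST_ACTIVE, ST_INACTIVE };

struct sketch_page {
	enum pg_state state;	/* state as seen by the global queues */
	enum pg_state intent;	/* state the page should move to */
};

struct percpu_batch {
	struct sketch_page *pages[BATCH_SIZE];
	size_t count;
};

/* Make all recorded intents real and empty the batch. */
static void
batch_flush(struct percpu_batch *b)
{
	for (size_t i = 0; i < b->count; i++)
		b->pages[i]->state = b->pages[i]->intent;
	b->count = 0;
}

/* Record an intended state; flush once the batch is full. */
static void
batch_add(struct percpu_batch *b, struct sketch_page *pg,
    enum pg_state intent)
{
	assert(b->count < BATCH_SIZE);

	pg->intent = intent;
	b->pages[b->count++] = pg;
	if (b->count == BATCH_SIZE)
		batch_flush(b);
}
```

In the sketch the global state is touched once per 128 pages rather than once per page, which is the source of the reduced contention the entry reports.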


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability issues and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the nr_pages members of the list of pools.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses that reference the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This conforms to its documentation in uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
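
A minimal sketch of the color arithmetic described above, assuming the
color of an address is its page number masked by colormask. The names
(sketch_color_of, sketch_colormatch), the 4 KiB page size and the
8-color mask are assumptions for illustration and are not taken from
the UVM sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; real ports define their own. */
#define SKETCH_PAGE_SHIFT 12u	/* 4 KiB pages (assumption) */
#define SKETCH_COLORMASK  7u	/* 8 page colors (assumption) */

/* Color of an address: its page number masked by the color mask. */
static unsigned
sketch_color_of(uintptr_t addr)
{
	return (unsigned)((addr >> SKETCH_PAGE_SHIFT) & SKETCH_COLORMASK);
}

/*
 * Return the lowest page-aligned address >= hint whose color equals
 * `color` (0..SKETCH_COLORMASK).  Address wrap-around is not handled.
 */
static uintptr_t
sketch_colormatch(uintptr_t hint, unsigned color)
{
	const uintptr_t pagesz = (uintptr_t)1 << SKETCH_PAGE_SHIFT;
	uintptr_t va;

	assert(color <= SKETCH_COLORMASK);
	/* Round the hint up to a page boundary. */
	va = (hint + pagesz - 1) & ~(pagesz - 1);
	/* Skip forward by the number of pages needed to reach the color. */
	va += (uintptr_t)((color - sketch_color_of(va)) & SKETCH_COLORMASK)
	    << SKETCH_PAGE_SHIFT;
	return va;
}
```

If the requested color is that of the first user page, the kernel
address returned by such a search has the same color as the user
address, which is what makes the two mappings congruent.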


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth to move the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
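
A minimal sketch of the "attempt the access and handle the failure"
pattern argued for above. sketch_copyin() is a hypothetical userland
stand-in for copyin(9), not the kernel routine; a NULL source models
any failure, whether from missing permission or from a failed page-in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copyin(9): returns 0 on success or EFAULT
 * if the source cannot be read.
 */
static int
sketch_copyin(const void *usrc, void *kdst, size_t len)
{
	if (usrc == NULL)
		return EFAULT;
	memcpy(kdst, usrc, len);
	return 0;
}

/*
 * Fetch an int from "user space".  There is no up-front access check;
 * the copy is attempted and its error is propagated to the caller.
 * *out is left untouched on failure.
 */
static int
sketch_fetch_int(const void *uaddr, int *out)
{
	int val, error;

	assert(out != NULL);
	error = sketch_copyin(uaddr, &val, sizeof(val));
	if (error != 0)
		return error;
	*out = val;
	return 0;
}
```

The single error path covers every reason the copy can fail, so the
caller does not need a separate permission check before the access.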


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
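
A rough sketch of round-robin bucket selection, for illustration only. The
struct and function names are hypothetical and are not taken from the UVM
sources; the sketch only shows the idea of handing out colors in rotation.

```c
#include <assert.h>

/* Hypothetical allocator state for this sketch. */
struct color_state {
	unsigned ncolors;	/* number of color buckets, at least 1 */
	unsigned next;		/* bucket the next allocation starts from */
};

/* Return the color for this allocation, then advance round-robin. */
static unsigned
next_color(struct color_state *cs)
{
	unsigned c;

	assert(cs->ncolors > 0);
	c = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;
	return c;
}
```

Successive calls walk through the buckets in order and wrap around, so
consecutive allocations tend to land on different cache colors.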


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
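
The mapping can be sketched as a small C function. The enum below is a
hypothetical stand-in: it only names the retired codes for this example and
makes no claim about their historical numeric values, and kern_to_errno() is
not a function in the tree. KERN_FAILURE and the unused codes are left out
because the log gives no single errno for them.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are illustrative. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code into the errno listed in the mapping. */
static int
kern_to_errno(enum kern_code code)
{
	switch (code) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Anything else has no single translation in the log. */
	assert(!"unmapped KERN_* code");
	return EINVAL;	/* arbitrary fallback for this sketch */
}
```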


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
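
A minimal sketch of the 95% rule, assuming the three minimums are kept as
integer percentages. The function name, the array layout and the choice of
EINVAL for a rejected value are assumptions made for this example, not details
taken from the commit.

```c
#include <assert.h>
#include <errno.h>

#define NTYPES		3	/* anon, vnode, vtext */
#define MIN_SUM_LIMIT	95	/* sum of minimums may not exceed this */

/*
 * Try to set one minimum; refuse the change if the three minimums
 * would then add up to more than MIN_SUM_LIMIT percent.
 */
static int
set_usage_min(int mins[NTYPES], int which, int newval)
{
	int sum = 0;

	assert(which >= 0 && which < NTYPES);
	if (newval < 0)
		return EINVAL;
	for (int i = 0; i < NTYPES; i++)
		sum += (i == which) ? newval : mins[i];
	if (sum > MIN_SUM_LIMIT)
		return EINVAL;
	mins[which] = newval;
	return 0;
}
```

A rejected request leaves the old value in place, so at least 5% of pageable
memory always stays outside every reservation.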


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. This is
beneficial for the MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that zeroing of the page was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
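
For illustration, rounding an address up to a power-of-two alignment can be
sketched as below. This is generic arithmetic, not the code uvm_map() uses;
align_up() is a hypothetical helper and the sketch ignores overflow near the
top of the address space.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	/* A power of two has exactly one bit set. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```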


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
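
The sizing rule described above can be sketched as follows. The function and
parameter names are hypothetical, and the order in which the two bounds are
applied is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of the sizing rule: a quarter of physical memory (in pages),
 * clamped to the range [min_pages, max_pages].
 */
static size_t
autosize_nkmempages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);
	n = physmem_pages / 4;
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```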


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
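
The three strategies can be sketched over an array of per-list free page
counts. Everything named here is hypothetical, and the sketch assumes that
index 0 is the highest-priority list; it is not the uvm_pagealloc_strat()
implementation.

```c
#include <assert.h>
#include <stddef.h>

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/* Walk the lists from highest priority down; take the first free page. */
static int
take_normal(size_t freecnt[], size_t nlists)
{
	for (size_t i = 0; i < nlists; i++) {
		if (freecnt[i] > 0) {
			freecnt[i]--;
			return (int)i;
		}
	}
	return -1;
}

/* Return the index of the list a page was taken from, or -1 on failure. */
static int
pick_freelist(size_t freecnt[], size_t nlists, enum strat strat, size_t want)
{
	assert(nlists > 0);
	if (strat != STRAT_NORMAL) {
		assert(want < nlists);
		if (freecnt[want] > 0) {
			freecnt[want]--;
			return (int)want;
		}
		if (strat == STRAT_ONLY)
			return -1;	/* `only' never looks elsewhere */
	}
	/* `normal', or `fallback' after the requested list came up empty. */
	return take_normal(freecnt, nlists);
}
```

As in the description, the requested list is ignored for the normal strategy.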

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.221 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
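
A single-threaded sketch of the batching idea, for illustration only. All
types and names are hypothetical, and the page interlock and per-CPU
bookkeeping that the real code depends on are left out. The sketch only shows
an intended state being recorded on the page first and made real later, in
batches of up to 128 entries.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 128

enum pstate { ST_NONE, ST_ACTIVE, ST_INACTIVE };

struct page {
	enum pstate intent;	/* requested state, not yet applied */
	enum pstate state;	/* state the replacement policy sees */
};

struct cpu_batch {
	struct page *pg[BATCH_MAX];
	size_t n;
};

/* Apply every pending intent in the batch, then empty the batch. */
static void
batch_flush(struct cpu_batch *b)
{
	for (size_t i = 0; i < b->n; i++) {
		struct page *p = b->pg[i];

		if (p->intent != ST_NONE) {
			p->state = p->intent;
			p->intent = ST_NONE;
		}
	}
	b->n = 0;
}

/* Record the intended state on the page and queue it for later. */
static void
batch_enqueue(struct cpu_batch *b, struct page *p, enum pstate want)
{
	assert(want != ST_NONE);
	if (b->n == BATCH_MAX)
		batch_flush(b);
	p->intent = want;
	b->pg[b->n++] = p;
}
```

Only the flush touches the shared replacement state, which is the step the
commit message credits with reducing contention.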


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the nr_pages members of the list of pools.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
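
The constraint between the two flags can be sketched as a simple check. The
bit values below are placeholders chosen for this example (the real
definitions live in uvm_extern.h), and returning EINVAL is an assumption;
the commit does not say how a violation is reported.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder bit values for this sketch only. */
#define EX_FLAG_FIXED	0x1u
#define EX_FLAG_UNMAP	0x2u

/* Reject UNMAP unless FIXED is also requested. */
static int
check_map_flags(unsigned flags)
{
	/* The sketch only knows about these two flags. */
	assert((flags & ~(EX_FLAG_FIXED | EX_FLAG_UNMAP)) == 0);
	if ((flags & EX_FLAG_UNMAP) != 0 && (flags & EX_FLAG_FIXED) == 0)
		return EINVAL;
	return 0;
}
```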


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
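
A rough sketch of the color arithmetic described above, under stated
assumptions: 4 KiB pages, a colormask of the form 2^n - 1, and
hypothetical helper names (demo_va_color and demo_colormatch are not
UVM functions). The color of an address is taken to be its page number
masked by colormask, and a color-matched allocation picks the lowest
page-aligned address at or above a hint with the requested color.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Color of a virtual address: page number masked by colormask. */
static inline unsigned
demo_va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> DEMO_PAGE_SHIFT) & colormask;
}

/*
 * Lowest page-aligned address >= hint whose color equals 'color'.
 * Assumes colormask is of the form 2^n - 1 and color <= colormask.
 */
static uintptr_t
demo_colormatch(uintptr_t hint, unsigned color, unsigned colormask)
{
	uintptr_t pagesz = (uintptr_t)1 << DEMO_PAGE_SHIFT;
	uintptr_t va = (hint + pagesz - 1) & ~(pagesz - 1);	/* round up */
	unsigned cur = demo_va_color(va, colormask);
	unsigned delta = (color - cur) & colormask;	/* pages to skip */

	return va + ((uintptr_t)delta << DEMO_PAGE_SHIFT);
}
```

If the requested color is that of the first user page, the resulting
kernel address has the same color as the user mapping, which is the
congruence the commit message refers to.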


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in the UVM external API, uvm_extern.h, because most users care only about
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth to move the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage.


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
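
The allocation preference in the last item can be sketched as follows.
This is an illustration only, assuming the <sys/queue.h> LIST macros
and hypothetical names (demo_page, demo_freelists, demo_free,
demo_alloc); it ignores locking, coloring and everything else the real
allocator does.

```c
#include <stddef.h>
#include <sys/queue.h>

#define DEMO_NCPU 2	/* assumption: fixed CPU count for the sketch */

struct demo_page {
	LIST_ENTRY(demo_page) global_link;	/* position on the global list */
	LIST_ENTRY(demo_page) cpu_link;		/* position on one CPU's list */
	int id;
};

LIST_HEAD(demo_pglist, demo_page);

struct demo_freelists {
	struct demo_pglist global;
	struct demo_pglist percpu[DEMO_NCPU];
};

static void
demo_init(struct demo_freelists *fl)
{
	LIST_INIT(&fl->global);
	for (int i = 0; i < DEMO_NCPU; i++)
		LIST_INIT(&fl->percpu[i]);
}

/* Freeing puts the page on the freeing CPU's list and the global list. */
static void
demo_free(struct demo_freelists *fl, int cpu, struct demo_page *pg)
{
	LIST_INSERT_HEAD(&fl->percpu[cpu], pg, cpu_link);
	LIST_INSERT_HEAD(&fl->global, pg, global_link);
}

/* Allocation prefers the local CPU's list, then falls back to global. */
static struct demo_page *
demo_alloc(struct demo_freelists *fl, int cpu)
{
	struct demo_page *pg = LIST_FIRST(&fl->percpu[cpu]);

	if (pg == NULL)
		pg = LIST_FIRST(&fl->global);
	if (pg == NULL)
		return NULL;
	/* Every free page is on both lists, so unlink it from both. */
	LIST_REMOVE(pg, global_link);
	LIST_REMOVE(pg, cpu_link);
	return pg;
}
```

A LIST entry can be removed without knowing which head it hangs from,
which is what lets a page taken from the global list also be dropped
from whichever CPU's list it was on.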


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
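
The trailing-zero trimming can be illustrated with a small helper. This
is a sketch of the idea only; demo_trimmed_len is a hypothetical name
and the actual coredump code is not reproduced here.

```c
#include <stddef.h>

/*
 * Return the length of buf with trailing zero bytes dropped.
 * A section would then be written with this shorter length.
 */
static size_t
demo_trimmed_len(const unsigned char *buf, size_t len)
{
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

An all-zero buffer trims to length 0, so nothing needs to be written
for it at all.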


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
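
The argument above can be sketched in user space as follows. Everything
here is a simulation under stated assumptions: demo_uregion,
demo_copyin and demo_read_int are hypothetical, and the error values
are chosen only to separate the two failure causes in this sketch, not
to describe what the kernel's copyin() returns.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical model of a user region: permission and paging can both fail. */
struct demo_uregion {
	const void *base;
	size_t len;
	int readable;	/* mapping permission allows reads */
	int io_error;	/* paging in the data would fail */
};

/* Simulated copy from "user space": the access itself reports failure. */
static int
demo_copyin(const struct demo_uregion *u, void *dst, size_t len)
{
	if (!u->readable || len > u->len)
		return EFAULT;
	if (u->io_error)
		return EIO;	/* a permission pre-check could not see this */
	memcpy(dst, u->base, len);
	return 0;
}

/* No advance permission check: attempt the access, propagate any error. */
static int
demo_read_int(const struct demo_uregion *u, int *out)
{
	return demo_copyin(u, out, sizeof(*out));
}
```

The caller has one error path for both causes, and the common
successful case does no extra checking.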


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
this giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.
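
As a rough illustration of the idea (not the actual uvm_uarea_alloc()
code), a cache of previously freed areas can report whether an allocation
was satisfied from the cache. All names below (sk_uarea, sk_cache,
sk_uarea_alloc(), sk_uarea_free()) are hypothetical, and ordinary malloc()
stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a uarea; the real one is a block of KVA. */
struct sk_uarea {
	struct sk_uarea *next;
};

/* Singly linked cache of previously freed areas (no locking in this sketch). */
static struct sk_uarea *sk_cache;

/*
 * Hand out an area.  Returns true when the area came from the cache,
 * meaning the caller may skip the work of backing it with pages.
 */
bool
sk_uarea_alloc(struct sk_uarea **out)
{
	assert(out != NULL);
	if (sk_cache != NULL) {
		*out = sk_cache;
		sk_cache = sk_cache->next;
		return true;		/* cache hit: already backed */
	}
	*out = malloc(sizeof(**out));	/* may be NULL on failure */
	return false;			/* fresh area: caller must set it up */
}

/* Return an area to the cache instead of releasing it. */
void
sk_uarea_free(struct sk_uarea *ua)
{
	ua->next = sk_cache;
	sk_cache = ua;
}
```

A caller would test the return value and only perform the expensive
setup when it is false.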


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
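
A minimal sketch of round-robin ("bin hopping") color selection follows.
It is only an illustration: the structure sk_color_state and the function
sk_next_color() are hypothetical, and the log does not say where the real
code keeps its cursor.

```c
#include <assert.h>

/* Hypothetical allocator state: number of color bins and a cursor. */
struct sk_color_state {
	unsigned ncolors;	/* number of buckets, fixed by MD code here */
	unsigned next;		/* color to hand out on the next allocation */
};

/* Return the color for this allocation and advance the cursor. */
unsigned
sk_next_color(struct sk_color_state *cs)
{
	unsigned color;

	assert(cs->ncolors > 0);
	color = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;	/* wrap around */
	return color;
}
```

Successive allocations thus cycle through the buckets, which spreads
pages evenly across cache colors.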


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
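
The table above could be expressed as a lookup function along these
lines. This is an illustrative sketch only: the enum old_kern_code and
the function kern_to_errno() are hypothetical, the enum values are
arbitrary, and the actual change replaced return values at each call
site rather than translating them at run time.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; the values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the log. */
int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	case OLD_KERN_FAILURE:
		break;
	}
	/* No single errno fits; the log says these mostly became assertions. */
	assert(0 && "code has no single errno equivalent");
	return EINVAL;	/* assumed fallback when assertions are compiled out */
}
```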


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
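
The two rules described above (the 95% cap on the sum of the minimums,
and refusing to reuse a page when that would drop its type below the
minimum) might be sketched as follows. The structure sk_balance and the
functions sk_set_min() and sk_may_reclaim() are hypothetical; the real
pagedaemon logic is more involved, and this sketch ignores locking and
integer overflow for very large page counts.

```c
#include <assert.h>
#include <stdbool.h>

/* The three usage types named in the log. */
enum sk_usage { SK_ANON, SK_VNODE, SK_VTEXT, SK_NTYPES };

struct sk_balance {
	unsigned min_pct[SK_NTYPES];	/* reserved percentage per type */
	unsigned long pages[SK_NTYPES];	/* pages currently in use per type */
	unsigned long pageable;		/* total pageable pages */
};

/* Set one minimum; reject it if the minimums would sum to more than 95%. */
bool
sk_set_min(struct sk_balance *b, enum sk_usage u, unsigned pct)
{
	unsigned sum = pct;
	int i;

	for (i = 0; i < SK_NTYPES; i++) {
		if (i != (int)u)
			sum += b->min_pct[i];
	}
	if (sum > 95)
		return false;
	b->min_pct[u] = pct;
	return true;
}

/* May one page of type u be reused without going below its minimum? */
bool
sk_may_reclaim(const struct sk_balance *b, enum sk_usage u)
{
	assert(b->pages[u] > 0);
	/* (pages - 1) / pageable >= min_pct / 100, without division */
	return (b->pages[u] - 1) * 100 >=
	    (unsigned long)b->min_pct[u] * b->pageable;
}
```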


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
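
For readers unfamiliar with the arithmetic, rounding an address up to a
power-of-two alignment can be sketched as below. The helper
sk_align_up() is hypothetical and says nothing about how uvm_map()
applies its align argument internally.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be 2^n). */
uintptr_t
sk_align_up(uintptr_t addr, uintptr_t align)
{
	/* A power of two has exactly one bit set. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

The mask trick only works for powers of two, which is presumably why
the interface restricts callers to such alignments.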


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
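
The sizing rule described above (a quarter of physical memory, clamped
to machine-dependent bounds) can be sketched as follows. The function
sk_autosize_nkmempages() and its parameters are hypothetical, and the
sketch ignores the case where NKMEMPAGES is set explicitly.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compute the kmem_map size in pages: physical pages divided by 4,
 * then clamped to [min_pages, max_pages].
 */
size_t
sk_autosize_nkmempages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);
	n = physpages / 4;
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```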


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
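
The three strategies can be illustrated with a toy model in which each
free list is just a page count. Everything here is hypothetical
(SK_NFREELIST, the sk_strat enum, sk_pick_freelist()), and the sketch
assumes that index 0 is the highest-priority list; it is not the
uvm_pagealloc_strat() implementation.

```c
#include <assert.h>

#define SK_NFREELIST 3	/* hypothetical; stands in for VM_NFREELIST */

enum sk_strat { SK_STRAT_NORMAL, SK_STRAT_ONLY, SK_STRAT_FALLBACK };

/*
 * Take one page according to the strategy.  Returns the index of the
 * free list the page came from, or -1 if no page could be allocated.
 * free_list is ignored for SK_STRAT_NORMAL.
 */
int
sk_pick_freelist(int npages[SK_NFREELIST], enum sk_strat strat, int free_list)
{
	int i;

	if (strat != SK_STRAT_NORMAL) {
		assert(free_list >= 0 && free_list < SK_NFREELIST);
		if (npages[free_list] > 0) {
			npages[free_list]--;
			return free_list;
		}
		if (strat == SK_STRAT_ONLY)
			return -1;	/* `only': no fallback allowed */
	}
	/* `normal', or `fallback' after `only' failed: walk high -> low. */
	for (i = 0; i < SK_NFREELIST; i++) {
		if (npages[i] > 0) {
			npages[i]--;
			return i;
		}
	}
	return -1;
}
```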


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.220 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

branches: 1.218.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
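
The batching scheme described above can be sketched in a single-threaded
toy form. The names sk_page, sk_cpu_queue, sk_set_intent() and
sk_flush() are hypothetical, the interlock is only indicated by
comments, and the real code's handling of the global page queues is
reduced here to copying the intended state into an "applied" field.

```c
#include <assert.h>
#include <stddef.h>

#define SK_PDQ_SIZE 128		/* queue length given in the log */

enum sk_intent { SK_NONE, SK_ACTIVE, SK_INACTIVE, SK_DEQUEUE };

struct sk_page {
	enum sk_intent intent;	/* set cheaply, under the page interlock */
	enum sk_intent state;	/* made real later, in a batch */
};

struct sk_cpu_queue {
	struct sk_page *pages[SK_PDQ_SIZE];
	size_t count;
	unsigned long flushes;	/* number of batch updates performed */
};

/* Apply all pending intents in one pass (one global lock hold in reality). */
void
sk_flush(struct sk_cpu_queue *q)
{
	size_t i;

	for (i = 0; i < q->count; i++)
		q->pages[i]->state = q->pages[i]->intent;
	q->count = 0;
	q->flushes++;
}

/* Record an intended state and queue the page for a later batch update. */
void
sk_set_intent(struct sk_cpu_queue *q, struct sk_page *pg, enum sk_intent in)
{
	assert(q->count <= SK_PDQ_SIZE);
	pg->intent = in;		/* page interlock would be held here */
	if (q->count == SK_PDQ_SIZE)	/* queue full: drain it first */
		sk_flush(q);
	q->pages[q->count++] = pg;
}
```

The point of the design is that the expensive global update happens
once per batch rather than once per page.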


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
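
The color arithmetic described above can be sketched in userland C. This is
only an illustration under stated assumptions: a page color is taken to be the
page number masked by colormask, pages are assumed to be 4 KiB, and the names
SKETCH_PAGE_SHIFT, page_color() and colors_congruent() are hypothetical, not
UVM interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages. Real ports define their own shift. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: the color of an address is its page number masked
 * by colormask, giving a value in 0..colormask.
 */
static unsigned
page_color(uintptr_t addr, unsigned colormask)
{
	/* colormask is expected to be of the form 2^n - 1 */
	assert((colormask & (colormask + 1)) == 0);
	return (unsigned)(addr >> SKETCH_PAGE_SHIFT) & colormask;
}

/*
 * Hypothetical helper: a kernel mapping is congruent with a user mapping
 * when both starting addresses have the same color.
 */
static bool
colors_congruent(uintptr_t uva, uintptr_t kva, unsigned colormask)
{
	return page_color(uva, colormask) == page_color(kva, colormask);
}
```

In this sketch, a caller wanting a congruent kernel mapping would compute
page_color() of the starting user page and hand that value over as the color
to match.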


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use it to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be
allocated so that VA and PA have the same color. On a page fault, choose a
physical page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce
the amount of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
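
The free/allocate policy in the last item can be sketched as follows. This is
a simplified userland model, not the UVM implementation: it ignores locking,
page coloring and freelist selection, assumes two CPUs, and every name in it
(sk_page, sk_freelists, sk_page_free(), sk_page_alloc()) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NCPU 2	/* assumed CPU count, for illustration only */

/* A free page is linked on two lists at once: the global one and a per-CPU one. */
struct sk_page {
	struct sk_page *g_next, **g_prevp;	/* global list linkage */
	struct sk_page *c_next, **c_prevp;	/* per-CPU list linkage */
};

struct sk_freelists {
	struct sk_page *global;
	struct sk_page *percpu[SKETCH_NCPU];
};

/* Free: put the page onto the local CPU's list and onto the global list. */
static void
sk_page_free(struct sk_freelists *fl, int cpu, struct sk_page *pg)
{
	assert(cpu >= 0 && cpu < SKETCH_NCPU);

	pg->g_next = fl->global;
	pg->g_prevp = &fl->global;
	if (fl->global != NULL)
		fl->global->g_prevp = &pg->g_next;
	fl->global = pg;

	pg->c_next = fl->percpu[cpu];
	pg->c_prevp = &fl->percpu[cpu];
	if (fl->percpu[cpu] != NULL)
		fl->percpu[cpu]->c_prevp = &pg->c_next;
	fl->percpu[cpu] = pg;
}

/*
 * Allocate: prefer a page from the local CPU's list; if none is available,
 * take the head of the global list. Either way, unlink it from both lists.
 */
static struct sk_page *
sk_page_alloc(struct sk_freelists *fl, int cpu)
{
	struct sk_page *pg;

	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	pg = fl->percpu[cpu] != NULL ? fl->percpu[cpu] : fl->global;
	if (pg == NULL)
		return NULL;

	*pg->g_prevp = pg->g_next;
	if (pg->g_next != NULL)
		pg->g_next->g_prevp = pg->g_prevp;

	*pg->c_prevp = pg->c_next;
	if (pg->c_next != NULL)
		pg->c_next->c_prevp = pg->c_prevp;

	return pg;
}
```

Keeping each page on both lists is what lets the fallback path take a page
that another CPU freed without searching that CPU's list first.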


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second time to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.
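
The per-emulation hook described in the second item can be pictured with the
sketch below. It is a userland illustration only: the struct layout, the
function signature, the placeholder constants and all sk_* names are
assumptions made for the example, not the actual struct emul or
uvm_map_defaultaddr definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sk_vaddr_t;	/* stand-in for the kernel's vaddr_t */

#define SK_PAGE_SIZE	4096u
#define SK_MAXDMAP	(1ull << 30)	/* placeholder native data reserve */
#define SK_MAXDMAP32	(1ull << 28)	/* placeholder 32-bit data reserve */

/* Hypothetical, cut-down emulation descriptor holding the hook. */
struct sk_emul {
	const char *e_name;
	sk_vaddr_t (*e_vm_default_addr)(sk_vaddr_t data_start);
};

/* Round an address up to the next page boundary. */
static sk_vaddr_t
sk_round_page(sk_vaddr_t va)
{
	return (va + SK_PAGE_SIZE - 1) & ~(sk_vaddr_t)(SK_PAGE_SIZE - 1);
}

/* Default hook: place mappings past the reserved data region. */
static sk_vaddr_t
sk_default_addr(sk_vaddr_t data_start)
{
	return sk_round_page(data_start + SK_MAXDMAP);
}

/* 32-bit hook: same idea, but the result must stay in the 32-bit range. */
static sk_vaddr_t
sk_default_addr32(sk_vaddr_t data_start)
{
	sk_vaddr_t va = sk_round_page(data_start + SK_MAXDMAP32);

	assert(va <= UINT32_MAX);
	return va;
}

static const struct sk_emul sk_emul_native = { "native", sk_default_addr };
static const struct sk_emul sk_emul_compat32 = { "compat32", sk_default_addr32 };
```

Callers in this sketch never pick an address policy themselves; they go
through the emulation's e_vm_default_addr pointer, which is what gives a
32-bit emulation control over where mappings land.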


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
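
The "attempt the access and handle the failure" pattern described above can be
sketched in userland C. This is an illustration only, not kernel code:
mock_copyin() is a hypothetical stand-in for copyin() whose failure is injected
by the caller, and fetch_arg() is an invented example caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copyin(): copies len bytes from "user" memory,
 * or fails with the error given in fail_with (simulating either a bad
 * address or a page-in I/O error). Not the real kernel routine.
 */
static int
mock_copyin(const void *uaddr, void *kaddr, size_t len, int fail_with)
{
	if (fail_with != 0)
		return fail_with;
	memcpy(kaddr, uaddr, len);
	return 0;
}

/*
 * Invented caller: there is no up-front permission check. The access is
 * simply attempted and any error is propagated to the caller.
 */
static int
fetch_arg(const int *uaddr, int *out, int fail_with)
{
	int error;

	error = mock_copyin(uaddr, out, sizeof(*out), fail_with);
	if (error != 0)
		return error;	/* EFAULT and EIO take the same path */
	return 0;
}
```

The point of the design is that a permission failure and an i/o failure share
one error path, so no separate pre-check is needed.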


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages() return the number of pages found so that the caller can
easily check whether all requested pages were found.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
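
A minimal sketch of round-robin bucket selection, assuming the simplest
possible form: each allocation takes the next color in turn. The struct and
function names are invented for illustration and are not the identifiers used
in UVM; the real allocator also has to deal with empty buckets and free lists.

```c
#include <assert.h>

/* Hypothetical per-allocator coloring state (names are illustrative). */
struct color_state {
	unsigned ncolors;	/* number of color buckets, must be > 0 */
	unsigned next;		/* bucket to hand out on the next allocation */
};

/*
 * Return the color for the next page and advance round-robin, wrapping
 * back to bucket 0 after the last bucket.
 */
static unsigned
next_color(struct color_state *cs)
{
	unsigned c;

	assert(cs->ncolors > 0);
	c = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;
	return c;
}
```

With four buckets, successive calls yield colors 0, 1, 2, 3 and then wrap
back to 0.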


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
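
The mapping above can be written out as a small translation function. This is
a sketch for illustration: the enum values are placeholders rather than the
values from the removed header, kern_to_errno() is an invented name, and the
-1 return for codes with no single replacement is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder enumeration; numeric values are illustrative only. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_FAILURE,
	KERN_RESOURCE_SHORTAGE,
	KERN_NOT_RECEIVER,
	KERN_NO_ACCESS,
	KERN_PAGES_LOCKED
};

/*
 * Translate a KERN_* code to the E* code listed in the table.
 * Returns -1 where the table gives no single replacement
 * (KERN_FAILURE is "various"; the last three codes were unused).
 */
static int
kern_to_errno(enum kern_code kc)
{
	switch (kc) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```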


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
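
The two rules above (a per-type floor, and a 95% cap on the sum of the
minimums) can be sketched as follows. The function names and the exact
arithmetic are assumptions made for illustration; the pagedaemon's real
accounting is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * The three minimums are percentages of pageable memory. Their sum may
 * not exceed 95, so some memory is always left that can be reused.
 */
static bool
mins_are_valid(unsigned anonmin, unsigned vnodemin, unsigned vtextmin)
{
	return anonmin + vnodemin + vtextmin <= 95;
}

/*
 * A page of a given type may be reused only if doing so would not drop
 * that type's page count below its minimum share of pageable memory.
 */
static bool
may_reuse_page(unsigned long npages_of_type, unsigned long pageable_pages,
    unsigned min_percent)
{
	unsigned long floor_pages = pageable_pages * min_percent / 100;

	/* After reuse the count is npages_of_type - 1; keep it >= floor. */
	return npages_of_type > floor_pages;
}
```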


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
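
For readers unfamiliar with power-of-two alignment, the underlying arithmetic
can be sketched as below. align_up() is an invented helper, not part of the
uvm_map() interface, and the sketch ignores overflow near the top of the
address space.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.
 * align must be a non-zero power of two, which makes the mask trick valid.
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged; any other address
moves up to the next aligned boundary.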


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
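
The sizing rule described above (a quarter of physical memory, clamped to
machine-dependent bounds) can be sketched as follows. size_kmem_pages() and
its parameter names are invented for illustration; only the divide-by-4 and
the clamping come from the text.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compute the number of kmem_map pages: one quarter of physical memory,
 * clamped to [min_pages, max_pages]. All quantities are in pages.
 */
static size_t
size_kmem_pages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
	size_t n = physmem_pages / 4;

	assert(min_pages <= max_pages);
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```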


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
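
How a (low address, size) pair turns into an initial stack pointer can be
sketched as below. This is a simplified assumption about what
machine-dependent code does: initial_stack_pointer() is an invented name, and
real ports also apply alignment and reserve space for an initial frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Given the low address and size of a stack area, return where the stack
 * pointer starts: the high end if the stack grows down, the low end if it
 * grows up.
 */
static uintptr_t
initial_stack_pointer(uintptr_t stack_low, size_t stack_size, bool grows_down)
{
	assert(stack_size > 0);
	return grows_down ? stack_low + stack_size : stack_low;
}
```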


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
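
The three strategies can be modelled with a toy array of per-list page counts.
Everything here is an illustrative assumption: NFREELIST, the enum, and
pick_freelist() are invented names, and index 0 is assumed to be the highest
priority list. The real uvm_pagealloc_strat() returns a page, not a list index.

```c
#include <assert.h>

#define NFREELIST 3	/* hypothetical number of free lists */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Take one page from a free list according to the strategy.
 * Returns the index of the list used, or -1 if no page was available.
 */
static int
pick_freelist(int npages[NFREELIST], enum strat strat, int free_list)
{
	int i;

	if (strat == STRAT_ONLY || strat == STRAT_FALLBACK) {
		assert(free_list >= 0 && free_list < NFREELIST);
		if (npages[free_list] > 0) {
			npages[free_list]--;
			return free_list;
		}
		if (strat == STRAT_ONLY)
			return -1;	/* `only' never looks elsewhere */
	}

	/* `normal' walk (also the fallback path): high -> low priority. */
	for (i = 0; i < NFREELIST; i++) {
		if (npages[i] > 0) {
			npages[i]--;
			return i;
		}
	}
	return -1;
}
```

Note that the free_list argument is not consulted at all in the `normal' case,
matching the description above.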

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.219 15-Jan-2020 ad

Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.


Revision tags: ad-namecache-base
# 1.218 31-Dec-2019 ad

- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
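
The constraint between the two flags can be sketched as a simple validity
check. The bit values below are placeholders chosen for this example, not the
values defined in uvm_extern.h, and map_flags_ok() is an invented helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration; NOT the real definitions. */
#define UVM_FLAG_FIXED	0x1u
#define UVM_FLAG_UNMAP	0x2u

/*
 * UVM_FLAG_UNMAP (replace any existing entries in the range) is only
 * meaningful together with UVM_FLAG_FIXED (use the caller's hint address).
 */
static bool
map_flags_ok(unsigned flags)
{
	if ((flags & UVM_FLAG_UNMAP) != 0 && (flags & UVM_FLAG_FIXED) == 0)
		return false;
	return true;
}
```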


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.
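
The idea of passing the future layout in as an argument, rather than reading it from the current vmspace, can be sketched as follows. This is an illustrative userland sketch only: the function name, the constants, and the exact formulas are assumptions, not the real e_vm_default_addr signature or the kernel's VM constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* All values below are hypothetical, chosen only for illustration. */
#define SK_PAGE_SIZE	((uintptr_t)4096)
#define SK_MAXUSER_ADDR	((uintptr_t)0x7f000000)	/* top of user space */
#define SK_MAXSSIZ	((uintptr_t)0x01000000)	/* max stack size */
#define SK_MAXDSIZ	((uintptr_t)0x10000000)	/* max data size */

static uintptr_t
sk_trunc_page(uintptr_t a)
{
	return a & ~(SK_PAGE_SIZE - 1);
}

static uintptr_t
sk_round_page(uintptr_t a)
{
	return (a + SK_PAGE_SIZE - 1) & ~(SK_PAGE_SIZE - 1);
}

/*
 * Pick a default load address for a mapping of `size` bytes.
 * `topdown` describes the layout the new image WILL have; it is
 * supplied by the caller instead of being read from current state.
 */
static uintptr_t
sketch_default_addr(uintptr_t base, uintptr_t size, bool topdown)
{
	if (topdown)	/* place just below the reserved stack region */
		return sk_trunc_page(SK_MAXUSER_ADDR - SK_MAXSSIZ - size);
	/* bottom up: place above the region reserved for the data segment */
	return sk_round_page(base + SK_MAXDSIZ);
}
```

With these made-up constants, a bottom-up request lands above the data region and a top-down request lands just below the stack reservation; the caller decides which by passing the flag.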


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.
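
The cookie pattern described here can be sketched in plain C. The sketch is a userland illustration with hypothetical names (sketch_walkmap, sketch_count_cookie and so on); it is not the uvm_coredump_walkmap() prototype.

```c
#include <stddef.h>

struct sketch_proc { int pid; };			/* stand-in for proc_t */
struct sketch_seg { unsigned long start, end; };	/* one map segment */

typedef int (*sketch_seg_fn)(const struct sketch_seg *, void *cookie);

/* The walker knows nothing about processes; it only forwards the cookie. */
static int
sketch_walkmap(const struct sketch_seg *segs, size_t n,
    sketch_seg_fn fn, void *cookie)
{
	for (size_t i = 0; i < n; i++) {
		int error = fn(&segs[i], cookie);

		if (error != 0)
			return error;	/* stop at the first failure */
	}
	return 0;
}

/* Caller-defined cookie: carries the process only because this caller wants it. */
struct sketch_count_cookie {
	struct sketch_proc *p;
	size_t nsegs;
	unsigned long bytes;
};

/* Callback: accumulates the segment count and total size into the cookie. */
static int
sketch_count_seg(const struct sketch_seg *seg, void *cookie)
{
	struct sketch_count_cookie *c = cookie;

	c->nsegs++;
	c->bytes += seg->end - seg->start;
	return 0;
}
```

A caller that needs the process stores it in its own cookie structure, so the walker's interface stays the same for callers that do not.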


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture,
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
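
The notion of a page "color" and of a kernel address congruent with a user page can be illustrated with a small sketch. It assumes 4 KiB pages and a power-of-two number of colors; the helper names are hypothetical and this is not the uvm_map() implementation.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumed 4 KiB pages */

/* Color of an address: its page number modulo the number of colors. */
static inline uintptr_t
sketch_color(uintptr_t addr, uintptr_t colormask)
{
	return (addr >> SKETCH_PAGE_SHIFT) & colormask;
}

/*
 * Lowest page-aligned address >= hint whose color equals `color`.
 * colormask must be (ncolors - 1) with ncolors a power of two.
 */
static inline uintptr_t
sketch_colormatch(uintptr_t hint, uintptr_t color, uintptr_t colormask)
{
	/* Round the hint up to a page boundary and take its page number. */
	uintptr_t page = (hint + ((uintptr_t)1 << SKETCH_PAGE_SHIFT) - 1)
	    >> SKETCH_PAGE_SHIFT;
	/* Pages to skip until the color matches (unsigned wraparound). */
	uintptr_t delta = (color - page) & colormask;

	return (page + delta) << SKETCH_PAGE_SHIFT;
}
```

Calling sketch_colormatch() with the color of the first user page yields a start address of the same color, so each following page of the two mappings also shares a color, which is the congruence described above.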


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use it to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)
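
From userland, the new limit can be inspected with the standard getrlimit(2) call. The following is a minimal sketch; the helper names are made up for illustration, and only getrlimit(), RLIMIT_AS and RLIM_INFINITY are the real interface.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/resource.h>

/* Fetch the address-space limit; returns 0 on success, -1 on error. */
static int
sketch_get_as_limit(struct rlimit *rl)
{
	return getrlimit(RLIMIT_AS, rl);
}

/* Print one limit value, spelling out the "unlimited" default. */
static void
sketch_print_limit(const char *name, rlim_t v)
{
	if (v == RLIM_INFINITY)
		printf("%s: unlimited\n", name);
	else
		printf("%s: %ju bytes\n", name, (uintmax_t)v);
}
```

A caller would fetch the limit once and pass rl.rlim_cur and rl.rlim_max to the print helper; with the defaults described above both would be reported as unlimited.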


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
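
The allocation preference described above can be sketched with the sys/queue.h LIST macros: each page carries two list linkages, so it can sit on a per-CPU list and the global list at once. The structure and function names are hypothetical, there is no locking, and the real code also handles freelists and colors that are omitted here.

```c
#include <stddef.h>
#include <sys/queue.h>

#define SKETCH_NCPU 2

struct sketch_page {
	LIST_ENTRY(sketch_page) global_link;	/* on the global free list */
	LIST_ENTRY(sketch_page) cpu_link;	/* on one CPU's free list */
};

LIST_HEAD(sketch_pglist, sketch_page);

struct sketch_free {
	struct sketch_pglist global;
	struct sketch_pglist percpu[SKETCH_NCPU];
};

static void
sketch_free_init(struct sketch_free *f)
{
	LIST_INIT(&f->global);
	for (int i = 0; i < SKETCH_NCPU; i++)
		LIST_INIT(&f->percpu[i]);
}

/* Freeing: the page goes onto the freeing CPU's list and the global list. */
static void
sketch_page_free(struct sketch_free *f, int cpu, struct sketch_page *pg)
{
	LIST_INSERT_HEAD(&f->percpu[cpu], pg, cpu_link);
	LIST_INSERT_HEAD(&f->global, pg, global_link);
}

/* Allocating: prefer the local CPU's list, else fall back to the global one. */
static struct sketch_page *
sketch_page_alloc(struct sketch_free *f, int cpu)
{
	struct sketch_page *pg = LIST_FIRST(&f->percpu[cpu]);

	if (pg == NULL)
		pg = LIST_FIRST(&f->global);
	if (pg == NULL)
		return NULL;
	LIST_REMOVE(pg, cpu_link);	/* leaves whichever CPU list held it */
	LIST_REMOVE(pg, global_link);
	return pg;
}
```

Because LIST_REMOVE() needs only the element, a page taken through the global fallback can be unlinked from another CPU's list without knowing which CPU freed it.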


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
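
The "attempt the access and check the result" pattern argued for above can be sketched in userland C. This is only an illustration under stated assumptions: `toy_copyin()`, `struct toy_urange` and `toy_read_request()` are hypothetical stand-ins invented here, not the kernel's copyin(9) interface.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a user range; fault_error simulates e.g. a pagein I/O error. */
struct toy_urange {
	const void *base;
	size_t len;
	int fault_error;
};

/* Hypothetical stand-in for copyin(): returns 0 or an errno value. */
int
toy_copyin(const struct toy_urange *u, void *dst, size_t len)
{
	if (len > u->len)
		return EFAULT;		/* range not accessible */
	if (u->fault_error != 0)
		return u->fault_error;	/* access failed for another reason */
	memcpy(dst, u->base, len);
	return 0;
}

/* No permission pre-check: attempt the copy and propagate any failure. */
int
toy_read_request(const struct toy_urange *u, int *out)
{
	return toy_copyin(u, out, sizeof(*out));
}
```

With this shape, a permission failure and an I/O failure reach the caller through the same return path, which is the point the commit message makes.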


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check whether all requested pages were found.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
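
A generic round-robin bucket selector, in the spirit of the description above, might look like the following. It is a minimal sketch, not the UVM code: `TOY_NCOLORS`, `struct toy_colorstate` and `toy_next_color()` are hypothetical names, and the fixed constant mirrors the commit's note that MD code currently supplies the bucket count.

```c
/* Hypothetical bucket count; the commit says MD code defines the real one. */
#define TOY_NCOLORS 4

/* Remembers which bucket the next allocation should come from. */
struct toy_colorstate {
	unsigned next;
};

/* Return the current bucket and advance to the following one, wrapping around. */
unsigned
toy_next_color(struct toy_colorstate *cs)
{
	unsigned color = cs->next;

	cs->next = (cs->next + 1) % TOY_NCOLORS;
	return color;
}
```

Successive calls hand out buckets 0, 1, 2, 3 and then wrap, so consecutive allocations land in different color bins.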


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS                0
KERN_INVALID_ADDRESS        EFAULT
KERN_PROTECTION_FAILURE     EACCES
KERN_NO_SPACE               ENOMEM
KERN_INVALID_ARGUMENT       EINVAL
KERN_FAILURE                various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE      ENOMEM
KERN_NOT_RECEIVER           <unused>
KERN_NO_ACCESS              <unused>
KERN_PAGES_LOCKED           <unused>
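
The fixed entries of this mapping can be read as a translation function. The sketch below is illustrative only: the `OLD_KERN_*` enumerators and `old_kern_to_errno()` are hypothetical names invented here, and the enumerator values are arbitrary. KERN_FAILURE and the unused codes are left out because the commit gives them no single errno.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; the values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate a code that has a fixed entry in the table to its E* value. */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return -1;	/* not a code with a fixed mapping */
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation is not reversible.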


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
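
The two rules described above (the 95% cap on the sum of the minimums, and refusing to reuse a page when that would drop its type below its minimum) can be sketched as follows. All names are hypothetical, and the exact comparison the pagedaemon performs is an assumption; the commit message does not spell it out.

```c
#include <stdbool.h>

/* Cap on the sum of the three minimum percentages, per the commit message. */
#define TOY_MIN_SUM_LIMIT 95

/* Accept a set of minimum percentages only if some memory stays reusable. */
bool
toy_minimums_ok(unsigned anonmin, unsigned vnodemin, unsigned vtextmin)
{
	return anonmin + vnodemin + vtextmin <= TOY_MIN_SUM_LIMIT;
}

/*
 * Assumed check: a page of some type may be reused only if the count of
 * that type, after losing one page, is still at least min_pct percent of
 * pageable memory. Multiplying by 100 avoids integer division.
 */
bool
toy_may_reuse(unsigned long type_pages, unsigned long pageable_pages,
    unsigned min_pct)
{
	if (type_pages == 0)
		return false;
	return (type_pages - 1) * 100 >= pageable_pages * min_pct;
}
```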


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. This is
beneficial for the MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
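
What a "minimum power-of-two alignment" constraint means for a candidate address can be shown with a small rounding helper. This is a sketch under assumptions: `align_up_pow2()` is a hypothetical name, treating 0 as "no constraint" is a choice made for this example, and overflow near the top of the address space is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align, where align is a power of two. */
uintptr_t
align_up_pow2(uintptr_t addr, uintptr_t align)
{
	/* In this sketch a zero alignment means "no constraint". */
	if (align == 0)
		return addr;
	/* A power of two has exactly one bit set. */
	assert((align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```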


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
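
The sizing rule described above (physical memory divided by 4, then clamped to machine-dependent bounds) can be sketched as below. `autosize_nkmempages()` and its parameter names are hypothetical; the bounds correspond only loosely to the NKMEMPAGES_MIN and NKMEMPAGES_MAX options named in the commit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: take one quarter of physical memory (in pages),
 * then clamp the result to the range [min_pages, max_pages].
 */
size_t
autosize_nkmempages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t npages;

	assert(min_pages <= max_pages);
	npages = physpages / 4;
	if (npages < min_pages)
		npages = min_pages;
	if (npages > max_pages)
		npages = max_pages;
	return npages;
}
```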


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
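
The three strategies can be modelled with a toy array of per-list free page counts. This is a sketch only: the `toy_*` names are hypothetical, and treating index 0 as the highest-priority list is an assumption made for the example.

```c
#include <assert.h>

#define TOY_NFREELIST 3	/* hypothetical; stands in for VM_NFREELIST */

enum toy_strat { TOY_STRAT_NORMAL, TOY_STRAT_ONLY, TOY_STRAT_FALLBACK };

/* normal: walk from high to low priority, take from the first list with a page. */
int
toy_take_normal(int nfree[TOY_NFREELIST])
{
	for (int i = 0; i < TOY_NFREELIST; i++) {
		if (nfree[i] > 0) {
			nfree[i]--;
			return i;
		}
	}
	return -1;	/* no free pages anywhere */
}

/* Returns the index of the list the page came from, or -1 on failure. */
int
toy_pagealloc_strat(int nfree[TOY_NFREELIST], enum toy_strat strat, int list)
{
	if (strat == TOY_STRAT_NORMAL)
		return toy_take_normal(nfree);	/* list argument is ignored */

	assert(list >= 0 && list < TOY_NFREELIST);
	if (nfree[list] > 0) {
		nfree[list]--;
		return list;
	}
	/* only: fail here; fallback: retry with the normal walk. */
	return strat == TOY_STRAT_ONLY ? -1 : toy_take_normal(nfree);
}
```

The `only` and `fallback` cases share the first attempt and differ solely in what happens when the requested list is empty.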

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.218 31-Dec-2019 ad

- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.


# 1.217 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's also possible to
enter read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools' nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
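
The constraint in the last sentence (UNMAP is valid only together with FIXED) amounts to a simple flag check. The sketch below uses hypothetical `TOY_FLAG_*` bit values and a hypothetical `toy_map_flags_valid()` helper; it does not reproduce the real UVM_FLAG_* encoding.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this example. */
#define TOY_FLAG_FIXED	0x1u	/* use the caller's hint address as-is */
#define TOY_FLAG_UNMAP	0x2u	/* unmap existing entries in the range */

/* UNMAP without FIXED is rejected; every other combination is accepted. */
bool
toy_map_flags_valid(unsigned flags)
{
	if ((flags & TOY_FLAG_UNMAP) != 0 && (flags & TOY_FLAG_FIXED) == 0)
		return false;
	return true;
}
```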


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
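
As a rough illustration of the color matching described above, the following user-space sketch computes the color of an address and the next page-aligned address of a requested color. The names ex_page_color and ex_colormatch and the 4KB page size are assumptions made for this example; they are not the UVM implementation, and the real page size and color mask are machine-dependent.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size of 4KB for illustration; the real value is MD. */
#define EX_PAGE_SHIFT 12u

/* Color of an address: its page index masked by colormask (0..colormask). */
static unsigned
ex_page_color(uintptr_t addr, unsigned colormask)
{
	/* colormask must be (a power of two) - 1. */
	assert(((colormask + 1) & colormask) == 0);
	return (unsigned)((addr >> EX_PAGE_SHIFT) & colormask);
}

/* Smallest page-aligned address >= hint whose color equals `color`. */
static uintptr_t
ex_colormatch(uintptr_t hint, unsigned color, unsigned colormask)
{
	uintptr_t pagesz = (uintptr_t)1 << EX_PAGE_SHIFT;
	uintptr_t va = (hint + pagesz - 1) & ~(pagesz - 1);
	unsigned cur = ex_page_color(va, colormask);

	/* Forward distance in pages, modulo the number of colors. */
	va += (uintptr_t)((color - cur) & colormask) << EX_PAGE_SHIFT;
	return va;
}
```

If a kernel mapping is placed at an address chosen this way, using the color of the first user page, each kernel page has the same color as the corresponding user page, which is the congruence the commit message refers to.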


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans of the LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.
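
The two-pass scheme with an opaque cookie can be pictured with the user-space sketch below. All names (ex_iostate, ex_coredump_write, ex_md_coredump) and the fixed-size buffer are hypothetical stand-ins for this example; the real cookie contents and routine signatures are not given in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sequential sink, standing in for the opaque i/o cookie. */
struct ex_iostate {
	char buf[64];
	size_t off;
};

/* Append-only write: no seeking, so a stream would work as well. */
static int
ex_coredump_write(struct ex_iostate *io, const void *data, size_t len)
{
	if (len > sizeof(io->buf) - io->off)
		return -1;
	memcpy(io->buf + io->off, data, len);
	io->off += len;
	return 0;
}

/* Pass 1 (io == NULL): only report the size. Pass 2: write the data. */
static int
ex_md_coredump(struct ex_iostate *io, size_t *sizep)
{
	static const char regs[] = "md-registers";

	assert(sizep != NULL);
	*sizep = sizeof(regs);
	if (io == NULL)
		return 0;
	return ex_coredump_write(io, regs, sizeof(regs));
}
```

Because the machine-dependent routine only ever calls the write helper, the way the data reaches its destination could change without touching that routine, which appears to be the point of the rework.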


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages() return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
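
A minimal sketch of round-robin bucket selection as described above, not the code from this revision. The struct, its fields, and the function name are hypothetical; the only idea taken from the commit text is that successive allocations hop to the next color bucket in turn.

```c
/*
 * Hypothetical per-allocator state: the number of color buckets and the
 * bucket the next allocation should try first. Not the real UVM layout.
 */
struct color_state {
	unsigned ncolors;	/* must be non-zero */
	unsigned next;		/* next bucket to hand out */
};

/*
 * Return the color bucket for this allocation and advance to the
 * following bucket, wrapping around at ncolors (round-robin).
 */
unsigned
color_pick(struct color_state *cs)
{
	unsigned c = cs->next;

	cs->next = (c + 1) % cs->ncolors;
	return c;
}
```

With four buckets, successive calls return colors 0, 1, 2, 3 and then wrap back to 0.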


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
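
The table above can be read as a lookup function. The following is only a sketch: the OLD_KERN_* enum is a hypothetical stand-in for the retired codes (its numeric values are assumptions), and -1 is an illustrative marker for codes that the table does not map to a single E* value.

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for the retired codes. The numeric values are
 * assumptions chosen only so that this sketch compiles.
 */
enum old_kern_code {
	OLD_KERN_SUCCESS = 0,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old code to the errno value listed in the commit message.
 * Returns -1 for codes with no single replacement (KERN_FAILURE became
 * "various", and the remaining codes were unused).
 */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	default:				return -1;
	}
}
```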


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
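
A rough sketch of the two checks this message describes, under stated assumptions: the struct and function names are hypothetical, the percentages are plain integers, and overflow for very large page counts is ignored. It is not the pagedaemon code from this revision.

```c
#include <stdbool.h>

/* From the commit text: the minimums may not sum to more than 95%. */
#define USAGE_MIN_SUM_LIMIT	95

/* Hypothetical holder for the three percentage tunables. */
struct usage_min {
	int anonmin;
	int vnodemin;
	int vtextmin;
};

/* Accept a set of minimums only if each is non-negative and the sum fits. */
bool
usage_min_valid(const struct usage_min *m)
{
	if (m->anonmin < 0 || m->vnodemin < 0 || m->vtextmin < 0)
		return false;
	return m->anonmin + m->vnodemin + m->vtextmin <= USAGE_MIN_SUM_LIMIT;
}

/*
 * Would reusing one page of this type keep the type at or above its
 * minimum share of pageable memory? type_pages is the current count
 * for the type, min_pct its minimum percentage.
 */
bool
may_reuse_page(unsigned long type_pages, unsigned long pageable_pages,
    int min_pct)
{
	if (type_pages == 0)
		return false;
	return (type_pages - 1) * 100 >= pageable_pages * (unsigned long)min_pct;
}
```

For example, with 100 pageable pages and a 10% minimum, a type holding 11 pages may give one up, while a type holding exactly 10 may not.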


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
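
The sizing rule above amounts to a divide followed by a clamp. A minimal sketch, assuming the bounds are applied lower first and then upper; the function name and parameters are hypothetical and this is not the code from this revision.

```c
#include <stddef.h>

/*
 * Start from one quarter of physical memory (in pages), then apply the
 * lower and upper bounds that machine-dependent code or the kernel
 * config would supply.
 */
size_t
nkmempages_autosize(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n = physpages / 4;

	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```

With 4000 physical pages and bounds of 100 and 2000, the result is 1000; smaller or larger machines land on the respective bound.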


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
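
The three strategies can be sketched over a toy model in which each free list is just a page count. Everything here is hypothetical (the SKETCH_* names, the fixed list count, and the assumption that a lower index means higher priority); it illustrates the selection logic only, not the real uvm_pagealloc_strat().

```c
#define SKETCH_NFREELIST	3

enum sketch_strat {
	SKETCH_STRAT_NORMAL,	/* walk lists from high to low priority */
	SKETCH_STRAT_ONLY,	/* use only the requested list */
	SKETCH_STRAT_FALLBACK	/* try the requested list, then walk */
};

/*
 * Take one page according to the strategy. free_pages[i] is the number
 * of pages on list i; index 0 is assumed to be the highest priority.
 * Returns the index of the list the page came from, or -1 on failure.
 */
int
sketch_pick_freelist(unsigned free_pages[SKETCH_NFREELIST],
    enum sketch_strat strat, int free_list)
{
	int i;

	if (strat == SKETCH_STRAT_ONLY || strat == SKETCH_STRAT_FALLBACK) {
		if (free_list >= 0 && free_list < SKETCH_NFREELIST &&
		    free_pages[free_list] > 0) {
			free_pages[free_list]--;
			return free_list;
		}
		if (strat == SKETCH_STRAT_ONLY)
			return -1;
		/* FALLBACK: fall through to the normal walk. */
	}

	/* NORMAL: take from the first list that has a page. */
	for (i = 0; i < SKETCH_NFREELIST; i++) {
		if (free_pages[i] > 0) {
			free_pages[i]--;
			return i;
		}
	}
	return -1;
}
```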

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.216 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
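
The constraint in the last sentence can be expressed as a simple flag check. This is a sketch only: the SKETCH_FLAG_* bit values are placeholders rather than the real UVM_FLAG_FIXED and UVM_FLAG_UNMAP values, and the function name is hypothetical.

```c
#include <stdbool.h>

/* Placeholder bits; the real flag values are not reproduced here. */
#define SKETCH_FLAG_FIXED	0x1u
#define SKETCH_FLAG_UNMAP	0x2u

/*
 * UNMAP asks for existing entries in the range to be removed, which is
 * only meaningful when the caller also insists on the exact address
 * (FIXED). Reject UNMAP on its own.
 */
bool
sketch_map_flags_valid(unsigned flags)
{
	if ((flags & SKETCH_FLAG_UNMAP) != 0 &&
	    (flags & SKETCH_FLAG_FIXED) == 0)
		return false;
	return true;
}
```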


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This conforms to its documentation in uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits execution of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture,
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
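
The color arithmetic above can be sketched as follows. This is only an
illustration, assuming that a page's color is its page number masked by
colormask and that pages are 4 KiB; SKETCH_PAGE_SHIFT, va_color() and
va_next_with_color() are hypothetical names, not part of UVM.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/* Color of an address: its page number masked by colormask. */
static unsigned
va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> SKETCH_PAGE_SHIFT) & colormask;
}

/*
 * Lowest page-aligned address >= start whose color equals `color`.
 * colormask must have the form 2^n - 1, and color must not exceed it.
 */
static uintptr_t
va_next_with_color(uintptr_t start, unsigned color, unsigned colormask)
{
	const uintptr_t pgsz = (uintptr_t)1 << SKETCH_PAGE_SHIFT;
	uintptr_t va;
	unsigned delta;

	assert((colormask & (colormask + 1)) == 0);
	assert(color <= colormask);

	va = (start + pgsz - 1) & ~(pgsz - 1);	/* round up to a page */
	/* Number of pages to skip, computed modulo (colormask + 1). */
	delta = (color - va_color(va, colormask)) & colormask;
	return va + ((uintptr_t)delta << SKETCH_PAGE_SHIFT);
}
```

If the requested color is that of the first user page, the address returned
has the same color as the user mapping, which is the congruence the commit
message refers to.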


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use it to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the
amount of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
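
The allocation preference can be sketched roughly as below. This is a toy
model, not the UVM code: all names are hypothetical, the CPU count is fixed
at two, there is no locking, and the global list is approximated by scanning
the other CPUs' lists (the commit itself keeps pages on both a per-CPU list
and the global list).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NCPU 2	/* assumption: two CPUs, for brevity */

struct sketch_page {
	struct sketch_page *next;
};

/* One list of free pages per CPU; together they stand in for the pool. */
static struct sketch_page *cpu_free[SKETCH_NCPU];

/* Freeing pushes the page onto the list of the CPU doing the free. */
static void
sketch_page_free(struct sketch_page *pg, unsigned cpu)
{
	assert(pg != NULL && cpu < SKETCH_NCPU);
	pg->next = cpu_free[cpu];
	cpu_free[cpu] = pg;
}

/* Allocation tries the local CPU first, then falls back to the others. */
static struct sketch_page *
sketch_page_alloc(unsigned cpu)
{
	struct sketch_page *pg;
	unsigned i;

	assert(cpu < SKETCH_NCPU);
	for (i = 0; i < SKETCH_NCPU; i++) {
		unsigned c = (cpu + i) % SKETCH_NCPU;

		if ((pg = cpu_free[c]) != NULL) {
			cpu_free[c] = pg->next;
			pg->next = NULL;
			return pg;
		}
	}
	return NULL;	/* no free pages anywhere */
}
```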


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second time to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
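
The trailing-zero trimming can be illustrated with a small helper. This is a
sketch only; trimmed_length() is a hypothetical name and not the routine used
by the ELF coredump code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of buf once trailing zero bytes are dropped.
 * A buffer consisting only of zeroes yields 0.
 */
static size_t
trimmed_length(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

Only zeroes at the end are dropped; zero bytes in the middle of a section
still have to be written.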


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now the same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now a private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
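As a rough illustration of round-robin bucket selection, the sketch below
hands out color indices in rotation. It is not the UVM code: the struct and
function names are hypothetical, and it ignores how a color maps onto the
actual free page queues.

```c
#include <assert.h>

/* Hypothetical state: number of color bins and the next bin to hand out. */
struct color_state {
	unsigned ncolors;
	unsigned next;
};

/*
 * Return the color for the next allocation and advance to the following
 * bin, wrapping around after the last one ("bin hopping").
 */
unsigned
next_color(struct color_state *cs)
{
	assert(cs->ncolors > 0);
	unsigned c = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;
	return c;
}
```

With four bins, successive calls yield 0, 1, 2, 3 and then wrap to 0 again.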


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
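
The table above can be read as a simple translation function. The sketch
below is only an illustration: the OLD_KERN_* enumerators and the helper name
are hypothetical stand-ins, and their numeric values are not taken from the
original headers. Only the name-to-errno pairs come from the table.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the retired codes; numeric values are arbitrary here. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Hypothetical helper: map an old code to the errno value listed above. */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	}
	/* KERN_FAILURE has no single replacement, so it is not modelled. */
	assert(!"unmapped code");
	return EINVAL;
}
```

Note that KERN_NO_SPACE and KERN_RESOURCE_SHORTAGE both collapse to ENOMEM,
so the translation is not reversible.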


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
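
The 95% rule can be sketched as a small validation check. This is an
assumption about the shape of such a check, not the sysctl handler itself;
the function name is hypothetical and the arguments are the three
percentages named above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical check: each minimum is a percentage of pageable memory,
 * and together they may reserve at most 95% of it.
 */
bool
ubc_mins_ok(unsigned anonmin, unsigned vnodemin, unsigned vtextmin)
{
	assert(anonmin <= 100 && vnodemin <= 100 && vtextmin <= 100);
	return anonmin + vnodemin + vtextmin <= 95;
}
```

For example, 10/10/5 is accepted, while 50/30/20 sums to 100 and would be
rejected.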


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
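
A minimal sketch of what a power-of-two alignment constraint means for a
candidate address follows. The helper name is hypothetical and this is not
the uvm_map() implementation; it only shows the usual round-up arithmetic,
and it does not guard against overflow near the top of the address range.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.
 * align must be a non-zero power of two.
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

An address that already satisfies the alignment is returned unchanged.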


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
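
The sizing rule described above (a quarter of physical memory, then clamped
by machine-dependent bounds) can be sketched as follows. The function name
and parameter names are hypothetical; the real code takes its bounds from
NKMEMPAGES_MIN and NKMEMPAGES_MAX rather than from arguments.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: start from physmem / 4 (in pages) and clamp the
 * result to [min_pages, max_pages].
 */
size_t
autosize_nkmempages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
	assert(min_pages <= max_pages);
	size_t n = physmem_pages / 4;
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```

With 1000 pages of physical memory and bounds of 100 and 200 pages, the
quarter (250) exceeds the upper bound, so the result is 200.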


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
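
The three strategies can be modelled with a toy free-list chooser. This is a
sketch under stated assumptions, not uvm_pagealloc_strat() itself: the names
are hypothetical, free lists are reduced to page counts, NFREELIST is an
illustrative constant standing in for VM_NFREELIST, and index 0 is assumed
to be the highest-priority list.

```c
#include <assert.h>

#define NFREELIST 3	/* illustrative stand-in for VM_NFREELIST */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Take one page according to the strategy and return the index of the
 * free list it came from, or -1 if no page could be taken.
 * freecount[i] is the number of pages left on list i.
 */
int
pick_freelist(int freecount[NFREELIST], enum strat strat, int which)
{
	if (strat != STRAT_NORMAL) {
		/* `only' and `fallback' first try the requested list. */
		assert(which >= 0 && which < NFREELIST);
		if (freecount[which] > 0) {
			freecount[which]--;
			return which;
		}
		if (strat == STRAT_ONLY)
			return -1;
	}
	/* `normal' (and the fallback path): walk high -> low priority. */
	for (int i = 0; i < NFREELIST; i++) {
		if (freecount[i] > 0) {
			freecount[i]--;
			return i;
		}
	}
	return -1;
}
```

As in the description above, the `which' argument is ignored for the
`normal' case.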

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.215 21-Dec-2019 ad

Add uvm_free(): returns number of free pages in system.


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interface which do not map anything
into kernel in first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
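
The constraint between the two flags can be expressed as a small validity
check. The sketch below is only an illustration: the EX_FLAG_* bit values
and the function name are hypothetical and are not the values defined in
uvm_extern.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real flags live in uvm_extern.h. */
#define EX_FLAG_FIXED	0x1u	/* use the caller's hint address as-is */
#define EX_FLAG_UNMAP	0x2u	/* unmap existing entries in the range */

/* UNMAP is meaningful only together with FIXED. */
bool
map_flags_valid(unsigned flags)
{
	if ((flags & EX_FLAG_UNMAP) != 0 && (flags & EX_FLAG_FIXED) == 0)
		return false;
	return true;
}
```

FIXED alone and FIXED|UNMAP both pass; UNMAP alone is rejected.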


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses that references the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This is in conformance of its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.
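
The cookie pattern described above can be sketched in plain C as follows.
This is only an illustration, not the uvm_coredump_walkmap() interface:
every name in it (sketch_walkmap, sketch_seg, sketch_proc,
sketch_count_state, sketch_count_seg) is hypothetical. The walker knows
nothing about processes; a callback that needs one finds it in its own
cookie.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's types. */
struct sketch_seg { unsigned long start, end; };
struct sketch_proc { int pid; };

typedef int (*sketch_walk_fn)(const struct sketch_seg *seg, void *cookie);

/* Walk every segment, handing each one to fn together with an opaque cookie. */
int sketch_walkmap(const struct sketch_seg *segs, size_t n,
    sketch_walk_fn fn, void *cookie)
{
    for (size_t i = 0; i < n; i++) {
        int error = fn(&segs[i], cookie);
        if (error != 0)
            return error;   /* stop at the first failure */
    }
    return 0;
}

/* Caller-defined state: the process pointer travels inside the cookie. */
struct sketch_count_state {
    struct sketch_proc *proc;
    size_t nsegs;
    unsigned long bytes;
};

/* Callback: counts segments and bytes; reaches the process via the cookie. */
int sketch_count_seg(const struct sketch_seg *seg, void *cookie)
{
    struct sketch_count_state *st = cookie;

    assert(st->proc != NULL);
    st->nsegs++;
    st->bytes += seg->end - seg->start;
    return 0;
}
```

Keeping the walker's signature free of the process pointer means callers
that do not need it pay nothing, and callers that do simply add a field to
their own state structure.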


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
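
A minimal sketch of what "color" means in the description above, assuming
a 4 KiB page and a colormask of the form 2^n - 1. The names
SKETCH_PAGE_SHIFT, sketch_page_color and sketch_next_addr_with_color are
hypothetical and not part of UVM; the real allocator's search is more
involved than this.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u   /* assumed 4 KiB pages */

/* Color of an address: its page number masked by colormask (0..colormask). */
unsigned sketch_page_color(uintptr_t addr, unsigned colormask)
{
    return (unsigned)((addr >> SKETCH_PAGE_SHIFT) & colormask);
}

/*
 * Lowest page-aligned address at or above hint whose color equals the
 * requested color. colormask is assumed to be 2^n - 1.
 */
uintptr_t sketch_next_addr_with_color(uintptr_t hint, unsigned color,
    unsigned colormask)
{
    const uintptr_t page_size = (uintptr_t)1 << SKETCH_PAGE_SHIFT;
    uintptr_t va = (hint + page_size - 1) & ~(page_size - 1);
    unsigned delta;

    assert(color <= colormask);
    /* Pages to skip; unsigned wraparound plus the mask keeps it in range. */
    delta = (color - sketch_page_color(va, colormask)) & colormask;
    return va + ((uintptr_t)delta << SKETCH_PAGE_SHIFT);
}
```

Two mappings of the same physical page are congruent in this sense when
their virtual addresses have the same color, which is the property the
commit wants for kernel mappings of user pages.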


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in the UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
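
From userland, the RLIMIT_AS limit introduced above can be inspected with
the standard getrlimit(2) call. The following is a minimal sketch; the
helper names sketch_get_as_limit and sketch_print_limit are hypothetical,
and the example assumes a system whose <sys/resource.h> defines RLIMIT_AS.

```c
#include <sys/types.h>
#include <sys/resource.h>
#include <assert.h>
#include <stdio.h>

/* Fetch the address-space limit; returns 0 on success, -1 with errno set. */
int sketch_get_as_limit(struct rlimit *rl)
{
    assert(rl != NULL);
    return getrlimit(RLIMIT_AS, rl);
}

/* Print one limit value, spelling out the "unlimited" case. */
void sketch_print_limit(const char *label, rlim_t value)
{
    if (value == RLIM_INFINITY)
        printf("%s: unlimited\n", label);
    else
        printf("%s: %llu bytes\n", label, (unsigned long long)value);
}
```

A caller would fetch the limit once and pass rl.rlim_cur and rl.rlim_max
to sketch_print_limit(); with the defaults described in the commit, both
would be reported as unlimited.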


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the former is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second time to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
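
Trimming trailing zeroes, as described above, amounts to finding the
length of a buffer once its all-zero tail is dropped. A minimal sketch
follows; sketch_trimmed_len is a hypothetical name, and this is not the
code the ELF core dump path actually uses.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return how many leading bytes of buf remain after trailing zero bytes
 * are dropped. An all-zero buffer yields 0.
 */
size_t sketch_trimmed_len(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0 && buf[len - 1] == 0)
        len--;
    return len;
}
```

Only the tail is trimmed; zero bytes in the middle of the data are kept,
so the offsets of the remaining bytes within the section do not change.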


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space are advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check whether all requested pages were found.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
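
A rough illustration of round-robin bucket selection, written as a standalone toy model rather than the code from this commit. The names `NCOLORS`, `struct color_bins`, and `alloc_color()` are hypothetical, and the fixed bucket count mirrors only the "constant defined by MD code" state described above:

```c
#define NCOLORS 4	/* assumption: fixed bucket count, as in this revision */

/* Toy state: free-page count per color bucket plus the next color to try. */
struct color_bins {
	int nfree[NCOLORS];
	unsigned next;
};

/*
 * Hypothetical allocator: start at the next color in rotation, skip empty
 * buckets, and advance the rotation past the bucket that was used.
 * Returns the color chosen, or -1 if every bucket is empty.
 */
static int
alloc_color(struct color_bins *cb)
{
	for (unsigned tries = 0; tries < NCOLORS; tries++) {
		unsigned c = (cb->next + tries) % NCOLORS;

		if (cb->nfree[c] > 0) {
			cb->nfree[c]--;
			cb->next = (c + 1) % NCOLORS;
			return (int)c;
		}
	}
	return -1;
}
```

Successive calls hand out pages of consecutive colors, so consecutive allocations tend to land in different cache bins; how the real allocator tracks its rotation state is not stated in the log.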


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
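
The table above can be read as a simple translation function. The sketch below is only an illustration of the mapping: the commit replaced the codes at each call site rather than adding a translator, and the `OLD_KERN_*` enumerators, their numeric values, and `kern_to_errno()` are hypothetical names invented here. KERN_FAILURE and the unused codes are left out because the table gives them no single errno:

```c
#include <errno.h>

/*
 * Illustrative stand-ins for the old Mach-style codes. The numeric
 * values are assumptions for this sketch, not the historical ones.
 */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Hypothetical helper: translate an old code per the table above. */
static int
kern_to_errno(enum old_kern_code kc)
{
	switch (kc) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	}
	return EINVAL;	/* sketch default; not part of the commit's table */
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation is not reversible.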


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
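
A minimal sketch of the two checks this message describes, assuming the thresholds are plain integer percentages. Both helper names are hypothetical and the real sysctl and pagedaemon code may be structured quite differently:

```c
#include <stdbool.h>

/*
 * Hypothetical sysctl-side check: may `newval' replace one minimum,
 * given the other two? The minimums may not sum to more than 95%.
 */
static bool
min_ok(int newval, int other1, int other2)
{
	if (newval < 0 || other1 < 0 || other2 < 0)
		return false;
	return newval + other1 + other2 <= 95;
}

/*
 * Hypothetical pagedaemon-side check: reusing one page of a type is
 * allowed only if the remaining count stays at or above min_pct percent
 * of pageable memory.
 */
static bool
can_reuse(long type_pages, long pageable_pages, int min_pct)
{
	if (type_pages <= 0)
		return false;
	return (type_pages - 1) * 100 >= pageable_pages * min_pct;
}
```

With 100 pageable pages and a 10% minimum, a type holding 11 pages may give one up, while a type holding exactly 10 may not.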


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
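
The sizing rule described above amounts to a divide followed by a clamp. The following is a sketch under that reading only; `autosize_nkmempages()` and its parameter names are hypothetical, and the actual code may handle the bounds differently:

```c
#include <stddef.h>

/*
 * Hypothetical helper: take a quarter of physical memory (in pages) and
 * clamp it to the bounds supplied by machine-dependent code or by the
 * NKMEMPAGES_MIN / NKMEMPAGES_MAX options.
 */
static size_t
autosize_nkmempages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n = physpages / 4;

	if (n > max_pages)
		n = max_pages;
	if (n < min_pages)
		n = min_pages;
	return n;
}
```

For example, 16384 physical pages with bounds of 1024 and 8192 yields 4096, while 65536 physical pages is capped at 8192.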


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.214 16-Dec-2019 ad

- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).


Revision tags: netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools' nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
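
The constraint in the last sentence can be expressed as a small validity check. This is a sketch only: the bit values below are placeholders rather than the real definitions in uvm_extern.h, and `flags_valid()` is a hypothetical helper, not a UVM function:

```c
#include <stdbool.h>

/* Placeholder bit values for the sketch, not the real definitions. */
#define X_FLAG_FIXED	0x1u
#define X_FLAG_UNMAP	0x2u

/* UNMAP without FIXED is rejected; every other combination is accepted. */
static bool
flags_valid(unsigned flags)
{
	if ((flags & X_FLAG_UNMAP) != 0 && (flags & X_FLAG_FIXED) == 0)
		return false;
	return true;
}
```

Under this reading, an mmap()-style MAP_FIXED would set both bits, while a caller that only wants the hint address used as given would set the FIXED bit alone.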


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
creating a second range of virtual addresses that references the same
physical pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This is in conformance of its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adopt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process, its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits the execution of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture, the
MAP_NOSYSCALLS flag is simply ignored with virtually no CPU cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
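
The color arithmetic described above can be sketched in plain C. This is
only an illustration under stated assumptions, not the UVM code: the names
sketch_color_of() and sketch_colormatch() are hypothetical, a 4 KiB page
size is assumed, and the number of colors is assumed to be a power of two
so that colormask is (ncolors - 1).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/* Color of an address: its page number masked by colormask. */
static unsigned
sketch_color_of(uintptr_t addr, unsigned colormask)
{
	return (unsigned)(addr >> SKETCH_PAGE_SHIFT) & colormask;
}

/*
 * Return the lowest page-aligned address >= hint whose color equals
 * `color' (0..colormask).
 */
static uintptr_t
sketch_colormatch(uintptr_t hint, unsigned color, unsigned colormask)
{
	const uintptr_t pagesize = (uintptr_t)1 << SKETCH_PAGE_SHIFT;
	uintptr_t va;
	unsigned cur;

	assert(color <= colormask);

	/* Round the hint up to a page boundary. */
	va = (hint + pagesize - 1) & ~(pagesize - 1);
	cur = sketch_color_of(va, colormask);

	/* Skip forward by the number of pages needed to reach `color'. */
	va += (uintptr_t)((color - cur) & colormask) << SKETCH_PAGE_SHIFT;
	return va;
}
```

If the requested color is taken from the first user page being mapped, the
address returned has the same color as that page, which is what makes the
kernel mapping congruent with the user mapping in this model.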


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from below 4GB,
reducing the amount of bounce buffering needed with 32-bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in the UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage.


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
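
The per-CPU freelist behaviour in the last item can be modelled with a small
C sketch. It is a toy model under stated assumptions, not the UVM
implementation: every name here (sketch_page, sketch_freelists,
sketch_page_free, sketch_page_alloc) is hypothetical, there is no locking,
and no freelist/color buckets are modelled. Each free page sits on two
lists at once, the global one and the list of the CPU that freed it.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NCPU 2

struct sketch_page {
	struct sketch_page *gprev, *gnext;	/* global list linkage */
	struct sketch_page *cprev, *cnext;	/* per-CPU list linkage */
	int cpu;				/* CPU whose list holds the page */
};

struct sketch_freelists {
	struct sketch_page *global;
	struct sketch_page *percpu[SKETCH_NCPU];
};

/* Freeing: put the page on the local CPU's list and on the global list. */
static void
sketch_page_free(struct sketch_freelists *fl, struct sketch_page *pg, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	pg->cpu = cpu;

	pg->gprev = NULL;
	pg->gnext = fl->global;
	if (fl->global != NULL)
		fl->global->gprev = pg;
	fl->global = pg;

	pg->cprev = NULL;
	pg->cnext = fl->percpu[cpu];
	if (fl->percpu[cpu] != NULL)
		fl->percpu[cpu]->cprev = pg;
	fl->percpu[cpu] = pg;
}

/* Allocating: prefer the local CPU's list, else fall back to the global. */
static struct sketch_page *
sketch_page_alloc(struct sketch_freelists *fl, int cpu)
{
	struct sketch_page *pg;

	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	pg = fl->percpu[cpu];
	if (pg == NULL)
		pg = fl->global;
	if (pg == NULL)
		return NULL;

	/* Unlink from the global list. */
	if (pg->gprev != NULL)
		pg->gprev->gnext = pg->gnext;
	else
		fl->global = pg->gnext;
	if (pg->gnext != NULL)
		pg->gnext->gprev = pg->gprev;

	/* Unlink from the list of whichever CPU freed the page. */
	if (pg->cprev != NULL)
		pg->cprev->cnext = pg->cnext;
	else
		fl->percpu[pg->cpu] = pg->cnext;
	if (pg->cnext != NULL)
		pg->cnext->cprev = pg->cprev;

	return pg;
}
```

Keeping each page on both lists is what lets the fallback path take a page
freed by another CPU without searching every per-CPU list.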


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
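
To make the "N bits" figures concrete, here is a hedged sketch of how a
random value can be turned into a page-aligned displacement with a given
number of bits of entropy. The function name sketch_aslr_delta() and the
4 KiB page size are assumptions for illustration; the sketch does not claim
to reproduce the kernel's own macro or its choice of shift for each region.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/*
 * Keep the low `bits' bits of a random value and scale them to pages.
 * With bits == 12 there are 4096 possible page-aligned displacements.
 */
static uint64_t
sketch_aslr_delta(uint64_t rnd, unsigned bits)
{
	uint64_t mask;

	assert(bits > 0 && bits < 32);
	mask = ((uint64_t)1 << bits) - 1;
	return (rnd & mask) << SKETCH_PAGE_SHIFT;
}
```

The caller would pass whichever width applies (12 for the stack, program
and data segment, or the mmap width chosen for the platform) and add the
result to, or subtract it from, the base address being randomized.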


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
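
As a rough illustration of the "when it's safe" condition, the sketch below
decides whether a block must be read before it is written. The rule used
here (a write that covers the whole block needs no prior read) is an
assumption chosen for the example; the commit does not spell out its exact
criteria, and sketch_need_read() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the block [blkoff, blkoff + blksize) must be read
 * before applying a write of wlen bytes at woff (read-modify-write),
 * false if the write replaces every byte of the block.
 */
static bool
sketch_need_read(uint64_t blkoff, uint64_t blksize,
    uint64_t woff, uint64_t wlen)
{
	const uint64_t blkend = blkoff + blksize;
	const uint64_t wend = woff + wlen;

	assert(blksize > 0);
	/* Full coverage: nothing of the old contents survives. */
	if (woff <= blkoff && wend >= blkend)
		return false;
	return true;
}
```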


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.
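
The opaque-cookie pattern and the two-pass call described above can be
sketched as follows. This is a self-contained model, not the kernel code:
struct sketch_iostate, sketch_coredump_write() and sketch_md_coredump() are
hypothetical stand-ins, and the "backend" is just a fixed in-memory buffer
that only ever appends.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Backend state; machine-dependent code never looks inside it. */
struct sketch_iostate {
	unsigned char buf[64];
	size_t off;
};

/* The only i/o entry point: append-only, so it suits a stream. */
static int
sketch_coredump_write(struct sketch_iostate *io, const void *data, size_t len)
{
	if (len > sizeof(io->buf) - io->off)
		return -1;
	memcpy(io->buf + io->off, data, len);
	io->off += len;
	return 0;
}

/*
 * MD-style routine.  First pass (io == NULL): only report how many
 * bytes will be written.  Second pass: hand the bytes to the writer.
 */
static int
sketch_md_coredump(struct sketch_iostate *io, size_t *sizep)
{
	static const unsigned char regs[4] = { 0xde, 0xad, 0xbe, 0xef };

	if (io == NULL) {
		*sizep = sizeof(regs);
		return 0;
	}
	return sketch_coredump_write(io, regs, sizeof(regs));
}
```

Because the MD routine only forwards the cookie, the backend can be swapped
(file, socket, buffer) without touching it, which is the point the commit
message makes.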


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
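
Trimming trailing zeroes amounts to finding the length of a section once
its final run of zero bytes is dropped. A minimal sketch, with the
hypothetical name sketch_trimmed_len():

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the number of leading bytes of buf that must be written so
 * that everything after them, up to len, is zero.
 */
static size_t
sketch_trimmed_len(const unsigned char *buf, size_t len)
{
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

In an ELF file this maps naturally onto a segment whose file size is smaller
than its memory size, since the remainder is defined to be zero-filled.
Zero bytes in the middle of the data are kept; only the tail is dropped.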


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
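
A minimal userland sketch of the pattern argued for above: attempt the
copy and handle whatever error comes back, instead of pre-checking
permissions. fake_copyin() and read_user_int() are hypothetical
stand-ins for illustration only; they are not the kernel's copyin().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copyin(): returns 0 on success or an errno
 * value on failure.  A NULL source simulates a faulting user address.
 */
static int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return EFAULT;
	memcpy(kaddr, uaddr, len);
	return 0;
}

/*
 * Read an int from "user space".  There is no up-front permission
 * check: the copy itself reports failure, whatever the cause.
 */
static int
read_user_int(const void *uaddr, int *out)
{
	return fake_copyin(uaddr, out, sizeof(*out));
}
```

The caller has a single error path to handle, which covers both
permission problems and any other reason the access might fail.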


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space are advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages() return the number of pages found so that the caller can
easily check whether all requested pages were found.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
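
A minimal sketch of round-robin bucket selection, assuming a simple
counter that wraps at the number of buckets. struct color_state and
color_pick() are hypothetical names; the log does not show how UVM
actually tracks the next bucket.

```c
#include <assert.h>

/* Hypothetical allocator state, for illustration only. */
struct color_state {
	unsigned int ncolors;	/* number of buckets, must be > 0 */
	unsigned int next;	/* bucket to hand out next */
};

/* Return the current bucket and advance to the following one. */
static unsigned int
color_pick(struct color_state *cs)
{
	unsigned int c = cs->next;

	cs->next = (c + 1) % cs->ncolors;
	return c;
}
```

Successive allocations therefore cycle through every bucket before any
bucket is reused.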


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
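
The mapping above can be sketched as a lookup function. The enum below
is hypothetical (its numeric values are not the historical ones), and
KERN_FAILURE is left out because the log says it has no single
replacement.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical enum; values are illustrative, not the historical ones. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
static int
kern_to_errno(enum kern_code kc)
{
	switch (kc) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return EINVAL;	/* not reached for the codes listed above */
}
```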


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
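
A minimal sketch of the 95% rule described above, assuming the check is
applied to all three percentages together. struct usage_min and
usage_min_check() are hypothetical; the real sysctl handler may
validate differently.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical container for the three tunables, in percent. */
struct usage_min {
	int anonmin;
	int vnodemin;
	int vtextmin;
};

/* Return 0 if the settings are acceptable, EINVAL otherwise. */
static int
usage_min_check(const struct usage_min *m)
{
	if (m->anonmin < 0 || m->vnodemin < 0 || m->vtextmin < 0)
		return EINVAL;
	/* Keep at least 5% of pageable memory free for reuse. */
	if (m->anonmin + m->vnodemin + m->vtextmin > 95)
		return EINVAL;
	return 0;
}
```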


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. This is
beneficial for the MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
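
For illustration, a minimal sketch of what a minimum power-of-two
alignment means for a candidate address. align_up() is a hypothetical
helper, not the code uvm_map() uses internally.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.
 * align must be a nonzero power of two.
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged.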


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.
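
A minimal sketch of the idle-zeroing loop described above: zero pages
until the target is reached or a "something is runnable" indicator
becomes non-zero. idle_zero(), SK_PAGE_SIZE and the runnable flag are
hypothetical; the real uvm_pageidlezero() works on free-list queues
and whichqs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_PAGE_SIZE 64		/* tiny page size, for illustration */

/*
 * Zero up to `target` pages, checking *runnable before each page so
 * that the loop stops as soon as there is real work to do.
 * Returns the number of pages zeroed.
 */
static size_t
idle_zero(unsigned char pages[][SK_PAGE_SIZE], size_t target,
    const volatile int *runnable)
{
	size_t done = 0;

	while (done < target && *runnable == 0) {
		memset(pages[done], 0, SK_PAGE_SIZE);
		done++;
	}
	return done;
}
```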


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
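
A minimal sketch of the sizing rule described above: a quarter of
physical memory, clamped to configured bounds. kmem_autosize() and its
parameters are hypothetical names, and the order in which the bounds
are applied is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return physpages / 4, clamped to [min_pages, max_pages].
 * All quantities are in pages.
 */
static size_t
kmem_autosize(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n = physpages / 4;

	if (n > max_pages)
		n = max_pages;
	if (n < min_pages)
		n = min_pages;
	return n;
}
```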


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.
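
A minimal sketch of how a stack given as a low address and a size can
be turned into an initial stack pointer. initial_sp() and the
grows_down parameter are hypothetical; real machine-dependent code also
deals with alignment and frame setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the starting stack pointer for a stack area described by its
 * lowest address and its size.
 */
static uintptr_t
initial_sp(uintptr_t stack_low, size_t stack_size, int grows_down)
{
	/* A downward-growing stack starts at the top of the area. */
	return grows_down ? stack_low + stack_size : stack_low;
}
```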

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
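
The three strategies above can be sketched as a free-list selection
function. This is a simplified model under stated assumptions: each
free list is reduced to a page count, index 0 is the highest priority,
and pick_freelist(), enum strat and NFREELIST are hypothetical names.

```c
#include <assert.h>

#define NFREELIST 3	/* hypothetical number of free lists */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * free_pages[i] is the number of free pages on list i; index 0 is the
 * highest priority.  Returns the chosen list, or -1 if none qualifies.
 */
static int
pick_freelist(const int free_pages[NFREELIST], enum strat strat, int want)
{
	int i;

	if (strat == STRAT_ONLY || strat == STRAT_FALLBACK) {
		if (free_pages[want] > 0)
			return want;
		if (strat == STRAT_ONLY)
			return -1;	/* `only' never looks elsewhere */
	}

	/* `normal' walk (also the fallback path); `want' is ignored. */
	for (i = 0; i < NFREELIST; i++) {
		if (free_pages[i] > 0)
			return i;
	}
	return -1;
}
```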

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


Revision tags: isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.213 28-May-2018 chs

allow tmpfs files to be larger than 4GB.


Revision tags: pgoyette-compat-0521
# 1.212 19-May-2018 jdolecek

Remove emap support. Unfortunately it never got to a state where it would be
used and usable, due to reliability and limited & complicated MD support.

Going forward, we need to concentrate on interfaces which do not map anything
into the kernel in the first place (such as direct map or KVA-less I/O), rather
than making those mappings cheaper to do.


# 1.211 08-May-2018 christos

don't store the rssmax in the lwp rusage, it is a per proc property. Instead
utilize an unused field in the vmspace struct to store it. Also conditionalize
on platforms that have pmap statistics available.


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.210 20-Apr-2018 jdolecek

add prot parameter for uvm_emap_enter(), so that it's possible to
enter also read/write mappings


# 1.209 20-Apr-2018 jdolecek

make ubc_alloc() and ubc_release() static, they should not be used
outside of ubc_uiomove()/ubc_zeropage(); for now mark as noinline
to keep them available as breakpoints


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.208 15-Dec-2017 maya

branches: 1.208.2;
Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: netbsd-8-0-RELEASE netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
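
A minimal sketch of the flag constraint described above. The bit values
and the names SK_FLAG_FIXED, SK_FLAG_UNMAP and check_map_flags() are
hypothetical, and returning EINVAL is an assumption; the kernel may
enforce the rule differently (for example with an assertion).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bit values, for illustration only. */
#define SK_FLAG_FIXED	0x1u
#define SK_FLAG_UNMAP	0x2u

/* Reject UNMAP unless FIXED is also set. */
static int
check_map_flags(unsigned int flags)
{
	if ((flags & SK_FLAG_UNMAP) != 0 && (flags & SK_FLAG_FIXED) == 0)
		return EINVAL;
	return 0;
}
```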


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.
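
A minimal sketch of the "fail instead of silently dropping X" rule,
using the standard PROT_* constants from <sys/mman.h>. wx_check() is a
hypothetical helper and the EACCES return value is an assumption; the
actual PaX code takes more state into account.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/* Refuse a request that asks for write and execute at the same time. */
static int
wx_check(int prot)
{
	const int wx = PROT_WRITE | PROT_EXEC;

	if ((prot & wx) == wx)
		return EACCES;
	return 0;
}
```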

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
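The color arithmetic described in this commit can be illustrated with a minimal userland sketch. It assumes 4 KiB pages and a power-of-two number of colors; the names prefixed with ex_ are hypothetical and are not part of the UVM API. It only shows how an address with a requested color could be chosen, not how uvm_map actually does it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; real ports define their own. */
#define EX_PAGE_SHIFT 12u                          /* 4 KiB pages (assumption) */
#define EX_PAGE_SIZE  ((uintptr_t)1 << EX_PAGE_SHIFT)

/* Color of an address: its page number modulo the number of colors. */
static unsigned
ex_page_color(uintptr_t addr, unsigned colormask)
{
	return (unsigned)(addr >> EX_PAGE_SHIFT) & colormask;
}

/*
 * Return the lowest page-aligned address >= hint whose color equals
 * `color`.  colormask is (number of colors - 1), and the number of
 * colors must be a power of two.
 */
static uintptr_t
ex_colormatch(uintptr_t hint, unsigned color, unsigned colormask)
{
	assert(((colormask + 1) & colormask) == 0);
	assert(color <= colormask);

	/* Round the hint up to a page boundary. */
	uintptr_t va = (hint + EX_PAGE_SIZE - 1) & ~(EX_PAGE_SIZE - 1);
	unsigned cur = ex_page_color(va, colormask);

	/* Skip forward by the number of pages needed to reach the color. */
	unsigned delta = (color - cur) & colormask;
	return va + ((uintptr_t)delta << EX_PAGE_SHIFT);
}
```

If the requested color is taken from the first user page, a kernel address chosen this way has the same color as the user mapping, which is the congruence the commit message refers to.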


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use it to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth to move the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.
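The two-pass calling convention described above (a first call with a NULL iocookie to size things, a second call to write) might look roughly like the following sketch. Everything here is hypothetical: ex_iocookie, ex_coredump_write and ex_cpu_coredump are illustrative stand-ins with invented signatures, and the cookie is an in-memory buffer rather than the kernel's real writer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opaque cookie: an in-memory sink standing in for real i/o. */
struct ex_iocookie {
	char buf[64];
	size_t used;
};

/* Hypothetical stand-in for coredump_write(): append bytes sequentially. */
static int
ex_coredump_write(struct ex_iocookie *ck, const void *data, size_t len)
{
	if (len > sizeof(ck->buf) - ck->used)
		return -1;              /* sink is full */
	memcpy(ck->buf + ck->used, data, len);
	ck->used += len;
	return 0;
}

/*
 * Hypothetical MD hook.  With a NULL cookie it only reports how many
 * bytes it will emit; with a real cookie it writes them.
 */
static int
ex_cpu_coredump(struct ex_iocookie *ck, size_t *md_size)
{
	static const char regs[] = "REGS";      /* fake register dump */

	if (ck == NULL) {
		*md_size = sizeof(regs);        /* pass 1: size only */
		return 0;
	}
	return ex_coredump_write(ck, regs, sizeof(regs)); /* pass 2: write */
}
```

Because the writer only ever appends, the caller never needs to seek, which is what makes this style of dump suitable for a stream.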


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
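The reasoning above (attempt the access and handle the error, rather than pre-checking permissions) can be sketched in userland as follows. This is only an illustration under stated assumptions: ex_urange and ex_copyin are hypothetical mocks, not the kernel's copyin(), and the io_error field simply simulates a page-in failure that a permission check alone could not have predicted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user address range, for illustration only. */
struct ex_urange {
	const void *base;   /* backing bytes */
	size_t len;         /* size of the range */
	int readable;       /* mapping permissions allow reads */
	int io_error;       /* paging in the data would fail */
};

/* Mock of a copyin()-style routine: 0 on success, an errno value on failure. */
static int
ex_copyin(const struct ex_urange *u, void *dst, size_t len)
{
	if (!u->readable || len > u->len)
		return EFAULT;          /* insufficient permission or range */
	if (u->io_error)
		return EFAULT;          /* fails although permissions were fine */
	memcpy(dst, u->base, len);
	return 0;
}

/* Preferred pattern: just attempt the access and propagate any error. */
static int
ex_fetch_int(const struct ex_urange *u, int *out)
{
	return ex_copyin(u, out, sizeof(*out));
}
```

A caller that had first asked "is this range readable?" would have been told yes for the io_error case and then failed anyway, so the single checked call is both simpler and correct for every failure mode.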


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages() return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
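
A rough sketch of round-robin bucket selection, assuming a single global
cursor that advances on every allocation. The struct and function names are
hypothetical and the real allocator keeps more state than this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocator state; the names are illustrative only. */
struct color_state {
	unsigned int ncolors;	/* number of color buckets, must be > 0 */
	unsigned int next;	/* bucket the next allocation will use */
};

/* Return the bucket for this allocation, then hop to the next one. */
unsigned int
color_next(struct color_state *cs)
{
	unsigned int color;

	assert(cs != NULL && cs->ncolors > 0);
	color = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;
	return color;
}
```

With four buckets, successive calls yield 0, 1, 2, 3 and then wrap to 0.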


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
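
The mapping above could be written down as a small translation function. This
is only a sketch: the OLD_KERN_* enumerators are illustrative stand-ins with
arbitrary numeric values, and old_kern_to_errno() is a hypothetical helper,
not something the commit added. KERN_FAILURE has no single replacement, so the
sketch reports it as -1:

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old code according to the table in the commit message.
 * Returns -1 for OLD_KERN_FAILURE, which has no single errno equivalent.
 */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	case OLD_KERN_FAILURE:			return -1;
	}
	assert(!"unknown code");	/* not reached for valid input */
	return -1;
}
```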


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
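
A sketch of the 95% rule, assuming the three minimums are plain integer
percentages and that an over-limit update is rejected with EINVAL. The error
code, the struct and the function name are assumptions made for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MIN_SUM_LIMIT	95	/* limit stated in the commit message */

/* Hypothetical holder for the three percentages. */
struct usage_min {
	int anonmin;
	int vnodemin;
	int vtextmin;
};

/*
 * Install new minimums only if they leave some memory reusable.
 * On rejection the old values are kept unchanged.
 */
int
usage_min_set(struct usage_min *um, int anonmin, int vnodemin, int vtextmin)
{
	assert(um != NULL);
	if (anonmin < 0 || vnodemin < 0 || vtextmin < 0)
		return EINVAL;
	if (anonmin + vnodemin + vtextmin > MIN_SUM_LIMIT)
		return EINVAL;	/* assumed error code */
	um->anonmin = anonmin;
	um->vnodemin = vnodemin;
	um->vtextmin = vtextmin;
	return 0;
}
```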


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. This is
beneficial for the MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.
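
A userland sketch of an abortable page zero, assuming the page is cleared in
chunks and a flag is polled between chunks. The `runnable` flag is a
hypothetical stand-in for whatever the MD code checks, and page_idle_zero() is
not the real hook:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero `pagesize` bytes in `chunk`-sized steps.  Returns true if the
 * whole page was zeroed, false if the work was abandoned because
 * *runnable became non-zero.
 */
bool
page_idle_zero(unsigned char *page, size_t pagesize, size_t chunk,
    const volatile int *runnable)
{
	size_t off, n = 0;

	assert(page != NULL && runnable != NULL && chunk > 0);
	for (off = 0; off < pagesize; off += n) {
		if (*runnable)
			return false;	/* page is at most partly zeroed */
		n = pagesize - off < chunk ? pagesize - off : chunk;
		memset(page + off, 0, n);
	}
	return true;
}
```

A false return tells the caller that the page cannot be treated as zeroed.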


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
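
For reference, rounding an address up to a power-of-two boundary is usually
done with a mask. The helper below is a generic sketch with a hypothetical
name and is not taken from uvm_map():

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	/* A power of two has exactly one bit set. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged.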


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
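
The sizing rule described above amounts to a divide followed by a clamp. A
minimal sketch, assuming the quantity is counted in pages; the function and
parameter names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Take a quarter of physical memory (in pages), then clamp the result
 * to the bounds supplied by machine-dependent code or the kernel config.
 */
size_t
autosize_nkmempages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t npages;

	assert(min_pages <= max_pages);
	npages = physpages / 4;
	if (npages < min_pages)
		npages = min_pages;
	if (npages > max_pages)
		npages = max_pages;
	return npages;
}
```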


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
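
A sketch of how a (low address, size) pair can be turned into an initial
stack pointer once the growth direction is known. This is a simplification
under stated assumptions: real machine-dependent code also applies alignment
and frame adjustments, and the function name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the starting stack pointer for a stack area given as its lowest
 * address and its size.  A downward-growing stack starts at the top of
 * the area, an upward-growing one at the bottom.
 */
uintptr_t
initial_stack_pointer(uintptr_t stack_low, size_t stack_size, bool grows_down)
{
	assert(stack_size > 0);
	return grows_down ? stack_low + stack_size : stack_low;
}
```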


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
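
The three strategies can be modelled over an array of per-list free page
counts. This is an illustrative sketch only: the STRAT_* names, page_take()
and the convention that index 0 is the highest-priority list are assumptions,
and the real allocator manipulates page queues rather than counters:

```c
#include <assert.h>
#include <stddef.h>

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/* Walk the lists from highest priority (index 0) to lowest. */
static int
take_normal(size_t *nfree, size_t nlists)
{
	size_t i;

	for (i = 0; i < nlists; i++) {
		if (nfree[i] > 0) {
			nfree[i]--;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Take one page.  Returns the index of the list it came from, or -1
 * if the request cannot be satisfied under the given strategy.
 */
int
page_take(size_t *nfree, size_t nlists, enum strat strat, size_t free_list)
{
	assert(nfree != NULL && nlists > 0);
	if (strat == STRAT_NORMAL)
		return take_normal(nfree, nlists);	/* free_list ignored */
	assert(free_list < nlists);
	if (nfree[free_list] > 0) {
		nfree[free_list]--;
		return (int)free_list;
	}
	if (strat == STRAT_FALLBACK)
		return take_normal(nfree, nlists);	/* `only' failed */
	return -1;
}
```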

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.208 15-Dec-2017 maya

Match locking notes with reality.
misc_lock is used to protect vm_refcnt.

ok chuq


Revision tags: tls-maxphys-base-20171202
# 1.207 02-Dec-2017 mrg

add two new members to uvmexp_sysctl{}: bootpages and poolpages.
bootpages is set to the pages allocated via uvm_pageboot_alloc().
poolpages is calculated from the list of pools nr_pages members.

this brings us closer to having a valid total of pages known by
the system, vs actual pages originally managed.

XXX: poolpages needs some handling for PR_RECURSIVE pools still.


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base
# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
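
The dependency between the two flags can be expressed as a simple check. The
bit values and names below are hypothetical placeholders rather than the real
UVM_FLAG_* definitions, and returning EINVAL is an assumption about how a
caller might report the misuse:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SK_FLAG_FIXED	0x1u
#define SK_FLAG_UNMAP	0x2u

/* Reject a request that asks for UNMAP without FIXED. */
int
map_flags_check(unsigned int flags)
{
	/* The sketch knows only these two flags. */
	assert((flags & ~(SK_FLAG_FIXED | SK_FLAG_UNMAP)) == 0);
	if ((flags & SK_FLAG_UNMAP) != 0 && (flags & SK_FLAG_FIXED) == 0)
		return EINVAL;	/* assumed error code */
	return 0;
}
```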


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL-only and a non-_KERNEL API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel-only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to the kernel
only.

This conforms to its documentation in uvm_hotplug(9) as a
kernel-internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
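
The color arithmetic described above can be sketched as follows. This is an
illustration only, not the UVM implementation: the ex_ names, the 4 KiB page
size, and the assumption that colormask has the form 2^n - 1 are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/* Color of an address: its page number taken modulo the color count. */
static unsigned
ex_va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> EX_PAGE_SHIFT) & colormask;
}

/* Lowest page-aligned address >= hint whose color equals 'color'. */
static uintptr_t
ex_colormatch(uintptr_t hint, unsigned color, unsigned colormask)
{
	uintptr_t pagesz = (uintptr_t)1 << EX_PAGE_SHIFT;
	uintptr_t va = (hint + pagesz - 1) & ~(pagesz - 1);

	assert(color <= colormask);
	while (ex_va_color(va, colormask) != color)
		va += pagesz;
	return va;
}
```

If the requested color is that of the first user page, the chosen kernel
address has the same color as the user mapping, which is the congruence the
message refers to.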


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to lazyness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.
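
A rough sketch of the "one function, many macro front ends" pattern described
above. Everything here is hypothetical (the ex_ names, the argument order, the
flag value, and the "no preferred offset" sentinel); the stand-in function only
records its arguments rather than allocating kernel virtual address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_NO_PREFER	((uintptr_t)-1)	/* made-up "no preferred offset" */
#define EX_WAITOK	0x1u		/* made-up "may sleep" flag */

/* Records what the single back end was asked to do. */
struct ex_req {
	size_t size;
	size_t align;
	uintptr_t prefer;
	unsigned flags;
};

/* The one real function: every variant funnels into it. */
static struct ex_req
ex_valloc1(size_t size, size_t align, uintptr_t prefer, unsigned flags)
{
	struct ex_req r = { size, align, prefer, flags };

	assert(size != 0);
	return r;
}

/* Each former variant becomes a thin macro over ex_valloc1(). */
#define ex_valloc(sz) \
	ex_valloc1((sz), 0, EX_NO_PREFER, 0)
#define ex_valloc_wait(sz) \
	ex_valloc1((sz), 0, EX_NO_PREFER, EX_WAITOK)
#define ex_valloc_prefer(sz, off) \
	ex_valloc1((sz), 0, (off), 0)
#define ex_valloc_prefer_wait(sz, off) \
	ex_valloc1((sz), 0, (off), EX_WAITOK)
#define ex_valloc_align(sz, al) \
	ex_valloc1((sz), (al), EX_NO_PREFER, 0)
```

A drawback of the macro form is that the variants stop existing as linkable
symbols, which matters to separately compiled callers.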


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.
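
The "attempt the access and handle the failure" pattern argued for above might
look like the sketch below. It is a userland stand-in under stated assumptions:
ex_copyin() is a hypothetical substitute for copyin() that fails with EFAULT on
a NULL source, and ex_handle_request() and struct ex_args are invented for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copyin(): returns 0 or an errno value. */
static int
ex_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return EFAULT;
	memcpy(kaddr, uaddr, len);
	return 0;
}

struct ex_args {
	int a;
	int b;
};

/* No permission pre-check: try the copy and propagate any error. */
static int
ex_handle_request(const void *uargs, int *sum)
{
	struct ex_args args;
	int error;

	assert(sum != NULL);
	error = ex_copyin(uargs, &args, sizeof(args));
	if (error != 0)
		return error;	/* same path for every kind of failure */
	*sum = args.a + args.b;
	return 0;
}
```

There is a single error path, so a permission failure and an i/o failure while
paging are handled identically, and the common successful case pays for no
extra check.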


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
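
A minimal sketch of round-robin bucket selection, assuming each allocation simply takes the next color in turn. The struct and function names are hypothetical, and a real allocator would presumably also have to cope with empty buckets, which this sketch ignores.

```c
#include <assert.h>

struct color_hopper {
    unsigned int ncolors;   /* number of color buckets */
    unsigned int next;      /* bucket to use for the next allocation */
};

/* Return the color for this allocation and advance to the next bucket. */
unsigned int
color_hop(struct color_hopper *ch)
{
    unsigned int color;

    assert(ch->ncolors > 0);
    color = ch->next;
    ch->next = (color + 1) % ch->ncolors;
    return color;
}
```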


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
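
The table above can be read as a simple translation function. The sketch below is illustrative only: the `OLD_KERN_*` enumerators are hypothetical stand-ins (their numeric values are not the historical definitions), and KERN_FAILURE and the unused codes are deliberately left without a fixed errno, as in the table.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are illustrative. */
enum old_kern_code {
    OLD_KERN_SUCCESS,
    OLD_KERN_INVALID_ADDRESS,
    OLD_KERN_PROTECTION_FAILURE,
    OLD_KERN_NO_SPACE,
    OLD_KERN_INVALID_ARGUMENT,
    OLD_KERN_RESOURCE_SHORTAGE
};

/* Map an old code to the errno value listed in the table, or -1 if none. */
int
old_kern_to_errno(enum old_kern_code code)
{
    switch (code) {
    case OLD_KERN_SUCCESS:            return 0;
    case OLD_KERN_INVALID_ADDRESS:    return EFAULT;
    case OLD_KERN_PROTECTION_FAILURE: return EACCES;
    case OLD_KERN_NO_SPACE:           return ENOMEM;
    case OLD_KERN_INVALID_ARGUMENT:   return EINVAL;
    case OLD_KERN_RESOURCE_SHORTAGE:  return ENOMEM;
    }
    return -1;  /* no fixed mapping in the table */
}
```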


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
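
The 95% rule can be sketched as a validation check applied when one of the three minimums is changed. The function name and signature are hypothetical; only the rule itself (the sum of the minimums may not exceed 95) comes from the message above.

```c
#include <stdbool.h>

/*
 * Return true if the three minimum percentages leave at least 5% of
 * pageable memory free for reuse.
 */
bool
min_thresholds_ok(unsigned int anonmin, unsigned int vnodemin,
    unsigned int vtextmin)
{
    return anonmin + vnodemin + vtextmin <= 95;
}
```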


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
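
For illustration, a minimum power-of-two alignment is commonly honored by rounding a candidate address up with a mask. This is a generic sketch of that arithmetic with a hypothetical helper name; it is not the code uvm_map() uses.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```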


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
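
The sizing rule described above amounts to a divide followed by a clamp. The sketch below uses hypothetical names and takes the bounds as parameters; the order in which the real code applies the two bounds is an assumption.

```c
#include <stddef.h>

/*
 * One quarter of physical memory (in pages), clamped to the
 * [min_pages, max_pages] range supplied by machine-dependent code
 * or the kernel config.
 */
size_t
autosize_nkmempages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
    size_t n = physmem_pages / 4;

    if (n > max_pages)
        n = max_pages;
    if (n < min_pages)
        n = min_pages;
    return n;
}
```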


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
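
A sketch of what "machine-dependent code accounts for stack direction" might look like, given a stack described by its low address and size. The helper name and the `grows_down` parameter are hypothetical; real ports would also apply their own alignment and frame setup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pick the child's initial stack pointer from a (low address, size) pair. */
uintptr_t
initial_stack_pointer(uintptr_t stack_low, size_t stack_size, bool grows_down)
{
    /* A downward-growing stack starts at the high end of the area. */
    return grows_down ? stack_low + stack_size : stack_low;
}
```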


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
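
The three strategies can be sketched over a toy model in which each free list is just a count of available pages and index 0 has the highest priority. The enum, function name, and return convention (list index, or -1 on failure) are hypothetical and only illustrate the selection logic described above.

```c
#include <stddef.h>

enum alloc_strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Take one page according to the strategy. Returns the index of the
 * free list the page came from, or -1 if no page could be allocated.
 */
int
pick_freelist(size_t *nfree, int nlists, enum alloc_strat strat, int free_list)
{
    int i;

    if (strat != STRAT_NORMAL) {
        /* `only' and `fallback' both try the requested list first. */
        if (free_list >= 0 && free_list < nlists && nfree[free_list] > 0) {
            nfree[free_list]--;
            return free_list;
        }
        if (strat == STRAT_ONLY)
            return -1;
    }

    /* `normal' (and the fallback path): walk from high to low priority. */
    for (i = 0; i < nlists; i++) {
        if (nfree[i] > 0) {
            nfree[i]--;
            return i;
        }
    }
    return -1;
}
```

In this model the free list argument is never consulted for the normal strategy, matching the note that it is ignored in that case.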

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.206 20-May-2017 chs

MAP_FIXED means something different for mremap() than it does for mmap(),
so we cannot use UVM_FLAG_FIXED to specify both behaviors.
keep UVM_FLAG_FIXED with its earlier meaning (prior to my previous change)
of whether to use uvm_map_findspace() to locate space for the new mapping or
to use the hint address that the caller passed in, and add a new flag
UVM_FLAG_UNMAP to indicate that any existing entries in the range should be
unmapped as part of creating the new mapping. the new UVM_FLAG_UNMAP flag
may only be used if UVM_FLAG_FIXED is also specified.
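
The constraint between the two flags can be expressed as a small validity check. The bit values and names below are hypothetical placeholders, not the real UVM_FLAG_* definitions; only the rule (UNMAP requires FIXED) comes from the message above.

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration only. */
#define X_FLAG_FIXED    0x1u
#define X_FLAG_UNMAP    0x2u

/* UNMAP is only meaningful when the caller also demands a fixed address. */
bool
map_flags_valid(unsigned int flags)
{
    if ((flags & X_FLAG_UNMAP) != 0 && (flags & X_FLAG_FIXED) == 0)
        return false;
    return true;
}
```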


Revision tags: prg-localcount2-base3
# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses that references the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This conforms to its documentation in uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
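
To illustrate what a color-matched starting address means, the sketch below finds the lowest page-aligned address at or above a hint whose page color equals a requested color. It assumes a 4 KiB page and a color mask of the form (number of colors - 1) with a power-of-two number of colors; the names are hypothetical and this is not the uvm_map() implementation.

```c
#include <assert.h>
#include <stdint.h>

#define X_PAGE_SHIFT    12  /* assumed 4 KiB pages */

/* Lowest page-aligned va >= hint with ((va >> X_PAGE_SHIFT) & colormask) == color. */
uintptr_t
color_match(uintptr_t hint, uintptr_t colormask, uintptr_t color)
{
    const uintptr_t pagesz = (uintptr_t)1 << X_PAGE_SHIFT;
    uintptr_t va, cur;

    assert(color <= colormask);
    va = (hint + pagesz - 1) & ~(pagesz - 1);   /* round up to a page */
    cur = (va >> X_PAGE_SHIFT) & colormask;     /* color of that page */
    /* Advance by the distance to the wanted color, modulo the color count. */
    va += ((color - cur) & colormask) << X_PAGE_SHIFT;
    return va;
}
```

If the requested color is that of the first user page being mapped, the kernel address returned here shares its cache color, which is the congruence the message refers to.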


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwp used direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only about
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
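
From user space, the new limit can be inspected with the standard
getrlimit(2) interface. The following is a minimal sketch; the helper names
as_soft_limit and as_is_unlimited are hypothetical, and it only reads the
limit rather than changing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Fetch the soft RLIMIT_AS value; returns 0 on success, -1 on failure. */
static inline int
as_soft_limit(rlim_t *out)
{
	struct rlimit rl;

	assert(out != NULL);
	if (getrlimit(RLIMIT_AS, &rl) != 0)
		return -1;
	*out = rl.rlim_cur;
	return 0;
}

/* True when the given limit value means "no limit". */
static inline bool
as_is_unlimited(rlim_t lim)
{
	return lim == RLIM_INFINITY;
}
```

With the default described above, as_is_unlimited() would be expected to
report true for the value returned by as_soft_limit(), unless the shell or an
administrator has lowered the limit.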


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
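
As a rough illustration of what these bit counts mean, n bits of
randomization give 2^n possible placements. The sketch below only does that
arithmetic; it assumes the 24-bit figure applies when __LP64__ is defined and
the 16-bit figure otherwise, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct placements for a given number of random bits. */
static inline uint64_t
aslr_positions(unsigned bits)
{
	assert(bits < 64);
	return (uint64_t)1 << bits;
}

/* mmap randomization width; assumes LP64 gets the larger value. */
static inline unsigned
aslr_mmap_bits(int is_lp64)
{
	return is_lp64 ? 24u : 16u;
}
```

Under these assumptions the stack, program and data segment each have 4096
possible placements, while mmap has 65536 or 16777216.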


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of duplicated forward decl. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.
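
The pattern described here, one worker function that takes every parameter
plus thin macros that fill in defaults, can be sketched in user space as
follows. All names are hypothetical (the sk_ prefix marks them as such), the
argument list is not the real uvm_km_valloc1() signature, and which variants
may sleep is an assumption based only on the "_wait" suffixes.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NOWAIT	0x1		/* hypothetical "must not sleep" flag */
#define SK_NOPREFER	((long)-1)	/* hypothetical "no preferred offset" */

/* The sketch only records the request instead of allocating anything. */
struct sk_request {
	size_t	size;
	size_t	align;
	long	prefer;
	int	flags;
};

/* Single worker that every variant funnels into. */
static inline struct sk_request
sk_valloc1(size_t size, size_t align, long prefer, int flags)
{
	struct sk_request r = { size, align, prefer, flags };

	assert(size > 0);
	return r;
}

/* Variants expressed purely by macro expansion. */
#define sk_valloc(sz)			sk_valloc1((sz), 0, SK_NOPREFER, SK_NOWAIT)
#define sk_valloc_wait(sz)		sk_valloc1((sz), 0, SK_NOPREFER, 0)
#define sk_valloc_prefer(sz, off)	sk_valloc1((sz), 0, (off), SK_NOWAIT)
#define sk_valloc_prefer_wait(sz, off)	sk_valloc1((sz), 0, (off), 0)
#define sk_valloc_align(sz, al)		sk_valloc1((sz), (al), SK_NOPREFER, SK_NOWAIT)
```

The cost of this approach is that the variants stop being real symbols,
which is the binary-compatibility concern revision 1.87 addresses by turning
them back into inlined functions.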


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.
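
One way a symptom like this can arise is when a flag from one flag space
shares a bit position with a protection bit in a combined flags word. The
sketch below illustrates that general failure mode and the remedy of giving
the flag a dedicated bit; the constants are hypothetical and are not claimed
to be the values UVM used before or after this change.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_PROT_R		0x01u		/* hypothetical read-protection bit */
#define SK_PROT_MASK		0x07u		/* hypothetical protection field */
#define SK_KMF_NOWAIT_OLD	0x01u		/* hypothetical: collides with SK_PROT_R */
#define SK_FLAG_NOWAIT		0x1000000u	/* hypothetical dedicated bit */
#define SK_KMF_NOWAIT		SK_FLAG_NOWAIT	/* defined in terms of the map flag */

/* Build a combined flags word from a protection and a nowait request. */
static inline unsigned
sk_mapflags(unsigned prot, bool nowait)
{
	assert((prot & ~SK_PROT_MASK) == 0);
	return prot | (nowait ? SK_KMF_NOWAIT : 0u);
}

/* Buggy test: any readable mapping looks like a nowait request. */
static inline bool
sk_is_nowait_old(unsigned flags)
{
	return (flags & SK_KMF_NOWAIT_OLD) != 0;
}

/* Fixed test: only the dedicated bit is consulted. */
static inline bool
sk_is_nowait(unsigned flags)
{
	return (flags & SK_FLAG_NOWAIT) != 0;
}
```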


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.
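
The idea of telling the caller whether an allocation came from the cache can
be sketched in user space like this. It is not the kernel implementation:
malloc() stands in for fresh, unbacked memory, the sk_ names and the size
constant are hypothetical, and the real function's signature may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SK_UAREA_SIZE 8192	/* hypothetical size */

/* Freed areas are chained through their first bytes. */
struct sk_uarea {
	struct sk_uarea *next;
};

static struct sk_uarea *sk_uarea_cache;

/*
 * Hand out an area. Returns true when it was taken from the cache, in
 * which case the caller can skip the work of backing it with pages.
 */
static bool
sk_uarea_alloc(void **out)
{
	assert(out != NULL);
	if (sk_uarea_cache != NULL) {
		struct sk_uarea *u = sk_uarea_cache;

		sk_uarea_cache = u->next;
		*out = u;
		return true;
	}
	*out = malloc(SK_UAREA_SIZE);
	return false;
}

/* Return an area to the cache instead of releasing its memory. */
static void
sk_uarea_free(void *p)
{
	struct sk_uarea *u = p;

	u->next = sk_uarea_cache;
	sk_uarea_cache = u;
}
```

A caller would branch on the return value and perform the expensive setup
only on a cache miss.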


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to e.g. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
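
The round-robin bucket selection described above can be sketched as follows. This is a minimal illustration only: `struct color_state` and `next_color()` are hypothetical names, not identifiers from the UVM sources, and the real allocator keeps considerably more state.

```c
#include <assert.h>

/* Hypothetical state: how many color buckets exist and which one is next. */
struct color_state {
	unsigned ncolors;	/* number of buckets (assumed > 0) */
	unsigned next;		/* bucket to hand out on the next allocation */
};

/*
 * Return the bucket for this allocation and advance to the following
 * bucket, wrapping around at ncolors ("bin hopping").
 */
static unsigned
next_color(struct color_state *cs)
{
	unsigned c;

	assert(cs->ncolors > 0);
	c = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;
	return c;
}
```

Successive allocations therefore cycle through every bucket before reusing one, which is the property the commit relies on.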


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
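
Read as code, the mapping above is a small translation table. The sketch below is illustrative only: `kern_to_errno()` is a hypothetical helper, the enum values are assumptions rather than the constants from the old VM headers, and KERN_FAILURE is represented by -1 because the commit says it has no single replacement.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative numbering only; not taken from the original headers. */
enum kern_code {
	KERN_SUCCESS = 0,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_FAILURE,
	KERN_RESOURCE_SHORTAGE,
	KERN_CODE_COUNT
};

/* E* replacement for each code; -1 marks "various, mostly KASSERTs". */
static const int kern_errno_table[KERN_CODE_COUNT] = {
	[KERN_SUCCESS]			= 0,
	[KERN_INVALID_ADDRESS]		= EFAULT,
	[KERN_PROTECTION_FAILURE]	= EACCES,
	[KERN_NO_SPACE]			= ENOMEM,
	[KERN_INVALID_ARGUMENT]		= EINVAL,
	[KERN_FAILURE]			= -1,
	[KERN_RESOURCE_SHORTAGE]	= ENOMEM,
};

/* Hypothetical helper: look up the E* code for an old KERN_* code. */
static int
kern_to_errno(enum kern_code code)
{
	assert((unsigned)code < KERN_CODE_COUNT);
	return kern_errno_table[code];
}
```

The three codes marked <unused> are left out of the sketch since nothing needed to translate them.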


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
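
The 95% cap on the sum of the minimums can be illustrated with a small validation routine. This is a sketch under assumptions: `struct ubc_minimums`, `enum ubc_tunable` and `ubc_set_minimum()` are hypothetical names, and returning EINVAL for a rejected value is an assumption about how such a check might report failure, not a statement about the actual sysctl handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical container for the three percentages. */
struct ubc_minimums {
	unsigned anonmin;
	unsigned vnodemin;
	unsigned vtextmin;
};

enum ubc_tunable { UBC_ANONMIN, UBC_VNODEMIN, UBC_VTEXTMIN };

/*
 * Try to set one minimum to pct percent. The update is applied only if
 * the three minimums together still leave at least 5% of pageable
 * memory unreserved; otherwise the old values are kept.
 */
static int
ubc_set_minimum(struct ubc_minimums *m, enum ubc_tunable which, unsigned pct)
{
	struct ubc_minimums t;

	assert(m != NULL);
	t = *m;
	switch (which) {
	case UBC_ANONMIN:
		t.anonmin = pct;
		break;
	case UBC_VNODEMIN:
		t.vnodemin = pct;
		break;
	case UBC_VTEXTMIN:
		t.vtextmin = pct;
		break;
	default:
		return EINVAL;
	}
	if (t.anonmin + t.vnodemin + t.vtextmin > 95)
		return EINVAL;	/* assumed error code for a rejected value */
	*m = t;
	return 0;
}
```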


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab a process's USPACE
aligned on a USPACE boundary in kernel virtual address space. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
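
The sizing rule described above (a quarter of physical memory, clamped to machine-dependent bounds) can be sketched as follows. `autosize_nkmempages()` and its parameter names are hypothetical; the real code takes its bounds from NKMEMPAGES_MIN and NKMEMPAGES_MAX, and its exact arithmetic is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: start from physpages / 4 and clamp the result
 * to the inclusive range [min_pages, max_pages].
 */
static size_t
autosize_nkmempages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t npages;

	assert(min_pages <= max_pages);
	npages = physpages / 4;
	if (npages < min_pages)
		npages = min_pages;
	if (npages > max_pages)
		npages = max_pages;
	return npages;
}
```

With bounds of 1024 and 16384 pages, a machine with 32768 physical pages would get 8192 pages under this sketch, while very small or very large machines would be pinned to the lower or upper bound.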


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).
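
The three strategies can be sketched on a toy model in which each free list is just a page count. Everything here is hypothetical (`enum pga_strat`, `take_normal()`, `pick_freelist()`), and the sketch assumes index 0 is the highest-priority list; it only illustrates the control flow described above, not the real uvm_pagealloc_strat().

```c
#include <assert.h>
#include <stddef.h>

enum pga_strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Walk the lists from high to low priority and take a page from the
 * first one that has any. Returns the list index used, or -1.
 */
static int
take_normal(size_t *free_pages, size_t nlists)
{
	size_t i;

	for (i = 0; i < nlists; i++) {
		if (free_pages[i] > 0) {
			free_pages[i]--;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Pick a free list according to the strategy. `want' is the requested
 * list and is ignored for STRAT_NORMAL.
 */
static int
pick_freelist(size_t *free_pages, size_t nlists, enum pga_strat strat,
    size_t want)
{
	if (strat != STRAT_NORMAL) {
		assert(want < nlists);
		if (free_pages[want] > 0) {
			free_pages[want]--;
			return (int)want;
		}
		if (strat == STRAT_ONLY)
			return -1;	/* `only': no fallback allowed */
	}
	return take_normal(free_pages, nlists);
}
```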

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.205 17-May-2017 christos

snprintb(3) for UVM_FLAGS.


Revision tags: prg-localcount2-base2
# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

branches: 1.203.6;
don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable,
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
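
The idea of a color-matched starting address can be sketched as follows. The names and the page size are assumptions made for illustration (`SK_PAGE_SHIFT`, `va_color()`, `next_va_with_color()` are hypothetical), and the sketch assumes the number of colors is a power of two so that `colormask` is ncolors - 1.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT	12	/* assumed 4KB pages, for illustration only */

/* Color of a virtual address: its page number masked by colormask. */
static unsigned
va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> SK_PAGE_SHIFT) & colormask;
}

/*
 * Return the first page-aligned address at or above va whose color
 * equals the requested color (0..colormask).
 */
static uintptr_t
next_va_with_color(uintptr_t va, unsigned color, unsigned colormask)
{
	uintptr_t page = (uintptr_t)1 << SK_PAGE_SHIFT;
	unsigned delta;

	assert(color <= colormask);
	assert((va & (page - 1)) == 0);
	/* Pages to skip, computed modulo the number of colors. */
	delta = (color - va_color(va, colormask)) & colormask;
	return va + delta * page;
}
```

If the requested color is taken from the first user page being mapped, every page of the resulting kernel mapping shares its color with the corresponding user page, which is the congruence the commit message describes.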


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), this allows all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

OK'd by ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that is best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
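
As an illustration of how a userland program might notice the RLIMIT_AS resource added in 1.153, here is a minimal sketch using the standard getrlimit(2) call. The helper name `sketch_get_as_limit()` is hypothetical; it is not taken from the sh, csh, ps, top or systat changes mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Fetch the soft address-space limit of the calling process.
 * Returns 0 on success and stores the limit in *soft, or -1 if
 * getrlimit() fails. A value of RLIM_INFINITY means "unlimited".
 */
static int
sketch_get_as_limit(rlim_t *soft)
{
	struct rlimit rl;

	assert(soft != NULL);
	if (getrlimit(RLIMIT_AS, &rl) == -1)
		return -1;
	*soft = rl.rlim_cur;
	return 0;
}
```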


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
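
The allocation preference described in 1.146 (local CPU first, then the global list) might be sketched as below. This models only the choice of list, with page counts standing in for real free lists; all type and function names are hypothetical, and the bookkeeping needed to keep the local and global lists consistent is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a free list: only a page count. */
struct sketch_freelist {
	unsigned npages;
};

enum sketch_source {
	SKETCH_FROM_LOCAL,	/* take a page from this CPU's list */
	SKETCH_FROM_GLOBAL,	/* fall back to the global list */
	SKETCH_EMPTY		/* nothing available on either list */
};

/* Decide which list an allocation should be served from. */
static enum sketch_source
sketch_pick_list(const struct sketch_freelist *local,
    const struct sketch_freelist *global)
{
	assert(local != NULL && global != NULL);
	if (local->npages > 0)
		return SKETCH_FROM_LOCAL;
	if (global->npages > 0)
		return SKETCH_FROM_GLOBAL;
	return SKETCH_EMPTY;
}
```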


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired by solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
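
The trailing-zero trimming mentioned in 1.102 can be pictured with a small sketch. It only shows how a trimmed length could be computed for a buffer; `sketch_trimmed_len()` is a hypothetical helper and not the actual coredump code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of buf once trailing zero bytes are dropped.
 * A buffer consisting only of zeroes trims to length 0.
 */
static size_t
sketch_trimmed_len(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```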


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.
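
The no-op rule for re-coloring in 1.62 can be sketched as follows. The structure and function names are hypothetical; the sketch records only the color count and leaves out the redistribution of free pages that a real re-coloring would have to perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: just the current number of color bins. */
struct sketch_colors {
	unsigned ncolors;
};

/*
 * Request a new number of colors. Returns true if the bins were
 * resized, false if the request was a no-op because the new count
 * is smaller than or equal to the current one.
 */
static bool
sketch_recolor(struct sketch_colors *sc, unsigned newncolors)
{
	assert(sc != NULL);
	if (newncolors <= sc->ncolors)
		return false;
	/* A real implementation would also move free pages to new bins. */
	sc->ncolors = newncolors;
	return true;
}
```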


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
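
A minimal userland sketch of round-robin bucket selection as described
above. It is not the kernel implementation; NCOLORS, struct color_bins and
bin_hop_alloc() are hypothetical names chosen for illustration, and the
per-color counters stand in for real free page queues.

```c
#define NCOLORS 4	/* hypothetical fixed bucket count, as MD code would define */

struct color_bins {
	unsigned free_pages[NCOLORS];	/* free pages available per color */
	unsigned next_color;		/* bucket at which the next search starts */
};

/*
 * Take one page, starting at next_color and hopping to the following
 * bucket whenever the current one is empty.
 * Returns the color used, or -1 if every bucket is empty.
 */
static int
bin_hop_alloc(struct color_bins *b)
{
	for (unsigned i = 0; i < NCOLORS; i++) {
		unsigned c = (b->next_color + i) % NCOLORS;

		if (b->free_pages[c] > 0) {
			b->free_pages[c]--;
			b->next_color = (c + 1) % NCOLORS;
			return (int)c;
		}
	}
	return -1;
}
```

Advancing next_color after every successful allocation is what spreads
consecutive allocations across the buckets.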


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
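
The table above can be read as a simple translation function. The sketch
below is illustrative only: the OLD_KERN_* enumerators are hypothetical
stand-ins for the retired codes (their numeric values are not the historic
ones), and old_kern_to_errno() is not a function from the tree.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are illustrative. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old code to the errno listed in the table.
 * Returns -1 where the table names no single errno.
 */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	case OLD_KERN_FAILURE:
		break;	/* "various" in the table: no single errno */
	}
	return -1;
}
```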


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
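
A rough sketch of the two checks described above, under stated
assumptions: the helper names and the exact comparison are hypothetical,
and only the 95% cap and the "percentage of pageable memory" meaning come
from the commit message.

```c
#include <stdbool.h>

#define MIN_SUM_LIMIT 95	/* cap on the sum of the three minimums, in percent */

/* Would this combination of vm.anonmin, vm.vnodemin and vm.vtextmin be accepted? */
static bool
minimums_acceptable(unsigned anonmin, unsigned vnodemin, unsigned vtextmin)
{
	return anonmin + vnodemin + vtextmin <= MIN_SUM_LIMIT;
}

/*
 * May one page of a given type be reused without dropping that type
 * below its reserved share of pageable memory?
 */
static bool
may_reuse_page(unsigned long type_pages, unsigned long pageable_pages,
    unsigned min_percent)
{
	unsigned long reserve = pageable_pages * min_percent / 100;

	/* After reuse the count is type_pages - 1; it must stay >= reserve. */
	return type_pages > reserve;
}
```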


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses a ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
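
For illustration, the arithmetic behind a minimum power-of-two alignment
can be sketched in userland C as follows. The helper names are
hypothetical, and treating an alignment of 0 as "no constraint" is an
assumption of this sketch rather than something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a non-zero power of two. */
static bool
is_pow2(uintptr_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align.
 * align must be a power of two; 0 is treated here as "no constraint".
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;
	return (addr + align - 1) & ~(align - 1);
}
```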


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.
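
The idle-loop behaviour described above can be modelled roughly as
follows. This is a userland sketch, not uvm_pageidlezero() itself: the two
counters stand in for the known-zero and unknown-contents queues, and the
struct and function names are hypothetical.

```c
/* Counters standing in for the two queues of one free list. */
struct free_queues {
	unsigned zeroed;	/* pages known to be zero */
	unsigned unknown;	/* pages with unknown contents */
};

/*
 * Move up to `target' pages from the unknown queue to the zeroed queue,
 * stopping early as soon as *whichqs becomes non-zero (a process is
 * ready to run). Returns the number of pages zeroed.
 */
static unsigned
idle_zero(struct free_queues *q, unsigned target,
    const volatile unsigned *whichqs)
{
	unsigned done = 0;

	while (done < target && q->unknown > 0 && *whichqs == 0) {
		/* A real implementation would zero the page contents here. */
		q->unknown--;
		q->zeroed++;
		done++;
	}
	return done;
}
```

Checking the run-queue indicator before every page keeps the latency for
a newly runnable process bounded by the time needed to zero one page.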


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
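
The sizing rule described above amounts to a divide followed by a clamp.
The sketch below assumes a hypothetical helper name and parameter order;
the order in which the two bounds are applied is also an assumption.

```c
#include <stddef.h>

/*
 * Compute a page count for the kmem map: one quarter of physical memory,
 * clamped to [min_pages, max_pages]. The bounds play the role of
 * NKMEMPAGES_MIN and NKMEMPAGES_MAX.
 */
static size_t
autosize_nkmempages(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n = physpages / 4;

	if (n > max_pages)
		n = max_pages;
	if (n < min_pages)
		n = min_pages;
	return n;
}
```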


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
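
The three allocation strategies described in this commit can be sketched
as below. This is a simplified userland model: the per-list counters
replace real page queues, NFREELIST stands in for VM_NFREELIST, the enum
and function names are hypothetical, and treating index 0 as the highest
priority list is an assumption.

```c
#define NFREELIST 3	/* stand-in for VM_NFREELIST */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Take one page according to the strategy.
 * `which' is consulted only for STRAT_ONLY and STRAT_FALLBACK.
 * Returns the index of the free list used, or -1 on failure.
 */
static int
pick_freelist(unsigned free_count[NFREELIST], enum strat s, int which)
{
	if (s == STRAT_ONLY || s == STRAT_FALLBACK) {
		if (which >= 0 && which < NFREELIST && free_count[which] > 0) {
			free_count[which]--;
			return which;
		}
		if (s == STRAT_ONLY)
			return -1;	/* `only' never looks elsewhere */
	}

	/* `normal' walk, also reached when `fallback' found nothing. */
	for (int i = 0; i < NFREELIST; i++) {
		if (free_count[i] > 0) {
			free_count[i]--;
			return i;
		}
	}
	return -1;
}
```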


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.204 06-May-2017 joerg

Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses referencing the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.
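
The "fail instead of silently dropping X" rule can be illustrated with a
small predicate over the standard PROT_* bits. The function names are
hypothetical, and the -1 return is a placeholder: the actual error code
returned by the kernel is not stated in this commit message.

```c
#include <stdbool.h>
#include <sys/mman.h>

/* True if a single protection value asks for both write and execute. */
static bool
wx_conflict(int prot)
{
	return (prot & PROT_WRITE) != 0 && (prot & PROT_EXEC) != 0;
}

/*
 * Model of the stricter check: a W&X request is rejected outright
 * rather than being granted with X removed.
 * Returns 0 if the request is acceptable, -1 otherwise.
 */
static int
check_prot_request(int prot)
{
	return wx_conflict(prot) ? -1 : 0;
}
```

Under this model two separate mappings of the same pages, one writable
and one executable, each pass the check individually, which is the case
the duplicated-mapping support is meant to allow.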


Revision tags: prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adopt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable,
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contain the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
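
As a rough illustration of what a congruent mapping means here, the page
color of an address can be computed from its page number and the color
mask, and a start address of the requested color can be found by stepping
forward at most colormask pages. PAGE_SHIFT of 12 and both helper names
are assumptions of this sketch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Color of the page containing va, in the range 0..colormask. */
static unsigned
va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> PAGE_SHIFT) & colormask;
}

/*
 * Lowest page-aligned address at or above hint whose color equals
 * `color'. colormask must be (number of colors - 1), with the number of
 * colors a power of two.
 */
static uintptr_t
colormatch_va(uintptr_t hint, unsigned color, unsigned colormask)
{
	uintptr_t page =
	    (hint + ((uintptr_t)1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
	unsigned cur = (unsigned)page & colormask;

	/* Unsigned wraparound yields the forward distance modulo the color count. */
	page += (color - cur) & colormask;
	return page << PAGE_SHIFT;
}
```

Passing the color of the first user page as `color' gives a kernel
address that indexes the same cache lines as the user mapping.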


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit. \
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward declarations of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that the caller can
easily check whether all requested pages were found.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
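
The round-robin bucket selection described above can be illustrated with a
small sketch. This is not the UVM code: the struct and function names are
hypothetical, and it only shows the "take the next bucket, then advance"
idea under the assumption of a fixed bucket count.

```c
#include <assert.h>

/*
 * Hypothetical allocator state (names are illustrative, not from UVM):
 * the number of color buckets and the bucket to try next.
 */
struct color_state {
	unsigned ncolors;
	unsigned next;
};

/* Return the bucket for this allocation, then advance round-robin. */
unsigned
pick_color(struct color_state *cs)
{
	unsigned color;

	assert(cs->ncolors > 0);
	color = cs->next;
	cs->next = (cs->next + 1) % cs->ncolors;
	return color;
}
```

With three buckets, successive calls would hand out buckets 0, 1, 2 and
then wrap back to 0.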


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
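
As a rough illustration of the table above, the translation could be
written as a lookup like the following. The OLD_KERN_* enum and the
function name are hypothetical stand-ins invented for this sketch, and the
enum's numeric values are not taken from the old headers; only the
right-hand errno values come from the mapping listed in this commit.

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for the retired codes.  The numeric values
 * are illustrative only.
 */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old code to the errno value given in the table.
 * KERN_FAILURE is left out because it had no single replacement
 * (most uses became KASSERTs); unknown inputs return -1 here,
 * which is a choice made for this sketch.
 */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return -1;
}
```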


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
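
The 95% rule can be sketched as a simple validity check. The function name
and the MINSUM_LIMIT macro are hypothetical and the real sysctl handler may
be structured quite differently; the sketch only assumes the three values
are percentages, as the message says.

```c
#include <stdbool.h>

/* Upper limit on the sum of the three minimums, per the commit message. */
#define MINSUM_LIMIT	95

/*
 * Return true if the proposed vm.anonmin, vm.vnodemin and vm.vtextmin
 * percentages leave some pageable memory unreserved.
 */
bool
minimums_ok(unsigned anonmin, unsigned vnodemin, unsigned vtextmin)
{
	return anonmin + vnodemin + vtextmin <= MINSUM_LIMIT;
}
```

Under this check a setting of 50/40/5 would be accepted, while 50/40/10
would be rejected because nothing would be guaranteed reusable.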


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
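
For readers unfamiliar with power-of-two alignment, the arithmetic such an
argument implies looks roughly like this. The helper name is hypothetical
and this is not how uvm_map() itself is written; treating an alignment of 0
as "no alignment requested" is also an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.  align must be a power
 * of two; 0 is taken to mean "no alignment requested".
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;
	assert((align & (align - 1)) == 0);	/* power of two only */
	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged, so the helper can
be applied unconditionally to a candidate address.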


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
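
The sizing rule described above (one quarter of physical memory, then clamped
to bounds) can be sketched as follows. This is an illustration only: the
function name and parameters are hypothetical, and the real code may differ in
units and in how the bounds are obtained.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch: compute a page count as one quarter of physical memory,
 * clamped to the range [min_pages, max_pages]. All names here are
 * illustrative and not taken from the kernel sources.
 */
static size_t
sk_nkmempages(size_t physmem_pages, size_t min_pages, size_t max_pages)
{
	size_t npages;

	assert(min_pages <= max_pages);
	npages = physmem_pages / 4;
	if (npages > max_pages)
		npages = max_pages;	/* apply the upper bound */
	if (npages < min_pages)
		npages = min_pages;	/* apply the lower bound */
	return npages;
}
```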


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
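
The three allocation strategies described above can be modelled with a small
sketch. It assumes free lists are represented only by page counts and that
index 0 is the highest-priority list; the sk_ names are hypothetical and this
is not the uvm_pagealloc_strat() implementation.

```c
#include <assert.h>

#define SK_NFREELIST 3	/* hypothetical number of free lists */

enum sk_strat { SK_STRAT_NORMAL, SK_STRAT_ONLY, SK_STRAT_FALLBACK };

/*
 * free_count[i] holds the number of pages on free list i; list 0 is
 * assumed to have the highest priority. Take one page according to
 * the strategy and return the index of the list it came from, or -1
 * if no page could be allocated.
 */
static int
sk_pagealloc_strat(int free_count[SK_NFREELIST], enum sk_strat strat,
    int free_list)
{
	int i;

	if (strat == SK_STRAT_ONLY || strat == SK_STRAT_FALLBACK) {
		assert(free_list >= 0 && free_list < SK_NFREELIST);
		if (free_count[free_list] > 0) {
			free_count[free_list]--;
			return free_list;
		}
		if (strat == SK_STRAT_ONLY)
			return -1;	/* only: fail, do not look elsewhere */
		/* fallback: continue with the normal walk below */
	}

	/* normal: walk from high to low priority, take the first page found */
	for (i = 0; i < SK_NFREELIST; i++) {
		if (free_count[i] > 0) {
			free_count[i]--;
			return i;
		}
	}
	return -1;
}
```

As in the description, the free_list argument is ignored in the normal case.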


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


Revision tags: pgoyette-localcount-20170107
# 1.203 04-Jan-2017 christos

don't include uvm_physseg.h for kmem grovellers.


# 1.202 02-Jan-2017 cherry

Remove a redundant #ifdef _KERNEL/#endif pair.

ok mrg@


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
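
A minimal sketch of the color-matching idea: the color of an address is taken
here to be its page number masked with colormask, and a matching address is
the next page-aligned address with the requested color. The 4 KiB page size
and all sk_ names are assumptions made for this example, not the uvm_map()
implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12u	/* assumes 4 KiB pages */

/* Color of a virtual address: its page number masked by colormask. */
static unsigned
sk_va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> SK_PAGE_SHIFT) & colormask;
}

/*
 * Return the smallest page-aligned address >= va whose color equals
 * the requested color (0..colormask). colormask + 1 must be a power
 * of two.
 */
static uintptr_t
sk_color_match(uintptr_t va, unsigned color, unsigned colormask)
{
	uintptr_t page_size = (uintptr_t)1 << SK_PAGE_SHIFT;
	unsigned cur, delta;

	assert(((colormask + 1) & colormask) == 0);
	assert(color <= colormask);

	va = (va + page_size - 1) & ~(page_size - 1);	/* round up to a page */
	cur = sk_va_color(va, colormask);
	delta = (color - cur) & colormask;	/* pages to skip, modulo ncolors */
	return va + ((uintptr_t)delta << SK_PAGE_SHIFT);
}
```

If the color passed in is that of the starting user page, the address returned
has the same color as the user mapping, which is the congruence the commit
message describes.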


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h. Because most users care only
virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
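
For reference, a userland program can read the address-space limit through the
standard getrlimit(2) interface. The sketch below uses a hypothetical wrapper
name, sk_get_as_limit(); with unlimited defaults, as described above, the
value returned would be RLIM_INFINITY.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Fetch the soft RLIMIT_AS value for the calling process.
 * Returns 0 on success and stores the limit in *soft,
 * or returns -1 if getrlimit() fails.
 */
static int
sk_get_as_limit(rlim_t *soft)
{
	struct rlimit rl;

	assert(soft != NULL);
	if (getrlimit(RLIMIT_AS, &rl) == -1)
		return -1;
	*soft = rl.rlim_cur;
	return 0;
}
```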


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
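
The trailing-zero trimming described above can be sketched in a few lines. This is only an illustration of the idea, not the kernel's code: the function name is hypothetical, and the real implementation presumably works on mapped memory rather than on a byte string.

```python
def trimmed_length(section: bytes) -> int:
    """Number of bytes worth writing once trailing zero bytes are dropped.

    Only zeroes at the end are trimmed; zeroes in the middle of the
    section still have to be written so that offsets stay correct.
    """
    return len(section.rstrip(b"\x00"))
```

A section that is entirely zero trims to length 0, which corresponds to the case where nothing needs to be written for it at all.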


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.
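
The recolor rule stated above (growing the number of colors takes effect, shrinking or repeating it does nothing) can be written down as a small model. This is a sketch under assumptions, not uvm_page_recolor() itself: the names are hypothetical, and the real function also has to redistribute the free pages into the new buckets, which is not modelled here.

```python
DEFAULT_NCOLORS = 1  # per the log: one bucket if MD code sets nothing


def page_recolor(current_ncolors: int, new_ncolors: int) -> int:
    """Return the number of color bins in effect after a recolor request."""
    if new_ncolors <= current_ncolors:
        # Smaller or equal: the log describes this case as a no-op.
        return current_ncolors
    return new_ncolors
```

For example, a port that first discovers a cache needing 4 colors and later one needing 2 would stay at 4.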


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
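
Round-robin bucket selection ("Bin Hopping") can be illustrated as follows. This is a minimal sketch of the general technique, assuming a fixed number of buckets; the class and method names are hypothetical, and the actual allocator also has to cope with empty buckets, which this model ignores.

```python
class BinHopper:
    """Hand out page colors in round-robin order."""

    def __init__(self, ncolors: int):
        if ncolors < 1:
            raise ValueError("need at least one color bucket")
        self.ncolors = ncolors
        self.next_color = 0

    def pick(self) -> int:
        """Return the color for this allocation and advance to the next bin."""
        color = self.next_color
        self.next_color = (color + 1) % self.ncolors
        return color
```

With three buckets, successive allocations receive colors 0, 1, 2, 0, 1, and so on, which spreads consecutive pages across different cache bins.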


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
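
The mapping above can be expressed as a lookup table. This is an illustrative sketch only: the dictionary and helper names are hypothetical, KERN_FAILURE is left out because the log gives no single replacement for it, and the three unused codes map to None.

```python
import errno

# Hypothetical table mirroring the mapping listed in this commit message.
# KERN_FAILURE is omitted: the log says it became "various, mostly KASSERTs".
KERN_TO_ERRNO = {
    "KERN_SUCCESS": 0,
    "KERN_INVALID_ADDRESS": errno.EFAULT,
    "KERN_PROTECTION_FAILURE": errno.EACCES,
    "KERN_NO_SPACE": errno.ENOMEM,
    "KERN_INVALID_ARGUMENT": errno.EINVAL,
    "KERN_RESOURCE_SHORTAGE": errno.ENOMEM,
    # Listed as <unused> in the log, so there is no replacement code.
    "KERN_NOT_RECEIVER": None,
    "KERN_NO_ACCESS": None,
    "KERN_PAGES_LOCKED": None,
}


def translate(kern_code: str):
    """Return the E* value for an old KERN_* name (None if it was unused)."""
    return KERN_TO_ERRNO[kern_code]
```

Note that two distinct Mach-style codes, KERN_NO_SPACE and KERN_RESOURCE_SHORTAGE, collapse into the same ENOMEM, so the translation is not reversible.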


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
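
The two rules in this message (a per-type floor that the pagedaemon respects, and a 95% cap on the sum of the floors) can be modelled roughly as below. This is a sketch under assumptions: the function names are hypothetical, raising an exception stands in for whatever error the sysctl handler actually returns, and the integer arithmetic is only one plausible way to compare a page count against a percentage.

```python
MAX_MIN_SUM = 95  # per the log: the minimums may not add up to more than 95%


def set_minimum(mins: dict, name: str, new_pct: int) -> dict:
    """Return an updated copy of the minimums, rejecting sums above 95%."""
    proposed = dict(mins)
    proposed[name] = new_pct
    if sum(proposed.values()) > MAX_MIN_SUM:
        raise ValueError("sum of minimums would exceed 95%")
    return proposed


def may_reclaim(type_pages: int, pageable_pages: int, min_pct: int) -> bool:
    """True if taking one page of this type keeps it at or above its floor."""
    return (type_pages - 1) * 100 >= pageable_pages * min_pct
```

For instance, with 100 pageable pages and a 10% floor, a type holding 11 pages may give one up, while a type holding exactly 10 may not.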


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
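
What a minimum power-of-two alignment means for a candidate address can be shown with the usual round-up idiom. This is a generic sketch, not the code in uvm_map(): the function name is hypothetical, and treating an alignment of 0 as "no constraint" is an assumption made for the example.

```python
def round_up_to_alignment(addr: int, align: int) -> int:
    """Smallest address >= addr that is a multiple of align.

    align must be a power of two; 0 is taken to mean "no alignment
    requested" (an assumption of this sketch).
    """
    if align == 0:
        return addr
    if align & (align - 1):
        raise ValueError("alignment must be a power of two")
    return (addr + align - 1) & ~(align - 1)
```

An address that is already aligned is returned unchanged, so the rounding never wastes space when the hint happens to fit.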


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.
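
The stopping conditions of the idle-loop zeroing described above can be sketched as a simple loop. This is a model only, with hypothetical names: `process_ready` stands in for the check on whichqs, the pages are plain list items, and the actual zeroing through the pmap hook is not represented.

```python
def idle_zero(unknown_pages: list, process_ready, target=None) -> list:
    """Move pages from the unknown-contents queue to a known-zero list.

    Stops when the target count is reached, when no unknown pages are
    left, or as soon as process_ready() reports runnable work.
    """
    if target is None:
        # Per the log, the target is currently "all free pages".
        target = len(unknown_pages)
    zeroed = []
    while unknown_pages and len(zeroed) < target and not process_ready():
        zeroed.append(unknown_pages.pop())  # stands in for zeroing the page
    return zeroed
```

Checking for runnable work before every page keeps the added scheduling latency bounded by the time needed to zero a single page.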


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
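
The sizing rule described above (a quarter of physical memory, then bounded from both sides) amounts to a clamp. The sketch below is an assumption-laden illustration: the function and parameter names are hypothetical, and the order in which the two bounds are applied is a guess, since the log does not spell it out.

```python
def autosize_nkmempages(physmem_pages: int, min_pages: int, max_pages: int) -> int:
    """Quarter of physical memory, clamped to [min_pages, max_pages]."""
    npages = physmem_pages // 4
    # Apply the upper bound first, then the lower bound.
    return max(min_pages, min(npages, max_pages))
```

In this model the NKMEMPAGES_MIN and NKMEMPAGES_MAX options would supply `min_pages` and `max_pages`, while an explicit NKMEMPAGES would bypass the computation entirely.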


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires passing a process's
"memlock" resource limit to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
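
As an illustration of the "low address and a size" convention, the sketch
below derives an initial stack pointer from such a description. The enum,
the function name, and the absence of any alignment or red-zone handling
are assumptions made for the example; real machine-dependent code does
more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: direction in which the stack grows on a machine. */
enum stack_dir { STACK_GROWS_DOWN, STACK_GROWS_UP };

/*
 * Hypothetical helper (not kernel code): given a stack area described
 * by its lowest address and its size, return the address at which the
 * stack pointer would start.
 */
static uintptr_t
initial_stack_pointer(uintptr_t low_addr, size_t size, enum stack_dir dir)
{
	assert(size > 0);

	/* A downward-growing stack starts at the top of the area. */
	if (dir == STACK_GROWS_DOWN)
		return low_addr + size;
	/* An upward-growing stack starts at the bottom. */
	return low_addr;
}
```

The caller always passes the same (low address, size) pair; only the
direction-specific code decides which end of the area is used.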


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
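
The three allocation strategies described in this entry can be sketched
as a choice of free list. Everything below is illustrative: the enum
names, NFREELIST, and the use of per-list page counts are assumptions
standing in for UVM_PGA_STRAT_*, VM_NFREELIST, and the real page queues,
and index 0 is assumed to be the highest-priority list.

```c
#include <assert.h>
#include <stddef.h>

#define NFREELIST 3	/* hypothetical stand-in for VM_NFREELIST */

/* Hypothetical names for the three strategies. */
enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Hypothetical helper (not kernel code): return the index of the free
 * list a page would be taken from, or -1 if no page is available.
 * free_pages[i] is the number of pages on list i; list 0 is assumed to
 * have the highest priority.
 */
static int
pick_freelist(const unsigned free_pages[NFREELIST], enum strat strat,
    int free_list)
{
	int i;

	if (strat != STRAT_NORMAL) {
		assert(free_list >= 0 && free_list < NFREELIST);
		/* `only' and `fallback' both try the named list first. */
		if (free_pages[free_list] > 0)
			return free_list;
		if (strat == STRAT_ONLY)
			return -1;
		/* `fallback': continue with the normal walk below. */
	}

	/* `normal': walk from high to low priority. */
	for (i = 0; i < NFREELIST; i++)
		if (free_pages[i] > 0)
			return i;
	return -1;
}
```

Note that the free list argument is consulted only for `only' and
`fallback', matching the statement that it is ignored in the `normal'
case.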


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision


# 1.201 24-Dec-2016 cherry

uvm_extern.h has both a _KERNEL only, and a non _KERNEL only API.

Since we unconditionally expose the uvm_physseg.h API via uvm_extern.h
right now, and since uvm_physseg.h uses a kernel only datatype, viz
psize_t, we restrict exposure of the uvm_physseg.h API to kernel
only.

This is in conformance with its documentation via uvm_hotplug(9) as a
kernel internal API.


# 1.200 22-Dec-2016 cherry

Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h

For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().


# 1.199 22-Dec-2016 cherry

Add a new function called uvm_md_init() that can be called at the
appropriate time in the boot path by MD code.


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726
# 1.198 20-Jul-2016 maxv

Introduce uvm_km_protect.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529
# 1.197 25-May-2016 christos

branches: 1.197.2;
Introduce security.pax.mprotect.ptrace sysctl which can be used to bypass
mprotect settings so that debuggers can write to the text segment of traced
processes so that they can insert breakpoints. Turned off by default.
Ok: chuq (for now)


Revision tags: nick-nhusb-base-20160422 nick-nhusb-base-20160319
# 1.196 05-Feb-2016 christos

PR/50744: NONAKA Kimihiro: Protect more stuff with _KERNEL && _KMEMUSER to
make uvm_extern.h compile standalone again for net-snmp.


Revision tags: nick-nhusb-base-20151226
# 1.195 26-Nov-2015 martin

We never exec(2) with a kernel vmspace, so do not test for that, but instead
KASSERT() that we don't.
When calculating the load address for the interpreter (e.g. ld.elf_so),
we need to take into account whether the exec'd process will run with
topdown memory or bottom up. We can not use the current vmspace's flags
to test for that, as this happens too early. Luckily the execpack already
knows what the new state will be later, so instead of testing the current
vmspace, pass the info as additional argument to struct emul
e_vm_default_addr.
Fix all such functions and adapt all callers.


Revision tags: nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406
# 1.194 20-Mar-2015 riastradh

Comments explaining UBC_* flags.


# 1.193 06-Feb-2015 maxv

Kill kmeminit().


# 1.192 14-Dec-2014 chs

add a new "fo_mmap" fileops method to allow use of arbitrary uvm_objects for
mappings of file objects. move vnode-specific details of mmap()ing a vnode
from uvm_mmap() to the new vnode-specific vn_mmap(). add new uvm_mmap_dev()
and uvm_mmap_anon() convenience functions for mapping character devices
and anonymous memory, and replace all other calls to uvm_mmap() with those.
use the new fileop in drm2 so that libdrm can use mmap() to map things
like on other platforms (instead of the ioctl that we have used so far).


Revision tags: nick-nhusb-base netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.191 07-Jul-2014 riastradh

branches: 1.191.2; 1.191.4;
Initialize ubchist earlier.


# 1.190 22-May-2014 riastradh

Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist.

Brought up on tech-kern:

https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.189 21-Feb-2014 skrll

branches: 1.189.2;
Remove unnecessary struct simplelock forward declaration.


# 1.188 03-Jan-2014 dsl

There is no need for uvm_coredump_walkmap() to explicitly pass the proc_t
pointer to the caller's function.
If the code needs the process its address can be placed in the caller's
cookie.


# 1.187 03-Jan-2014 dsl

Minor changes to the process coredump code.
- Add some extra comments.
- Add some XXX comments because the process state might not be stable.
- Add uvm_coredump_count_segs() to simplify the calling code.
- uvm code now only returns non-empty sections/segments.
- Put the 'iocookie' into the 'cookie' block passed to uvm_coredump_walkmap()
instead of passing it through as an additional parameter.
amd64 can still generate core dumps that gdb can read.


# 1.186 01-Jan-2014 dsl

Change the type of the 'cookie' that holds the state of the core dump file
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.


# 1.185 14-Nov-2013 martin

As discussed on tech-kern: make TOPDOWN-VM runtime selectable per process
(offer MD code or emulations to override it).


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.184 01-Sep-2012 matt

branches: 1.184.2; 1.184.4;
Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4
# 1.183 08-Apr-2012 martin

Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request old (lockstepped) behaviour for
better error reporting
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat


Revision tags: jmcneill-usbmp-base8
# 1.182 18-Mar-2012 uebayasi

Move base type definitions from uvm_extern.h to uvm_param.h so that
other sources can easily include part of UVM headers without the whole
uvm_extern.h (e.g. sys/vnode.h wants only uvm_object.h).


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-base2 netbsd-6-base
# 1.181 02-Feb-2012 para

branches: 1.181.2;
- bringing kmeminit_nkmempages back and revert pmaps that called this early
- use nkmempages to scale the kmem_arena
- reducing diff to pre kmem/vmem change
(NKMEMPAGES_MAX_DEFAULT will need adjusting on some archs)


# 1.180 27-Jan-2012 para

extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged


# 1.179 05-Jan-2012 reinoud

Revert MAP_NOSYSCALLS patch.


# 1.178 22-Dec-2011 reinoud

Redo uvm_map_setattr() to never fail and remove the possible panic. The
possibility of failure was a C&P error.


# 1.177 20-Dec-2011 reinoud

Add a MAP_NOSYSCALLS flag to mmap. This flag prohibits executing of system
calls from the mapped region. This can be used for emulation purposes or for
extra security in the case of generated code.

It's implemented by adding mapping-attributes to each uvm_map_entry. These can
then be queried when needed.

Currently the MAP_NOSYSCALLS is only implemented for x86 but other
architectures are easy to adapt; see the sys/arch/x86/x86/syscall.c patch.
Port maintainers are encouraged to add them for their processor ports too.
When this feature is not yet implemented for an architecture the
MAP_NOSYSCALLS is simply ignored with virtually no cpu cost.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.176 01-Sep-2011 matt

branches: 1.176.2; 1.176.6;
Forward some UVM from matt-nb5-mips64. Add UVM_KMF_COLORMATCH flag.
When uvm_map gets passed UVM_FLAG_COLORMATCH, the align argument contains
the color of the starting address to be allocated (0..colormask).
When uvm_km_alloc is passed UVM_KMF_COLORMATCH (which can only be used with
UVM_KMF_VAONLY), the align argument contains the color of the starting address
to be allocated.
Change uvm_pagermapin to use this. When mapping user pages in the kernel,
if colormatch is used with the color of the starting user page then the kernel
mapping will be congruent with the existing user mappings.
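
The notion of a color in the range 0..colormask can be illustrated with a
little address arithmetic. This is a sketch under stated assumptions:
4 KiB pages, a colormask of the form 2^n - 1, and hypothetical helper
names; it is not how uvm_map() actually searches for space.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical helper: color of the page containing va. */
static unsigned
va_color(uintptr_t va, unsigned colormask)
{
	return (unsigned)(va >> EX_PAGE_SHIFT) & colormask;
}

/*
 * Hypothetical helper (not kernel code): lowest page-aligned address
 * at or above hint whose color equals the requested color.
 * colormask is assumed to be of the form 2^n - 1.
 */
static uintptr_t
next_va_with_color(uintptr_t hint, unsigned color, unsigned colormask)
{
	uintptr_t page_size = (uintptr_t)1 << EX_PAGE_SHIFT;
	uintptr_t va = (hint + page_size - 1) & ~(page_size - 1);
	unsigned delta;

	assert(color <= colormask);

	/* Number of pages to skip until the colors line up. */
	delta = (color - va_color(va, colormask)) & colormask;
	return va + ((uintptr_t)delta << EX_PAGE_SHIFT);
}
```

If the requested color is taken from the first user page, the kernel
address returned has the same color, which is what makes the two
mappings congruent.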


# 1.175 27-Aug-2011 christos

Add an optional pglist argument to uvm_obj_wirepages, to be
filled with the list of pages that were wired.


# 1.174 16-Jun-2011 hannken

Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode. Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.


# 1.173 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.172 23-Apr-2011 rmind

branches: 1.172.2;
Replace "malloc" in comments, remove unnecessary header inclusions.


Revision tags: bouyer-quota2-nbase
# 1.171 17-Feb-2011 matt

Add support for cpu-specific uarea allocation routines. Allows different
allocation for user and system lwps. MIPS will use this to map uareas of
system lwps using direct-mapped addresses (to reduce the overhead of
switching to kernel threads). ibm4xx could use this to map uareas via direct
mapped addresses and avoid the problem of having the kernel stack not in
the TLB.


Revision tags: uebayasi-xip-base7 bouyer-quota2-base
# 1.170 10-Feb-2011 pooka

Make vmapbuf() return success/error and make physio deal with a
failure.


# 1.169 02-Feb-2011 chuck

update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.


Revision tags: jruoho-x86intr-base
# 1.168 04-Jan-2011 matt

branches: 1.168.2; 1.168.4;
Add better color matching when selecting free pages. KM pages will now be allocated
so that VA and PA have the same color. On a page fault, choose a physical
page that has the same color as the virtual address.

When allocating kernel memory pages, allow the MD to specify a preferred
VM_FREELIST from which to choose pages. For machines with large amounts
of memory (> 4GB), allow all kernel memory to come from <4GB to reduce the amount
of bounce buffering needed with 32bit DMA devices.


Revision tags: matt-mips64-premerge-20101231
# 1.167 20-Dec-2010 matt

Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.


Revision tags: uebayasi-xip-base6
# 1.166 13-Nov-2010 uebayasi

Hide uvm/uvm_page.h again to ensure its internal structures are MD.

GENERIC or at least one kernel compile tested for:
acorn26, acorn32, algor, all, alpha, amd64, amiga, amigappc,
arc, bebox, bighill, cats, cobalt, dreamcast, ews4800mips,
hp300, hp700, hpcarm, hpcmips, hpcsh, i386, ibmnws,
integrator, ixm1200, iyonix, landisk, luna68k, mac68k,
macppc, mipsco, mmeye, mvme68k, mvmeppc, netwinder, news68k,
newsmips, next68k, obs266a, ofppc, pmax, pmppc, prep,
rs6000, sandpoint, sbmips, shark, sidebeach, sparc, sparc64,
sun2, sun3, usermode, vax, x68k, zaurus


# 1.165 12-Nov-2010 uebayasi

Put back uvm_page.h for now. Sorry for mess.


# 1.164 12-Nov-2010 uebayasi

Abstraction fix; don't pull in physical segment/page definitions
in UVM external API, uvm_extern.h, because most users care only
about virtual memory.

Device drivers use bus_dma(9) to manage physical memory. Device
drivers pull in bus_dma(9) API, bus_dma.h. bus_dma(9) implementations
pull in UVM internal API, uvm.h.

Tested By: Compiling i386 ALL kernel


Revision tags: uebayasi-xip-base5 uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1
# 1.163 16-Apr-2010 rmind

- Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
Avoids blocking effect on real-time threads. Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON. Also,
sched_pstats() might be cleaned-up slightly.


Revision tags: yamt-nfs-mp-base9
# 1.162 08-Feb-2010 joerg

branches: 1.162.2;
Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.


Revision tags: uebayasi-xip-base matt-premerge-20091211
# 1.161 21-Nov-2009 rmind

branches: 1.161.2;
Add uvm_lwp_getuarea() and uvm_lwp_setuarea(). OK matt@.


Revision tags: jym-xensuspend-nbase
# 1.160 21-Oct-2009 rmind

Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7
# 1.159 18-Aug-2009 yamt

whitespace fixes. no functional changes.


# 1.158 10-Aug-2009 haad

Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs where uvm_reclaim hook is added from arc cache.

Oked ad@.


# 1.157 05-Aug-2009 pooka

kill uvm_aio_biodone1(). only user was lfs and that uses nestiobuf now.


# 1.156 05-Aug-2009 pooka

add some advice symbols we'll eventually need


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.155 28-Jun-2009 rmind

Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.154 30-Mar-2009 yamt

g/c uvm_aiobuf_pool.


# 1.153 29-Mar-2009 mrg

- add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
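
From userland, the new limit is visible through the standard
getrlimit(2) interface. The helper below is a hypothetical illustration
and not part of this change: it only compares a request size against the
soft RLIMIT_AS value and ignores how much address space the process
already uses.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: return 1 if a request of `bytes' is no larger
 * than the soft RLIMIT_AS limit (or the limit is unlimited), 0 if it
 * is larger, and -1 if the limit cannot be read.
 */
static int
fits_address_space_limit(rlim_t bytes)
{
	struct rlimit rl;

	/* A zero-byte request is not meaningful here. */
	assert(bytes > 0);

	if (getrlimit(RLIMIT_AS, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;
	return bytes <= rl.rlim_cur;
}
```

With the default unlimited setting mentioned above, every request is
reported as fitting.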


# 1.152 12-Mar-2009 abs

Clarify free_list usage in uvm_page_physload() regarding faster/slower RAM.
Slower RAM should be assigned a higher free_list id.
No functional change to code, just comments and manpage


Revision tags: nick-hppapmap-base2
# 1.151 18-Feb-2009 yamt

make some functions static.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.150 26-Nov-2008 pooka

branches: 1.150.4;
Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.


# 1.149 31-Oct-2008 christos

- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.148 08-Aug-2008 skrll

branches: 1.148.2; 1.148.4;
g/c exec_map


Revision tags: simonb-wapbl-nbase simonb-wapbl-base
# 1.147 11-Jul-2008 skrll

English improvement in comments.

"seems good to me :)" from yamt.


Revision tags: wrstuden-revivesa-base-1 yamt-pf42-base4 wrstuden-revivesa-base
# 1.146 04-Jun-2008 ad

branches: 1.146.2; 1.146.4;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.145 29-Feb-2008 yamt

branches: 1.145.2; 1.145.4; 1.145.6;
uvm_swap_io: if pagedaemon, don't wait for iobuf.


Revision tags: nick-net80211-sync-base mjf-devfs-base hpcarm-cleanup-base
# 1.144 28-Jan-2008 yamt

branches: 1.144.2; 1.144.6;
remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base matt-armv6-base
# 1.143 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.142 26-Dec-2007 christos

Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.


Revision tags: vmlocking2-base3
# 1.141 24-Dec-2007 perry

Remove __attribute__((__noreturn__)) from things already marked __dead
Found by the department of redundancy department.


Revision tags: yamt-kmem-base3
# 1.140 13-Dec-2007 yamt

add ddb "whatis" command. inspired from solaris ::whatis dcmd.


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase jmcneill-pm-base reinoud-bufcleanup-base
# 1.139 05-Dec-2007 yamt

branches: 1.139.2; 1.139.4;
g/c uvm_vnp_sync


# 1.138 05-Dec-2007 yamt

fix UBC_WANT_UNMAP.
- check PMAP_CACHE_VIVT after pulling pmap.h.
- VTEXT -> VI_TEXT.


Revision tags: vmlocking2-base1 vmlocking-nbase
# 1.137 30-Nov-2007 ad

branches: 1.137.2;
Make {anon,file,exec}pages unsigned.


Revision tags: jmcneill-base bouyer-xenamd64-base2 bouyer-xenamd64-base
# 1.136 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base4 yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.135 18-Aug-2007 ad

branches: 1.135.2; 1.135.6; 1.135.8;
Make the uarea cache per-CPU and drain in batches of 4.


Revision tags: matt-mips64-base
# 1.134 27-Jul-2007 yamt

branches: 1.134.4; 1.134.6;
ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.


# 1.133 22-Jul-2007 pooka

Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden


Revision tags: nick-csl-alignment-base
# 1.132 17-Jul-2007 joerg

branches: 1.132.2;
Add native mremap system call based on the UVM implementation for
Linux compat. Add code to enforce alignment of the new location.
Special thanks to wizd for helping with the man page.


Revision tags: mjf-ufs-trans-base
# 1.131 09-Jul-2007 ad

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.130 05-Jun-2007 yamt

improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.129 24-Mar-2007 rmind

Export uvm_uarea_free() to the rest.
Make things compile again.


# 1.128 04-Mar-2007 christos

branches: 1.128.2; 1.128.4; 1.128.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.127 22-Feb-2007 thorpej

TRUE -> true, FALSE -> false


# 1.126 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


# 1.125 15-Feb-2007 ad

branches: 1.125.2;
Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0).


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.124 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3
# 1.123 07-Dec-2006 elad

Back out uvm_is_swap_device().


Revision tags: netbsd-4-base
# 1.122 01-Dec-2006 elad

branches: 1.122.2;
Introduce uvm_is_swap_device(), to check if the passed struct vnode * is
used as a swap device or not.

Okay mrg@.


Revision tags: yamt-splraiseipl-base2
# 1.121 12-Oct-2006 yamt

move some knowledge about vnode into uvm_vnode.c.


# 1.120 12-Oct-2006 yamt

uobj_wirepages and uobj_unwirepages from Mindaugas. PR/34771.
(commented out in files.uvm for now because there is no user in tree.)

http://mail-index.netbsd.org/tech-kern/2006/09/24/0000.html
http://mail-index.netbsd.org/tech-kern/2006/10/10/0000.html


# 1.119 05-Oct-2006 chs

add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).


Revision tags: yamt-splraiseipl-base
# 1.118 15-Sep-2006 yamt

branches: 1.118.2;
merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy


Revision tags: yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.117 01-Sep-2006 cherry

branches: 1.117.2;
bumps kernel aobj to 64 bit.
See: http://mail-index.netbsd.org/tech-kern/2006/03/07/0007.html


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.116 04-Aug-2006 he

Rearrange included headers and/or add include of <sys/types.h> and
<sys/lock.h>, so that the mipsco port can build again, ref.
http://mail-index.netbsd.org/port-mips/2006/08/04/0000.html
Reviewed by thorpej


# 1.115 05-Jul-2006 drochner

Introduce a UVM_KMF_EXEC flag for uvm_km_alloc() which enforces an
executable mapping. Up to now, only R+W was requested from pmap_kenter_pa.
On most CPUs, we get an executable mapping anyway, due to lack of
hardware support or due to laziness in the pmap implementation. Only
alpha does obey VM_PROT_EXECUTE, afaics.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.114 19-May-2006 yamt

branches: 1.114.2; 1.114.4;
UVM_MAPFLAG: add missing parens.


# 1.113 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base elad-kernelauth-base
# 1.112 15-Mar-2006 drochner

branches: 1.112.2;
-clean up the interface to uvm_fault: the "fault type" didn't serve
any purpose (done by a macro, so we don't save any cycles for now)
-kill vm_fault_t; it is not needed for real faults, and for simulated
faults (wiring) it can be replaced by UVM internal flags
-remove <uvm/uvm_fault.h> from uvm_extern.h again


Revision tags: yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.111 01-Mar-2006 yamt

branches: 1.111.2; 1.111.4;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.110 10-Feb-2006 simonb

Make a note that some counters should be 64-bit as they wrap far too
quickly.


# 1.109 21-Jan-2006 yamt

branches: 1.109.2; 1.109.4;
implement compat_linux mremap.


# 1.108 21-Dec-2005 yamt

branches: 1.108.2;
make length of inactive queue tunable by sysctl. (vm.inactivepct)


Revision tags: ktrace-lwp-base
# 1.107 29-Nov-2005 yamt

merge yamt-readahead branch.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.106 01-Sep-2005 yamt

branches: 1.106.6;
remove one of the duplicated forward decls. of vmspace. pointed out by Dheeraj S.


# 1.105 01-Sep-2005 yamt

put back uvm_fault.h for now as it's needed for some ports.


# 1.104 27-Aug-2005 yamt

don't include uvm_fault.h unnecessarily.


# 1.103 10-Jun-2005 matt

branches: 1.103.2;
Rework the coredump code to have no explicit knowledge of how coredump
i/o is done. Instead, pass an opaque cookie which is then passed to a
new routine, coredump_write, which does the actual i/o. This allows the
method of doing i/o to change without affecting any future MD code.
Also, make netbsd32_core.c [re]use core_netbsd.c (in a similar manner that
core_elf64.c uses core_elf32.c) and eliminate that code duplication.
cpu_coredump{,32} is now called twice, first with a NULL iocookie to fill
the core structure and a second time to actually write md parts of the coredump.
All i/o is no longer random access and is suitable for shipping over a stream.


# 1.102 02-Jun-2005 matt

When writing coredumps, don't write zero uninstantiated demand-zero pages.
Also, with ELF core dumps, trim trailing zeroes from sections. These two
changes can shrink coredumps by over 50% in size.
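
The trailing-zero trimming described above can be illustrated with a small sketch. This is not the kernel's ELF coredump code; the helper name trimmed_len() is hypothetical and it works on an ordinary in-memory buffer rather than on a process section.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the length of buf once trailing zero
 * bytes are dropped.  A buffer that is entirely zero trims to 0.
 */
static size_t
trimmed_len(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}
```

A writer would then emit only the first trimmed_len(buf, len) bytes of a section and record the original size in the section header, so a reader can treat the missing tail as zero-filled.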


# 1.101 15-May-2005 yamt

remove anon related statistics which are no longer used.


Revision tags: kent-audio2-base
# 1.100 01-Apr-2005 yamt

merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
save some resources like pv_entry. also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.


Revision tags: yamt-km-base4
# 1.99 26-Mar-2005 fvdl

Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.

* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
that will return the default VM map address. The default function
is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
macro. This gives emulations control over the default map address,
and allows things to be mapped at the right address (in 32bit range)
for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.


Revision tags: yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.98 13-Jan-2005 yamt

branches: 1.98.2; 1.98.4; 1.98.8;
in uvm_unmap_remove, always wakeup va waiters if any.
uvm_km_free_wakeup is now a synonym of uvm_km_free.


Revision tags: kent-audio1-beforemerge
# 1.97 09-Jan-2005 chs

adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.


# 1.96 01-Jan-2005 yamt

in the case of !PMAP_MAP_POOLPAGE, gather pool backend allocations to
large chunks for kernel_map and kmem_map to ease kva fragmentation.


# 1.95 01-Jan-2005 yamt

introduce vm_map_kernel, a subclass of vm_map, and
move some kernel-only members of vm_map to it.


# 1.94 01-Jan-2005 yamt

for in-kernel maps,
- allocate kva for vm_map_entry from the map itself and
remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
PR/24039.


Revision tags: kent-audio1-base
# 1.93 28-Aug-2004 thorpej

Garbage-collect pagemove(); nothing uses it anymore (YAY!!!)


# 1.92 04-May-2004 pk

Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.91 24-Mar-2004 junyoung

Nuke __P().


# 1.90 14-Mar-2004 jdolecek

fix typo in comment


# 1.89 13-Feb-2004 yamt

when breaking a loan from uobj,
insert the replacement page into the same position
as the original page on the object memq so that
genfs_putpages (and lfs) won't be confused.

noted by Stephan Uphoff (PR/24328)


# 1.88 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.87 18-Dec-2003 pk

* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset
to be passed to uvm_map().

* Turn all uvm_km_valloc*() macros back into (inlined) functions to retain
binary compatibility with any 3rd party modules.


# 1.86 18-Dec-2003 pk

Condense all existing variants of uvm_km_valloc into a single function:
uvm_km_valloc1(), and use it to express all of
uvm_km_valloc()
uvm_km_valloc_wait()
uvm_km_valloc_prefer()
uvm_km_valloc_prefer_wait()
uvm_km_valloc_align()
in terms of it by macro expansion.


# 1.85 13-Nov-2003 chs

eliminate uvm_useracc() in favor of checking the return value of
copyin() or copyout().

uvm_useracc() tells us whether the mapping permissions allow access to
the desired part of an address space, and many callers assume that
this is the same as knowing whether an attempt to access that part of
the address space will succeed. however, access to user space can
fail for reasons other than insufficient permission, most notably that
paging in any non-resident data can fail due to i/o errors. most of
the callers of uvm_useracc() make the above incorrect assumption. the
rest are all misguided optimizations, which optimize for the case
where an operation will fail. we'd rather optimize for operations
succeeding, in which case we should just attempt the access and handle
failures due to insufficient permissions the same way we handle i/o
errors. since there appear to be no good uses of uvm_useracc(), we'll
just remove it.


# 1.84 11-Aug-2003 pk

Introduce uvm_swapisfull(), which computes the available swap space by
taking into account swap devices that are in the process of being removed.


# 1.83 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.82 29-Jun-2003 fvdl

branches: 1.82.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.81 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.80 10-May-2003 thorpej

Back out the following change:
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html

There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, so just eject for now.
Maybe one day we can look at this again.

Fixes PR kern/21517.


# 1.79 08-May-2003 thorpej

Simplify the way the bounds of the managed kernel virtual address
space are advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.

This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().

This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.

This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
thus giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).


# 1.78 03-May-2003 wiz

Misc fixes from jmc@openbsd.


# 1.77 01-Feb-2003 thorpej

Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.


# 1.76 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.75 11-Dec-2002 thorpej

Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.

From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.


# 1.74 17-Nov-2002 chs

change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.


Revision tags: kqueue-aftermerge kqueue-beforemerge kqueue-base
# 1.73 22-Sep-2002 chs

encapsulate knowledge of uarea allocation in some new functions.


# 1.72 15-Sep-2002 chs

add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.


Revision tags: netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base gehenna-devsw-base
# 1.71 17-May-2002 enami

branches: 1.71.2;
Make uvn_findpages return the number of pages found so that caller can
easily check if all requested pages are found or not.


Revision tags: eeh-devprop-base newlock-base ifpoll-base
# 1.70 10-Dec-2001 thorpej

branches: 1.70.8;
Move the code that walks the process's VM map during a coredump
into uvm_coredump_walkmap(), and use callbacks into the coredump
routine to do something with each section.


# 1.69 09-Dec-2001 chs

add {anon,file,exec}max as an upper bound on the amount of memory that
will be allocated for the respective usage types when there is contention
for memory.

replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names
and sysctl names.


# 1.68 08-Dec-2001 thorpej

Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf
# 1.67 15-Sep-2001 chs

a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.


Revision tags: pre-chs-ubcperf thorpej-devvp-base
# 1.66 16-Aug-2001 chs

branches: 1.66.2;
user maps are always pageable.


# 1.65 02-Jun-2001 chs

branches: 1.65.2;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.


# 1.64 26-May-2001 chs

replace vm_page_t with struct vm_page *.


# 1.63 25-May-2001 chs

remove trailing whitespace.


# 1.62 02-May-2001 thorpej

Support dynamic sizing of the page color bins. We also support
dynamically re-coloring pages; as machine-dependent code discovers
the size of the system's caches, it may call uvm_page_recolor() with
the new number of colors to use. If the new number of colors is
smaller than (or equal to) the current number of colors, then uvm_page_recolor()
is a no-op.

The system defaults to one bucket if machine-dependent code does not
initialize uvmexp.ncolors before uvm_page_init() is called.

Note that the number of color bins should be initialized to something
reasonable as early as possible -- for many early memory allocations,
we live with the consequences of the page choice for the lifetime of
the boot.


# 1.61 01-May-2001 thorpej

Add the number of page colors to uvmexp.


# 1.60 29-Apr-2001 thorpej

Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets. This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
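
As a rough illustration of round-robin bucket selection, the sketch below hands out color indices in rotation. It is a simplified reading of the description above, not the UVM implementation; struct color_bins and color_next() are hypothetical names.

```c
#include <assert.h>

/* Hypothetical state: how many color buckets exist and which is next. */
struct color_bins {
	unsigned int ncolors;	/* number of buckets, must be > 0 */
	unsigned int next;	/* bucket to try on the next allocation */
};

/*
 * Return the bucket for this allocation and advance to the following
 * one, wrapping around after the last bucket.
 */
static unsigned int
color_next(struct color_bins *cb)
{
	unsigned int color;

	assert(cb->ncolors > 0);
	color = cb->next;
	cb->next = (cb->next + 1) % cb->ncolors;
	return color;
}
```

With a single bucket the function always returns 0, which corresponds to running without page coloring.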


# 1.59 25-Apr-2001 thorpej

pmap_resident_count() always exists. Besides, returning the
value of vm_rssize is pointless -- it is never initialized to
anything other than 0.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.58 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
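
The mapping above can be written down as a translation function. This is only a sketch: the commit removed the KERN_* codes rather than adding a translator, the enum values below are illustrative placeholders, and kern_to_errno() is a hypothetical name. KERN_FAILURE and the unused codes are left out because the table gives them no single errno.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder enum; the numeric values are not the historical ones. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

/* Hypothetical helper: translate an old code per the table above. */
static int
kern_to_errno(enum kern_code code)
{
	switch (code) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	assert(0 && "unmapped KERN_* code");
	return EINVAL;
}
```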


# 1.57 09-Mar-2001 chs

add UBC memory-usage balancing. we track the number of pages in use for
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-setable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.


# 1.56 06-Feb-2001 eeh

branches: 1.56.2;
Specify a process' address space limits for uvmspace_exec().


# 1.55 30-Nov-2000 simonb

Move uvm_pgcnt_vnode and uvm_pgcnt_anon into uvmexp (as vnodepages and
anonpages), and add vtextpages which is currently unused but will be
used to trace the number of pages used by vtext vnodes.


# 1.54 29-Nov-2000 simonb

Add a vm.uvmexp2 sysctl that uses an ABI-safe 'struct uvmexp_sysctl'.


# 1.53 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.52 27-Nov-2000 nisimura

Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.


# 1.51 28-Sep-2000 eeh

Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.


# 1.50 21-Sep-2000 thorpej

Make PMAP_PAGEIDLEZERO() return a boolean value. FALSE indicates
that the page being zero'd was not completed and that page zeroing
should be aborted. This may be used by machine-dependent code doing
slow page access to reduce the latency of running a process that has
become runnable while in the middle of doing a slow page zero.


# 1.49 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
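
For readers unfamiliar with power-of-two alignment, the usual rounding idiom looks like the sketch below. It is a generic illustration, not the code uvm_map() uses; align_up() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round addr up to the next multiple of align.
 * align must be a nonzero power of two, so align - 1 is a mask of the
 * low bits that have to be cleared.
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged, and an alignment of 1 leaves every address as it is.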


# 1.48 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.47 01-Aug-2000 wiz

Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).


# 1.46 24-Jul-2000 jeffs

Add uvm_km_valloc_prefer_wait(). Used to valloc with the passed in
voff_t being passed to PMAP_PREFER(), which results in the proper
virtual alignment of the allocated space.


# 1.45 27-Jun-2000 mrg

move the contents of <vm/vm.h> into <uvm/uvm_extern.h>. <vm/vm.h> is simply
an include of <uvm/uvm_extern.h> now.


# 1.44 27-Jun-2000 mrg

more vm header file changes:

<vm/vm_extern.h> merged into <uvm/uvm_extern.h>
<vm/vm_page.h> merged into <uvm/uvm_page.h>
<vm/pmap.h> has become <uvm/uvm_pmap.h>

this leaves just <vm/vm.h> in NetBSD.


# 1.43 26-Jun-2000 mrg

remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.


Revision tags: netbsd-1-5-PATCH003 netbsd-1-5-PATCH002 netbsd-1-5-PATCH001 netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.42 08-Jun-2000 thorpej

Change UVM_UNLOCK_AND_WAIT() to use ltsleep() (it is now atomic, as
advertised). Garbage-collect uvm_sleep().


# 1.41 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.40 24-Apr-2000 thorpej

branches: 1.40.2;
Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
pages with unknown contents.
- Implement uvm_pageidlezero(). This function attempts to zero up to
the target number of pages until the target has been reached (currently
target is `all free pages') or until whichqs becomes non-zero (indicating
that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages. This is
used to zero the pages using uncached access. This allows us to zero
as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.


# 1.39 10-Apr-2000 thorpej

Add UVM_PGA_ZERO which instructs uvm_pagealloc{,_strat}() to return a
zero'd, ! PG_CLEAN page, as if it were uvm_pagezero()'d.


# 1.38 26-Mar-2000 kleink

Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.


Revision tags: chs-ubc2-newbase
# 1.37 11-Feb-2000 thorpej

Add some very simple code to auto-size the kmem_map. We take the
amount of physical memory, divide it by 4, and then allow machine
dependent code to place upper and lower bounds on the size. Export
the computed value to userspace via the new "vm.nkmempages" sysctl.

NKMEMCLUSTERS is now deprecated and will generate an error if you
attempt to use it. The new option, should you choose to use it,
is called NKMEMPAGES, and two new options NKMEMPAGES_MIN and
NKMEMPAGES_MAX allow the user to configure the bounds in the kernel
config file.
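
The sizing rule described above (a quarter of physical memory, clamped between machine-dependent bounds) can be sketched as follows. The function name kmem_autosize() and its parameters are hypothetical; the real code works from kernel globals and the NKMEMPAGES* options.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick the kmem_map size in pages as physpages / 4,
 * then clamp the result into [min_pages, max_pages].
 */
static size_t
kmem_autosize(size_t physpages, size_t min_pages, size_t max_pages)
{
	size_t n;

	assert(min_pages <= max_pages);

	n = physpages / 4;
	if (n < min_pages)
		n = min_pages;
	if (n > max_pages)
		n = max_pages;
	return n;
}
```

For example, with bounds of 1024 and 4096 pages, a machine with 8192 physical pages gets 2048, while much smaller or larger machines are pinned to the lower or upper bound.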


# 1.36 11-Jan-2000 chs

add support for ``swapctl -d'' (removing swap space).
improve handling of i/o errors in swap space.

reviewed by: Chuck Cranor


# 1.35 30-Dec-1999 eeh

I should have made uvm_page_physload() take paddr_t's instead of vaddr_t's.
Also, add uvm_coredump32().


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.34 22-Jul-1999 thorpej

branches: 1.34.2;
Garbage collect thread_sleep()/thread_wakeup() left over from the old
Mach VM code. Also nuke iprintf(), which was no longer used anywhere.

Add proclist locking where appropriate.


# 1.33 17-Jul-1999 thorpej

Add a set of "lockflags", which can control the locking behavior
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.


# 1.32 02-Jul-1999 thorpej

Bring in additional uvmexp members from chs-ubc2, so that VM stats can
be read no matter which kernel you're running.


# 1.31 21-Jun-1999 thorpej

Protect prototypes, certain macros, and inlines from userland.


# 1.30 18-Jun-1999 thorpej

Add the guts of mlockall(MCL_FUTURE). This requires that a process's
"memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.


# 1.29 17-Jun-1999 thorpej

Make uvm_vslock() return the error code from uvm_fault_wire(). All places
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.

XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.


# 1.28 15-Jun-1999 thorpej

Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]


# 1.27 26-May-1999 thorpej

Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change how these maps are locked, as well.


# 1.26 26-May-1999 thorpej

Pass an access_type to uvm_vslock().


# 1.25 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.24 11-Apr-1999 chs

add a `flags' argument to uvm_pagealloc_strat().
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.


Revision tags: netbsd-1-4-base
# 1.23 26-Mar-1999 chs

branches: 1.23.2;
add uvmexp.swpgonly and use it to detect out-of-swap conditions.


# 1.22 25-Mar-1999 mrg

remove now >1 year old pre-release message.


Revision tags: kenh-if-detach-base chs-ubc-base
# 1.21 08-Sep-1998 thorpej

branches: 1.21.2;
Implement uvm_exit(), which frees VM resources when a process finishes
exiting.


# 1.20 28-Aug-1998 thorpej

Add a waitok boolean argument to the VM system's pool page allocator backend.


# 1.19 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.18 01-Aug-1998 thorpej

We need to be able to specify a uvm_object to the pool page allocator, too.


# 1.17 31-Jul-1998 thorpej

Allow an alternate splimp-protected map to be specified in the pool page
allocator routines.


Revision tags: eeh-paddr_t-base
# 1.16 24-Jul-1998 thorpej

branches: 1.16.2;
Implement uvm_km_{alloc,free}_poolpage(). These functions use pmap hooks to
map/unmap pool pages if provided by the pmap layer.


# 1.15 08-Jul-1998 thorpej

Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

- normal: high -> low priority free list walk, taking the
page off the first free list that has one.

- only: attempt to allocate a page only from the specified free
list, failing if that free list has none available.

- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specifies which free list the pages will be
loaded onto. This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

VM_NFREELIST: the number of free lists the system will have

VM_FREELIST_DEFAULT: the default freelist (should always be 0,
but is defined in machdep code so that it's with all of the
other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
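
The three allocation strategies can be modelled with a toy free-list array. This is a sketch under stated assumptions: each free list is reduced to a page count, index 0 is assumed to be the highest-priority list, and the names NFREELIST, enum strat and pagealloc_strat() are hypothetical stand-ins for VM_NFREELIST, the UVM_PGA_STRAT_* constants and uvm_pagealloc_strat().

```c
#include <assert.h>

#define NFREELIST	3	/* stand-in for VM_NFREELIST */

enum strat { STRAT_NORMAL, STRAT_ONLY, STRAT_FALLBACK };

/*
 * Take one page according to strat.  Returns the index of the free list
 * the page came from, or -1 if no page could be allocated.  free_list is
 * ignored for STRAT_NORMAL.
 */
static int
pagealloc_strat(int nfree[NFREELIST], enum strat strat, int free_list)
{
	int i;

	if (strat != STRAT_NORMAL) {
		assert(free_list >= 0 && free_list < NFREELIST);
		if (nfree[free_list] > 0) {
			nfree[free_list]--;
			return free_list;
		}
		if (strat == STRAT_ONLY)
			return -1;	/* `only' never looks elsewhere */
	}

	/* `normal' walk, also used when `fallback' misses its list. */
	for (i = 0; i < NFREELIST; i++) {
		if (nfree[i] > 0) {
			nfree[i]--;
			return i;
		}
	}
	return -1;
}
```

In this model the `fallback' strategy is literally `only' followed by `normal', which matches the wording of the commit message.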


# 1.14 04-Jul-1998 jonathan

defopt DDB.


# 1.13 09-May-1998 kleink

Use size_t to pass the length of the memory region to operate on to chgkprot(),
kernacc(), useracc(), vslock() and vsunlock(); (unsigned) ints are not
adequate on all platforms.


# 1.12 30-Apr-1998 thorpej

Pass vslock() and vsunlock() a proc *, rather than implicitly operating
on curproc.


# 1.11 30-Mar-1998 mycroft

Mark scheduler() and uvm_scheduler() as never returning.


# 1.10 27-Mar-1998 thorpej

Split uvmspace_alloc() into uvmspace_alloc() and uvmspace_init(). The latter
can be used for initializing a pre-allocated vmspace.


# 1.9 09-Mar-1998 mrg

KNF.


# 1.8 10-Feb-1998 perry

add/cleanup multiple inclusion protection.


# 1.7 09-Feb-1998 mrg

keep statistics on pageout/pagein, total pages, and total operations.


# 1.6 08-Feb-1998 thorpej

Allow callers of uvm_km_suballoc() to specify where the base of the
submap _must_ begin, by adding a "fixed" boolean argument.


# 1.5 07-Feb-1998 mrg

implement counters for pages paged in/out


# 1.4 07-Feb-1998 mrg

restore rcsids


# 1.3 07-Feb-1998 chs

prototype for uvm_map_checkprot() moved here.
add uvmexp fields for pageouts-in-progress and kernel-reserved pages.


# 1.2 06-Feb-1998 thorpej

RCS ID police.


# 1.1 05-Feb-1998 mrg

branches: 1.1.1;
Initial revision