History log of /openbsd-current/sys/uvm/uvm.h
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.73 02-Apr-2024 deraadt

Delete the msyscall mechanism entirely, since mimmutable+pinsyscalls has
replaced it with a more strict mechanism, which happens to be lockless O(1)
rather than micro-lock O(1)+O(log N). Also nop-out the sys_msyscall(2) guts,
but leave the syscall around for a bit longer so that people can build through
it, since ld.so(1) still wants to call it.


# 1.72 30-Mar-2024 mpi

Document that pmemrange control data are protected by `uvm.fpageqlock'.


Revision tags: OPENBSD_7_3_BASE OPENBSD_7_4_BASE OPENBSD_7_5_BASE
# 1.71 07-Oct-2022 deraadt

new UVM_ET_IMMUTABLE flag marks a uvm entry as immutable.


# 1.70 29-Sep-2022 deraadt

There no longer is any KVM_ET_* to keep in sync with UVM_ET_*, so
comment can be deleted.


Revision tags: OPENBSD_7_2_BASE
# 1.69 04-May-2022 mpi

Merge swap-backed and object-backed inactive page lists.

ok millert@, kettenis@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.68 24-Nov-2020 mpi

Grab the `pageqlock' before calling uvm_pageclean() as intended.

Document which global data structures require this lock and add some
asserts where the lock should be held.

Some code paths are still incorrect and should be revisited.

ok jmatthew@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.67 06-Dec-2019 mpi

Sync KVE_ET_* and UVM_ET_* flags.

ok guenther@


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.72 30-Mar-2024 mpi

Document that pmemrange control data are protected by `uvm.fpageqlock'.


Revision tags: OPENBSD_7_3_BASE OPENBSD_7_4_BASE OPENBSD_7_5_BASE
# 1.71 07-Oct-2022 deraadt

new UVM_ET_IMMUTABLE flag marks a uvm entry as immutable.


# 1.70 29-Sep-2022 deraadt

There no longer is any KVM_ET_* to keep in sync with UVM_ET_*, so
comment can be deleted.


Revision tags: OPENBSD_7_2_BASE
# 1.69 04-May-2022 mpi

Merge swap-backed and object-backed inactive page lists.

ok millert@, kettenis@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.68 24-Nov-2020 mpi

Grab the `pageqlock' before calling uvm_pageclean() as intended.

Document which global data structures require this lock and add some
asserts where the lock should be held.

Some code paths are still incorrect and should be revisited.

ok jmatthew@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.67 06-Dec-2019 mpi

Sync KVE_ET_* and UVM_ET_* flags.

ok guenther@


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.71 07-Oct-2022 deraadt

new UVM_ET_IMMUTABLE flag marks a uvm entry as immutable.


# 1.70 29-Sep-2022 deraadt

There no longer is any KVM_ET_* to keep in sync with UVM_ET_*, so
comment can be deleted.


Revision tags: OPENBSD_7_2_BASE
# 1.69 04-May-2022 mpi

Merge swap-backed and object-backed inactive page lists.

ok millert@, kettenis@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.68 24-Nov-2020 mpi

Grab the `pageqlock' before calling uvm_pageclean() as intended.

Document which global data structures require this lock and add some
asserts where the lock should be held.

Some code paths are still incorrect and should be revisited.

ok jmatthew@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.67 06-Dec-2019 mpi

Sync KVE_ET_* and UVM_ET_* flags.

ok guenther@


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.70 29-Sep-2022 deraadt

There no longer is any KVM_ET_* to keep in sync with UVM_ET_*, so
comment can be deleted.


Revision tags: OPENBSD_7_2_BASE
# 1.69 04-May-2022 mpi

Merge swap-backed and object-backed inactive page lists.

ok millert@, kettenis@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.68 24-Nov-2020 mpi

Grab the `pageqlock' before calling uvm_pageclean() as intended.

Document which global data structures require this lock and add some
asserts where the lock should be held.

Some code paths are still incorrect and should be revisited.

ok jmatthew@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.67 06-Dec-2019 mpi

Sync KVE_ET_* and UVM_ET_* flags.

ok guenther@


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.69 04-May-2022 mpi

Merge swap-backed and object-backed inactive page lists.

ok millert@, kettenis@


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.68 24-Nov-2020 mpi

Grab the `pageqlock' before calling uvm_pageclean() as intended.

Document which global data structures require this lock and add some
asserts where the lock should be held.

Some code paths are still incorrect and should be revisited.

ok jmatthew@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.67 06-Dec-2019 mpi

Sync KVE_ET_* and UVM_ET_* flags.

ok guenther@


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.68 24-Nov-2020 mpi

Grab the `pageqlock' before calling uvm_pageclean() as intended.

Document which global data structures require this lock and add some
asserts where the lock should be held.

Some code paths are still incorrect and should be revisited.

ok jmatthew@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.67 06-Dec-2019 mpi

Sync KVE_ET_* and UVM_ET_* flags.

ok guenther@


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.67 06-Dec-2019 mpi

Sync KVE_ET_* and UVM_ET_* flags.

ok guenther@


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.66 29-Nov-2019 deraadt

Repurpose the "syscalls must be on a writeable page" mechanism to
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.

This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.

For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)

For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.

We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.

This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen


Revision tags: OPENBSD_6_6_BASE
# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.65 18-Jul-2019 cheloha

R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.

UVM_WAIT() doesn't provide much of a useful abstraction. All callers
tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are
actually used. Might as well just use tsleep_nsec(9) directly and make
the uvm code a bit less specialized.

Suggested by mpi@.

ok mpi@ visa@ millert@


Revision tags: OPENBSD_6_5_BASE
# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.64 01-Mar-2019 cheloha

New mmap(2) flag: MAP_CONCEAL.

MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.

Wanted by libressl, probably useful elsewhere, too.

Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.

ok otto@ deraadt@


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.63 31-Oct-2018 kettenis

Add support to uvm to establish write-combining mappings. Use this in the
inteldrm driver to add support for the I915_MMAP_WC flag.

ok deraadt@, jsg@


Revision tags: OPENBSD_6_4_BASE
# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


# 1.62 12-Apr-2018 deraadt

Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.61 11-Aug-2016 dlg

replace abuse of the static map entries RB_ENTRY pointers with an SLIST

free static entries are kept in a simple linked list, so use SLIST
to make this obvious. the RB_PARENT manipulations are ugly and
confusing.

ok kettenis@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.60 08-Oct-2015 kettenis

Lock the page queues by turning uvm_lock_pageq() and uvm_unlock_pageq() into
mtx_enter() and mtx_leave() operations. Not 100% this won't blow up but
there is only one way to find out, and we need this to make progress on
further unlocking uvm.

prodded by deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.59 04-May-2015 dlg

reduce the scope of things that include uvm_swap_encrypt.h.

uvm_meter.c needs it to route the sysctl, uvm_swap.c needs it to
use the functionality, and uvm_swap_encrypt.c needs it to for obvious
reasons. userland sysctl already includes it explicitely.

everything else doesnt and shouldnt care.

ok miod@


# 1.58 23-Apr-2015 dlg

tedu remnants of the previous attempt to implement page zeroing in
the idle thread.

ok deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.57 03-Oct-2014 kettenis

Introduce __MAP_NOFAULT, a mmap(2) flag that makes sure a mapping will not
cause a SIGSEGV or SIGBUS when a mapped file gets truncated. Access to
pages that are not backed by a file on such a mapping will be replaced by
zero-filled anonymous pages. Makes passing file descriptors of mapped files
usable without having to play tricks with signal handlers.

"steal your mmap flag" deraadt@


Revision tags: OPENBSD_5_6_BASE
# 1.56 11-Jul-2014 jsg

Chuck Cranor rescinded clauses in his license
on the 2nd of February 2011 in NetBSD.

http://marc.info/?l=netbsd-source-changes&m=129658899212732&w=2
http://marc.info/?l=netbsd-source-changes&m=129659095515558&w=2
http://marc.info/?l=netbsd-source-changes&m=129659157916514&w=2
http://marc.info/?l=netbsd-source-changes&m=129665962324372&w=2
http://marc.info/?l=netbsd-source-changes&m=129666033625342&w=2
http://marc.info/?l=netbsd-source-changes&m=129666052825545&w=2
http://marc.info/?l=netbsd-source-changes&m=129666922906480&w=2
http://marc.info/?l=netbsd-source-changes&m=129667725518082&w=2


# 1.55 08-Jul-2014 deraadt

decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h
don't need to be married.
ok guenther miod beck jsing kettenis


# 1.54 08-Jul-2014 deraadt

white space repairs


# 1.53 28-Mar-2014 mpi

Reduce uvm include madness. Use <uvm/uvm_extern.h> instead of
<uvm/uvm.h> if possible and remove double inclusions.

ok beck@, mlarkin@, deraadt@


Revision tags: OPENBSD_5_5_BASE
# 1.52 09-Nov-2013 guenther

Add KASSERT()s to tsleep() and msleep() to verify that bogus flags
aren't being passed to them. Fix UVM_WAIT() to not pass PNORELOCK to
tsleep(), as that flag only does something with msleep().

ok beck@ dlg@


Revision tags: OPENBSD_5_4_BASE
# 1.51 30-May-2013 tedu

UVM_UNLOCK_AND_WAIT no longer unlocks, so rename it to UVM_WAIT.


# 1.50 30-May-2013 tedu

remove lots of comments about locking per beck's request


# 1.49 30-May-2013 tedu

remove simple_locks from uvm code. ok beck deraadt


# 1.48 29-May-2013 tedu

uvm_loan has not (ever) been compiled or used.


Revision tags: OPENBSD_5_2_BASE OPENBSD_5_3_BASE
# 1.47 09-Mar-2012 ariane

New vmmap implementation.

no oks (it is really a pain to review properly)
extensively tested, I'm confident it'll be stable
'now is the time' from several icb inhabitants

Diff provides:
- ability to specify different allocators for different regions/maps
- a simpler implementation of the current allocator
- currently in compatibility mode: it will generate similar addresses
as the old allocator


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE
# 1.46 06-Jul-2011 beck

uvm changes for buffer cache improvements.
1) Make the pagedaemon aware of the memory ranges and size of allocations
where memory is being requested, and pass this information on to
bufbackoff(), which will later (not yet) be used to ensure that the
buffer cache gets out of the way in the right area of memory.

Note that this commit does not yet make it *do* that - as currently
the buffer cache is all in dma-able memory and it will simply back
off.

2) Add uvm_pagerealloc_multi - to be used by the buffer cache code
for reallocating pages to particular regions.

much of this work by ariane, with smatterings of me, art,and oga

ok oga@, thib@, ariane@, deraadt@


# 1.45 03-Jul-2011 oga

endodoify UVM_CNT too.

``beat it'' tedu@ the deleteotron


# 1.44 03-Jul-2011 oga

Rip out and burn support for UVM_HIST.

The vm hackers don't use it, don't maintain it and have to look at it all the
time. About time this 800 lines of code hit /dev/null.

``never liked it'' tedu@. ariane@ was very happy when i told her i wrote
this diff.


# 1.43 30-May-2011 oga

Remove the freelist member from vm_physseg

The new world order of pmemrange makes this data completely redundant
(being dealt with by the pmemrange constraints instead). Remove all code
that messes with the freelist.

While touching every caller of uvm_page_physload() anyway, add the flags
argument to all callers (all but one is 0 and that one already used
PHYSLOAD_DEVICE) and remove the macro magic to allow callers to continue
without it.

Should shrink the code a bit, as well.

matthew@ pointed out some mistakes i'd made.
``freelist death, I like. Ok.' ariane@
`I agree with the general direction, go ahead and i'll fix any fallout
shortly'' miod@ (68k 88k and vax i could not check would build)


# 1.42 15-Apr-2011 oga

When I switched uvm objects to use a per-object page tree instead of the
global hash I forgot to remove the has declarations from struct uvm. So
remove them now.

pointed out by blambert@, ok beck@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.41 29-Jun-2010 thib

Add a no_constraint uvm_constraint_range; use it in the pool code.

ok tedu@, beck@, oga@


# 1.40 27-Jun-2010 thib

uvm constraints. Add two mandatory MD symbols, uvm_md_constraints
which contains the constraints for DMA/memory allocation for each
architecture, and dma_constraints which contains the range of addresses
that are dma accessable by the system.

This is based on ariane@'s physcontig diff, with lots of bugfixes and
additions the following additions by my self:

Introduce a new function pool_set_constraints() which sets the address
range for which we allocate pages for the pool from, this is now used
for the mbuf/mbuf cluster pools to keep them dma accessible.

The !direct archs no longer stuff pages into the kernel object in
uvm_km_getpage_pla but rather do a pmap_extract() in uvm_km_putpages.

Tested heavily by my self on i386, amd64 and sparc64. Some tests on
alpha and SGI.

"commit it" beck, art, oga, deraadt
"i like the diff" deraadt


# 1.39 09-Jun-2010 thib

Move the prototype for uvm_wait() to uvm_extern.h and remove
uvm_pdaemon.h has it was only holding that one prototype.

OK art@, oga@, miod@, deraadt@


# 1.38 22-Apr-2010 oga

Committing on behalf or ariane@.

recommit pmemrange:
physmem allocator: change the view of free memory from single
free pages to free ranges. Classify memory based on region with
associated use-counter (which is used to construct a priority
list of where to allocate memory).

Based on code from tedu@, help from many.

Useable now that bugs have been found and fixed in most architecture's
pmap.c

ok by everyone who has done a pmap or uvm commit in the last year.


Revision tags: OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.37 16-Jun-2009 oga

date based reversion of uvm to the 4th May.

We still have no idea why this stops the crashes. but it does.

a machine forced to 64mb of ram cycled 10GB through swap with this diff
and is still running as I type this. Other tests by ariane@ and thib@
also seem to show that it's alright.

ok deraadt@, thib@, ariane@


# 1.36 16-Jun-2009 ariane

Backout pmemrange (which to most people is more well known as physmem
allocator).

"i can't see any obvious problems" oga


# 1.35 16-Jun-2009 oga

Backout all changes to uvm after pmemrange (which will be backed out
separately).

a change at or just before the hackathon has either exposed or added a
very very nasty memory corruption bug that is giving us hell right now.
So in the interest of kernel stability these diffs are being backed out
until such a time as that corruption bug has been found and squashed,
then the ones that are proven good may slowly return.

a quick hitlist of the main commits this backs out:

mine:
uvm_objwire
the lock change in uvm_swap.c
using trees for uvm objects instead of the hash
removing the pgo_releasepg callback.

art@'s:
putting pmap_page_protect(VM_PROT_NONE) in uvm_pagedeactivate() since
all callers called that just prior anyway.

ok beck@, ariane@.

prompted by deraadt@.


# 1.34 02-Jun-2009 oga

Instead of the global hash table with the terrible hashfunction and a
global lock, switch the uvm object pages to being kept in a per-object
RB_TREE. Right now this is approximately the same speed, but cleaner.
When biglock usage is reduced this will improve concurrency due to lock
contention..

ok beck@ art@. Thanks to jasper for the speed testing.


# 1.33 01-Jun-2009 ariane

physmem allocator: change the view of free memory from single free pages
to free ranges.
Classify memory based on region with associated use-counter (which is used
to construct a priority list of where to allocate memory).

Based on code from tedu@, help from many.
Ok art@


# 1.32 04-May-2009 oga

Instead of keeping two ints in the uvm structure specifically just to
sleep on them (and otherwise ignore them) sleep on the pointer to the
{aiodoned,pagedaemon}_proc members, and nuke the two extra words.

"no objections" art@, ok beck@.


# 1.31 28-Apr-2009 miod

Revert pageqlock back from a mutex to a simple_lock, as it needs to be
recursive in some cases (mostly involving swapping). A proper fix is in
the works, but this will unbreak kernels for now.


# 1.30 14-Apr-2009 oga

The use of uvm.pagedaemon_lock is incredibly inconsistent. only a
fraction of the wakeups and sleeps involved here actually grab that
lock. The remainder, on the other hand, always have the fpageq_lock
locked.

So, make this locking correct by switching the other users over to
fpageq_lock, too.

This would probably be better off being a semaphore, but for now at
least it's correct.

"ok, unless you want to implement semaphores" art@


# 1.29 13-Apr-2009 oga

Convert the page queue lock to a mutex instead of a simplelock.

Fix up the one case of lock recursion (which blatantly ignored the
comment right above it saying that we don't need to lock). The rest of
the lock usage has been checked and appears to be correct.

ok ariane@.


# 1.28 06-Apr-2009 oga

Instead of doing splbio(); simple_lock(&uvm.aiodoned_lock); just replace
the simple lock with a real lock - a IPL_BIO mutex. While i'm here, make
the sleeping condition one hell of a lot simpler in the aio daemon.

some ideas from and ok art@.


# 1.27 26-Mar-2009 oga

Convert splvm() + simplelock(&uvm.hashlock); around the page hash table
into a IPL_VM blocking mutex, also slightly extend the locked area so
that it actually protects access to the page array (as the comment on
the lock declaration says it should).

ansify a few functions while i'm in the file.

"ok, even though you're sneaking in ansification in a diff. You dirty
you." art@


# 1.26 25-Mar-2009 oga

Move all of the pseudo-inline functions in uvm into C files.

By pseudo-inline, I mean that if a certain macro was defined, they would
be inlined. However, no architecture defines that, and none has for a
very very long time. Therefore mainly this just makes the code a damned
sight easier to read. Some k&r -> ansi declarations while I'm in there.

"just commit it" art@. ok weingart@.


Revision tags: OPENBSD_4_5_BASE
# 1.25 27-Jan-2009 miod

Get rid of the last traces of uvm.pager_[se]va


Revision tags: OPENBSD_4_4_BASE
# 1.24 09-Jun-2008 miod

Define a new flag, UVM_FLAG_HOLE, for uvm_map to create a vm_map_entry of
a new etype, UVM_ET_HOLE, meaning it has no backend.

UVM_ET_HOLE entries (which should be created as UVM_PROT_NONE and with
UVM_FLAG_NOMERGE and UVM_FLAG_HOLE) are skipped in uvm_unmap_remove(), so
that pmap_{k,}remove() is not called on the entry.

This is intended to save time, and behave better, on pmaps with MMU holes
at process exit time.

ok art@, kettenis@ provided feedback as well.


# 1.23 05-May-2008 thib

retire ltsleep(); The only refrence left too it is in an
ifdef netbsd block in drm code, but oga@ says he'll remove
it soon...

OK art@, oga@;


Revision tags: OPENBSD_4_3_BASE
# 1.22 29-Nov-2007 tedu

use a working mutex for the freepage list. ok art deraadt


Revision tags: OPENBSD_4_2_BASE
# 1.21 18-Jun-2007 pedro

Bring back Mickey's UVM anon change. Testing by thib@, beck@ and
ckuethe@ for a while. Okay beck@, "it is good timing" deraadt@.


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.20 13-Jul-2006 deraadt

Back out the anon change. Apparently it was tested by a few, but most of
us did not see it or get a chance to test it before it was commited. It
broke cvs, in the ami driver, making it not succeed at seeing it's devices.


# 1.19 21-Jun-2006 mickey

from netbsd: make anons dynamically allocated from pool.
this results in lesse kva waste due to static preallocation of those
for every phys page and also every swap page.
tested by beck krw miod


Revision tags: OPENBSD_3_9_BASE
# 1.18 16-Jan-2006 mickey

add another uvm histroy for physpage alloc/free and propagate a debugging pgfree check into pglist; no functional change for normal kernels; make histories uncommon


Revision tags: OPENBSD_3_4_BASE OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE SMP_SYNC_A SMP_SYNC_B UBC_SYNC_A
# 1.17 29-Mar-2003 mickey

ubchist is not a fully cooked kadaver and though use the other well formed pdhist one until ubc gaets back. art@ ok


Revision tags: OPENBSD_3_1_BASE OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_B
# 1.16 19-Dec-2001 art

UBC was a disaster. It worked very good when it worked, but on some
machines or some configurations or in some phase of the moon (we actually
don't know when or why) files disappeared. Since we've not been able to
track down the problem in two weeks intense debugging and we need -current
to be stable, back out everything to a state it had before UBC.

We apologise for the inconvenience.


Revision tags: UBC_BASE
# 1.15 28-Nov-2001 art

branches: 1.15.2;
Sync in more uvm from NetBSD. Mostly just cosmetic stuff.
Contains also support for page coloring.


# 1.14 10-Nov-2001 art

Merge in some parts of the ubc work that has been done in NetBSD that are not
UBC, but prerequsites for it.

- Create a daemon that processes async I/O (swap and paging in the future)
requests that need processing in process context and that were processed
in the pagedaemon before.
- Convert some ugly ifdef DIAGNOSTIC code to less intrusive KASSERTs.
- misc other cleanups.


# 1.13 05-Nov-2001 art

Minor sync to NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.12 11-Aug-2001 art

Various random fixes from NetBSD.
Including support for zeroing pages in the idle loop (not enabled yet).


# 1.11 18-Jul-2001 art

Unconfuse UVM_UNLOCK_AND_WAIT. From NetBSD.


# 1.10 23-Jun-2001 smart

Sync with NetBSD 19990911 (just before PMAP_NEW was required)
- thread_sleep_msg() -> uvm_sleep()
- initialize reference count lock in uvm_anon_{init,add}()
- add uao_flush()
- replace boolean 'islocked' with 'lockflags'
- in uvm_fault() change FALSE to TRUE to in 'wide' fault handling
- get rid of uvm_km_get()
- various bug fixes


Revision tags: OPENBSD_2_9_BASE
# 1.9 10-Apr-2001 niklas

Fix for machines which need to enlarge the kernel address space, at least
1GB i386 machines needs this. The fix is heavily based on Jason Thorpe's
found in NetBSD. Here is his original commit message:

Instead of checking vm_physmem[<physseg>].pgs to determine if
uvm_page_init() has completed, add a boolean uvm.page_init_done,
and test against that. Use this same boolean (rather than
pmap_initialized) in pmap_growkernel() to determine if we are
being called via uvm_page_init() to grow the kernel address space.

This fixes a problem on some i386 configurations where pmap_init()
itself was needing to have the kernel page table grown, and since
pmap_initialized was not yet set to TRUE, pmap_growkernel() was
choosing the wrong code path.


# 1.8 22-Mar-2001 smart

Sync style, typo, and comments a little closer to NetBSD. art@ ok


# 1.7 09-Mar-2001 smart

Protect protypes, certain macros, and inlines from userland. Checked userland
with a 'make build'. From NetBSD. art@ ok


# 1.6 29-Jan-2001 niklas

$OpenBSD$


Revision tags: OPENBSD_2_8_BASE
# 1.5 27-May-2000 provos

use rijndael instead of blowfish because of faster key setup.
break swap paritions into sections, each section has own
encryption key. if a section's key becomes unreferenced, erase it.


Revision tags: OPENBSD_2_7_BASE
# 1.4 15-Mar-2000 art

Fix the NetBSD id strings.


Revision tags: OPENBSD_2_6_BASE SMP_BASE kame_19991208
# 1.3 23-Aug-1999 art

branches: 1.3.4;
sync with NetBSD from 1999.05.24 (there is a reason for this date)
Mostly cleanups, but also a few improvements to pagedaemon for better
handling of low memory and/or low swap conditions.


Revision tags: OPENBSD_2_5_BASE
# 1.2 26-Feb-1999 art

add OpenBSD tags


# 1.1 26-Feb-1999 art

Import of uvm from NetBSD. Some local changes, some code disabled