History log of /freebsd-9.3-release/sys/sys/mman.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 258870 03-Dec-2013 jhb

MFC 253471,253620,254430,254538:
Change mmap() to more optimally use superpages and provide support for
tweaking alignment of virtual mappings.
- Add a new address space allocation method (VMFS_OPTIMAL_SPACE) for
vm_map_find() that will try to alter the alignment of a mapping to match
any existing superpage mappings of the object being mapped. If no
suitable address range is found with the necessary alignment,
vm_map_find() will fall back to using the simple first-fit strategy
(VMFS_ANY_SPACE).
- Change mmap() without MAP_FIXED, shmat(), shm_map(), and the GEM mapping
ioctl to use VMFS_OPTIMAL_SPACE instead of VMFS_ANY_SPACE.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n).
Requests for n >= number of bits in a pointer or less than the size of
a page fail with EINVAL. This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used
to optimize the chances of using large pages. By default it will align
the mapping on a large page boundary (the system is free to choose any
large page size to align to that seems best for the mapping request).
However, if the object being mapped is already using large pages, then
it will align the virtual mapping to match the existing large pages in
the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and
VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment.
MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while
MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than
explicitly using VMFS_SUPER_SPACE. All device objects are forced to
use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively
equivalent.

PR: ports/184173 (exp-run)


# 236698 06-Jun-2012 jhb

MFC 233760:
Export some more useful info about shared memory objects to userland
via procstat(1) and fstat(1):
- Change shm file descriptors to track the pathname they are associated
with and add a shm_path() method to copy the path out to a caller-supplied
buffer.
- Use the fo_stat() method of shared memory objects and shm_path() to
export the path, mode, and size of a shared memory object via
struct kinfo_file.
- Add a struct shmstat to the libprocstat(3) interface along with a
procstat_get_shm_info() to export the mode and size of a shared memory
object.
- Change procstat to always print out the path for a given object if it
is valid.
- Teach fstat about shared memory objects and to display their path,
mode, and size.


# 236683 06-Jun-2012 jhb

MFC 228509,228620,228533:
Add a helper API to allow in-kernel code to map portions of shared memory
objects created by shm_open(2) into the kernel's address space. This
provides a convenient way for creating shared memory buffers between
userland and the kernel without requiring custom character devices.


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 211937 28-Aug-2010 alc

Add the MAP_PREFAULT_READ option to mmap(2).

Reviewed by: jhb, kib


# 198973 06-Nov-2009 ed

Add MAP_ANONYMOUS.

Many operating systems also provide MAP_ANONYMOUS. It's not hard to
support this ourselves, we'd better add it to make it more likely for
applications to work out of the box.

Reviewed by: alc (mman.h)


# 197331 19-Sep-2009 alc

Add getpagesizes(3). This functions either the number of supported page
sizes or some number of the sizes themselves. It is functionally
compatible with a function by the same name under Solaris.

Reviewed by: jhb


# 177680 28-Mar-2008 ps

Add support to mincore for detecting whether a page is part of a
"super" page or not.

Reviewed by: alc, ups


# 175164 08-Jan-2008 jhb

Add a new file descriptor type for IPC shared memory objects and use it to
implement shm_open(2) and shm_unlink(2) in the kernel:
- Each shared memory file descriptor is associated with a swap-backed vm
object which provides the backing store. Each descriptor starts off with
a size of zero, but the size can be altered via ftruncate(2). The shared
memory file descriptors also support fstat(2). read(2), write(2),
ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared
memory file descriptors.
- shm_open(2) and shm_unlink(2) are now implemented as system calls that
manage shared memory file descriptors. The virtual namespace that maps
pathnames to shared memory file descriptors is implemented as a hash
table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash
of the pathname.
- As an extension, the constant 'SHM_ANON' may be specified in place of the
path argument to shm_open(2). In this case, an unnamed shared memory
file descriptor will be created similar to the IPC_PRIVATE key for
shmget(2). Note that the shared memory object can still be shared among
processes by sharing the file descriptor via fork(2) or sendmsg(2), but
it is unnamed. This effectively serves to implement the getmemfd() idea
bandied about the lists several times over the years.
- The backing store for shared memory file descriptors are garbage
collected when they are not referenced by any open file descriptors or
the shm_open(2) virtual namespace.

Submitted by: dillon, peter (previous versions)
Submitted by: rwatson (I based this on his version)
Reviewed by: alc (suggested converting getmemfd() to shm_open())


# 144531 02-Apr-2005 das

Namespace issues.


# 128680 27-Apr-2004 mux

Remove mlockall() and munlockall() from the list of unimplemented
syscalls in a comment, we have them since quite some time now.

Noticed by: Stefan Farfeleder <stefan@fafoe.narf.at>


# 127976 07-Apr-2004 imp

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# 118771 11-Aug-2003 bms

Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are*
necessary. This needed clarification; the stub code generation for
mlockall() was disabled, which would prevent applications from
linking to this API (suggested by mux)
- Giant has been quoshed. It is no longer held by the code, as
the required locking has been pushed down within vm_map.c.
- Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES
to express their intention explicitly.
- Inspected at the vmstat, top and vm pager sysctl stats level.
Paging-in activity is occurring correctly, using a test harness.
- The RES size for a process may appear to be greater than its SIZE.
This is believed to be due to mappings of the same shared library
page being wired twice. Further exploration is needed.
- Believed to back out of allocations and locks correctly
(tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).

PR: kern/43426, standards/54223
Reviewed by: jake, alc
Approved by: jake (mentor)
MFC after: 2 weeks


# 118684 09-Aug-2003 bms

Add the POSIX 1003.1-2001 posix_madvise() interface.

PR: standards/54634
Reviewed by: das
Approved by: jake (mentor)


# 112881 31-Mar-2003 wes

Add a facility allowing processes to inform the VM subsystem they are
critical and should not be killed when pageout is looking for more
memory pages in all the wrong places.

Reviewed by: arch@
Sponsored by: St. Bernard Software


# 103304 13-Sep-2002 mike

mlockall() and munlockall() are unimplemented; remove their prototypes.


# 102325 23-Aug-2002 mike

o Fix namespace issues in <sys/mman.h>.
o Move mode_t details from <sys/types.h> into <sys/_types.h>.
o Add primitives for sharing the mode_t and off_t typedefs.
o Add typedefs mode_t, off_t, and size_t to <sys/mman.h>.

PR: 21644


# 92719 19-Mar-2002 alfred

Remove __P


# 82295 24-Aug-2001 dillon

Add INHERIT_XXX defines for minherit() system call.
Remove MAP_INHERIT - it is no longer supported.


# 82285 24-Aug-2001 dillon

Remove MAP_NOEXTEND. It came from 4.4-lite and not only was never
implemented, but mmap()'s default behavior is *already* to not extend
files. Only write() or ftruncate() can extend a file.


# 57550 28-Feb-2000 ps

Add MAP_NOCORE to mmap(2), and MADV_NOCORE and MADV_CORE to madvise(2).
This
This feature allows you to specify if mmap'd data is included in
an application's corefile.

Change the type of eflags in struct vm_map_entry from u_char to
vm_eflags_t (an unsigned int).

Reviewed by: dillon,jdp,alfred
Approved by: jkh


# 55205 29-Dec-1999 peter

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.


# 54467 12-Dec-1999 dillon

Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to
madvise().

This feature prevents the update daemon from gratuitously flushing
dirty pages associated with a mapped file-backed region of memory. The
system pager will still page the memory as necessary and the VM system
will still be fully coherent with the filesystem. Modifications made
by other means to the same area of memory, for example by write(), are
unaffected. The feature works on a page-granularity basis.

MAP_NOSYNC allows one to use mmap() to share memory between processes
without incuring any significant filesystem overhead, putting it in
the same performance category as SysV Shared memory and anonymous memory.

Reviewed by: julian, alc, dg


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 43209 26-Jan-1999 julian

Mostly remove the VM_STACK OPTION.
This changes the definitions of a few items so that structures are the
same whether or not the option itself is enabled. This allows
people to enable and disable the option without recompilng the world.

As the author says:

|I ran into a problem pulling out the VM_STACK option. I was aware of this
|when I first did the work, but then forgot about it. The VM_STACK stuff
|has some code changes in the i386 branch. There need to be corresponding
|changes in the alpha branch before it can come out completely.

what is done:
|
|1) Pull the VM_STACK option out of the header files it appears in. This
|really shouldn't affect anything that executes with or without the rest
|of the VM_STACK patches. The vm_map_entry will then always have one
|extra element (avail_ssize). It just won't be used if the VM_STACK
|option is not turned on.
|
|I've also pulled the option out of vm_map.c. This shouldn't harm anything,
|since the routines that are enabled as a result are not called unless
|the VM_STACK option is enabled elsewhere.
|
|2) Add what appears to be appropriate code the the alpha branch, still
|protected behind the VM_STACK switch. I don't have an alpha machine,
|so we would need to get some testers with alpha machines to try it out.
|
|Once there is some testing, we can consider making the change permanent
|for both i386 and alpha.
|
[..]
|
|Once the alpha code is adequately tested, we can pull VM_STACK out
|everywhere.
|

Submitted by: "Richard Seaman, Jr." <dick@tar.com>


# 42360 06-Jan-1999 julian

Add (but don't activate) code for a special VM option to make
downward growing stacks more general.
Add (but don't activate) code to use the new stack facility
when running threads, (specifically the linux threads support).
This allows people to use both linux compiled linuxthreads, and also the
native FreeBSD linux-threads port.

The code is conditional on VM_STACK. Not using this will
produce the old heavily tested system.

Submitted by: Richard Seaman <dick@tar.com>


# 34925 28-Mar-1998 dufault

Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and
_KPOSIX_PRIORITY_SCHEDULING options to work. Changes:

Change all "posix4" to "p1003_1b". Misnamed files are left
as "posix4" until I'm told if I can simply delete them and add
new ones;

Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux;

Add man pages for _POSIX_PRIORITY_SCHEDULING system calls;

Add options to LINT;

Minor fixes to P1003_1B code during testing.


# 34319 08-Mar-1998 dufault

Reviewed by: bde

Changes to support building with _POSIX_SOURCE set to 199309L:

1. Add sys/_posix.h to handle those preprocessor defs that POSIX
says have effects when defined before including any header files;

2. Change POSIX4_VISIBLE back to _POSIX4_VISIBLE

3. Add _POSIX4_VISIBLE_HISTORICALLY for pre-existing BSD features now
defined in POSIX. These show up when:

_POSIX_SOURCE and _POSIX_C_SOURCE are not set or
_POSIX_C_SOURCE is set >= 199309L

and vanish when:

_POSIX_SOURCE is set or _POSIX_C_SOURCE is < 199309L.

4. Explain these in man 9 posix4;

5. Include _posix.h and conditionalize on new feature test.


# 34030 04-Mar-1998 dufault

Reviewed by: msmith, bde long ago
POSIX.4 headers and sysctl variables. Nothing should change
unless POSIX4 is defined or _POSIX_VERSION is set to 199309.


# 32131 30-Dec-1997 alex

Convert caddr_t --> void * for sys/mman.h functions.

mlock, mmap, mprotect, msync, munlock, and munmap are defined by
POSIX as taking void *. The const modifier has been added to
mlock, munlock, and mprotect as the standard dictates.

minherit comes from OpenBSD and has been updated to conform with
their recent change to void *.

madvise and mincore are not defined by POSIX, but their arguments
have been modified to be consistent with the POSIX-defined functions.
mincore takes a const pointer, but madvise does not due to the
MADV_FREE case.

Discussed with: bde


# 31497 02-Dec-1997 dyson

Define MS_SYNC for compatibility.


# 24896 13-Apr-1997 bde

#ifdef'ed the declaration of lseek() so that -Wredundant-decls doesn't
cause noise.

Duplicated the lseek() redeclaration hack for all functions involving
off_t's (ftruncate(), mmap() and truncate()) to help broken programs
work.


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 20346 11-Dec-1996 alex

POSIX.4 defines MAP_FAILED to be the error return from mmap().


# 15873 22-May-1996 dyson

Initial support for MADV_FREE, support for pages that we don't care
about the contents anymore. This gives us alot of the advantage of
freeing individual pages through munmap, but with almost none of the
overhead.


# 15819 19-May-1996 dyson

Initial support for mincore and madvise. Both are almost fully
supported, except madvise does not page in with MADV_WILLNEED, and
MADV_DONTNEED doesn't force dirty pages out.


# 14482 11-Mar-1996 hsu

Merge in Lite2: sync up comments.
Reviewed by: davidg & bde


# 14327 02-Mar-1996 peter

Remove redundant comment about the 'int len' variables that should be
changed to size_t's.


# 14323 02-Mar-1996 peter

Change madvise prototype from 'int len' to 'size_t len'. All the other
m* syscalls were prototyped as size_t already. Add missing mincore() and
minherit() prototypes, as suggested by bde.


# 12548 30-Nov-1995 se

Add definition of PROT_NONE=0 for compatibility with SunOS/Solaris/Linux ...

Reviewed by: julian


# 9507 13-Jul-1995 dg

NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct
proc or any VM system structure will have to be rebuilt!!!

Much needed overhaul of the VM system. Included in this first round of
changes:

1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages,
haspage, and sync operations are supported. The haspage interface now
provides information about clusterability. All pager routines now take
struct vm_object's instead of "pagers".

2) Improved data structures. In the previous paradigm, there is constant
confusion caused by pagers being both a data structure ("allocate a
pager") and a collection of routines. The idea of a pager structure has
escentially been eliminated. Objects now have types, and this type is
used to index the appropriate pager. In most cases, items in the pager
structure were duplicated in the object data structure and thus were
unnecessary. In the few cases that remained, a un_pager structure union
was created in the object to contain these items.

3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now
be removed. For instance, vm_object_enter(), vm_object_lookup(),
vm_object_remove(), and the associated object hash list were some of the
things that were removed.

4) simple_lock's removed. Discussion with several people reveals that the
SMP locking primitives used in the VM system aren't likely the mechanism
that we'll be adopting. Even if it were, the locking that was in the code
was very inadequate and would have to be mostly re-done anyway. The
locking in a uni-processor kernel was a no-op but went a long way toward
making the code difficult to read and debug.

5) Places that attempted to kludge-up the fact that we don't have kernel
thread support have been fixed to reflect the reality that we are really
dealing with processes, not threads. The VM system didn't have complete
thread support, so the comments and mis-named routines were just wrong.
We now use tsleep and wakeup directly in the lock routines, for instance.

6) Where appropriate, the pagers have been improved, especially in the
pager_alloc routines. Most of the pager_allocs have been rewritten and
are now faster and easier to maintain.

7) The pagedaemon pageout clustering algorithm has been rewritten and
now tries harder to output an even number of pages before and after
the requested page. This is sort of the reverse of the ideal pagein
algorithm and should provide better overall performance.

8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup
have been removed. Some other unnecessary casts have also been removed.

9) Some almost useless debugging code removed.

10) Terminology of shadow objects vs. backing objects straightened out.
The fact that the vm_object data structure escentially had this
backwards really confused things. The use of "shadow" and "backing
object" throughout the code is now internally consistent and correct
in the Mach terminology.

11) Several minor bug fixes, including one in the vm daemon that caused
0 RSS objects to not get purged as intended.

12) A "default pager" has now been created which cleans up the transition
of objects to the "swap" type. The previous checks throughout the code
for swp->pg_data != NULL were really ugly. This change also provides
the rudiments for future backing of "anonymous" memory by something
other than the swap pager (via the vnode pager, for example), and it
allows the decision about which of these pagers to use to be made
dynamically (although will need some additional decision code to do
this, of course).

13) (dyson) MAP_COPY has been deprecated and the corresponding "copy
object" code has been removed. MAP_COPY was undocumented and non-
standard. It was furthermore broken in several ways which caused its
behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will
continue to work correctly, but via the slightly different semantics
of MAP_PRIVATE.

14) (dyson) Sharing maps have been removed. It's marginal usefulness in a
threads design can be worked around in other ways. Both #12 and #13
were done to simplify the code and improve readability and maintain-
ability. (As were most all of these changes)

TODO:

1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing
this will reduce the vnode pager to a mere fraction of its current size.

2) Rewrite vm_fault and the swap/vnode pagers to use the clustering
information provided by the new haspage pager interface. This will
substantially reduce the overhead by eliminating a large number of
VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be
improved to provide both a "behind" and "ahead" indication of
contiguousness.

3) Implement the extended features of pager_haspage in swap_pager_haspage().
It currently just says 0 pages ahead/behind.

4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps
via a much more general mechanism that could also be used for disk
striping of regular filesystems.

5) Do something to improve the architecture of vm_object_collapse(). The
fact that it makes calls into the swap pager and knows too much about
how the swap pager operates really bothers me. It also doesn't allow
for collapsing of non-swap pager objects ("unnamed" objects backed by
other pagers).


# 8519 14-May-1995 nate

Prototype for madvise() is missing from sys/mman.h

Submitted by: Kai Vorma <vode@snakemail.hut.fi>


# 7363 25-Mar-1995 dg

Fixed msync() prototype.


# 7358 25-Mar-1995 dg

Added flags definitions for msync().


# 2165 21-Aug-1994 paul

Made them all idempotent.
Reviewed by:
Submitted by:


# 1822 02-Aug-1994 dg

Added MAP_FILE definition that does nothing - for backward source
compatibility.


# 1817 02-Aug-1994 dg

Added $Id$


# 1542 24-May-1994 rgrimes

This commit was generated by cvs2svn to compensate for changes in r1541,
which included commits to RCS files with non-trunk default branches.


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources