#
331722 |
|
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re)
|
#
330897 |
|
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg
|
#
320605 |
|
03-Jul-2017 |
markj |
MFC r309017 (partially): Omit v_cache_count when computing the number of free pages, since its value is always 0.
|
#
318716 |
|
23-May-2017 |
markj |
MFC r308474, r308691, r309203, r309365, r309703, r309898, r310720, r308489, r308706: Add PQ_LAUNDRY and remove PG_CACHED pages.
|
#
311047 |
|
02-Jan-2017 |
kib |
MFC r310616: Remove redundancy in vmtotal().
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
302063 |
|
21-Jun-2016 |
kib |
The vmtotal sysctl handler marks active vm objects to calculate statistics. Marking is done by setting the OBJ_ACTIVE flag. The flags change is locked, but the problem is that many parts of system assume that vm object initialization ensures that no other code could change the object, and thus performed lockless. The end result is corrupted flags in vm objects, most visible is spurious OBJ_DEAD flag, causing random hangs.
Avoid the active object marking, instead provide equally inexact but immutable is_object_alive() definition for the object mapped state.
Avoid iterating over the processes mappings altogether by using arguably improved definition of the paging thread as one which sleeps on the v_free_count.
PR: 204764 Diagnosed by: pho Tested by: pho (previous version) Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (gjb)
|
#
287640 |
|
11-Sep-2015 |
markj |
Remove the v_cache_min and v_cache_max sysctls. They are unused and have no effect.
Reviewed by: alc Sponsored by: EMC / Isilon Storage Division
|
#
282660 |
|
08-May-2015 |
jhb |
Place VM objects on the object list when created and never remove them. This is ok since objects come from a NOFREE zone and allows objects to be locked while traversing the object list without triggering a LOR.
Ensure that objects on the list are marked DEAD while free or stillborn, and that they have a refcount of zero. This required updating most of the pagers to explicitly mark an object as dead when deallocating it. (Only the vnode pager did this previously.)
Differential Revision: https://reviews.freebsd.org/D2423 Reviewed by: alc, kib (earlier version) MFC after: 2 weeks Sponsored by: Norse Corp, Inc.
|
#
263620 |
|
22-Mar-2014 |
bdrewery |
Rename global cnt to vm_cnt to avoid shadowing.
To reduce the diff struct pcu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI.
Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the the global cnt variable.
Exp-run revealed no ports using it directly.
No objection from: arch@ Sponsored by: EMC / Isilon Storage Division
|
#
248084 |
|
09-Mar-2013 |
attilio |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
246032 |
|
28-Jan-2013 |
zont |
- Add system wide page faults requiring I/O counter.
Reviewed by: alc MFC after: 2 weeks
|
#
244043 |
|
08-Dec-2012 |
alc |
In the past four years, we've added two new vm object types. Each time, similar changes had to be made in various places throughout the machine- independent virtual memory layer to support the new vm object type. However, in most of these places, it's actually not the type of the vm object that matters to us but instead certain attributes of its pages. For example, OBJT_DEVICE, OBJT_MGTDEVICE, and OBJT_SG objects contain fictitious pages. In other words, in most of these places, we were testing the vm object's type to determine if it contained fictitious (or unmanaged) pages.
To both simplify the code in these places and make the addition of future vm object types easier, this change introduces two new vm object flags that describe attributes of the vm object's pages, specifically, whether they are fictitious or unmanaged.
Reviewed and tested by: kib
|
#
228498 |
|
14-Dec-2011 |
eadler |
- The previous commit (r228449) accidentally moved the vm.stats.vm.* sysctls to vm.stats.sys. Move them back.
Noticed by: pho Reviewed by: bde (earlier version) Approved by: bz MFC after: 1 week Pointy hat to: me
|
#
228449 |
|
12-Dec-2011 |
eadler |
Document a large number of currently undocumented sysctls. While here fix some style(9) issues and reduce redundancy.
PR: kern/155491 PR: kern/155490 PR: kern/155489 Submitted by: Galimov Albert <wtfcrap@mail.ru> Approved by: bde Reviewed by: jhb MFC after: 1 week
|
#
219968 |
|
24-Mar-2011 |
jhb |
Fix some locking nits with the p_state field of struct proc: - Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL in fork to honor the locking requirements. While here, expand the scope of the PROC_LOCK() on the new process (p2) to avoid some LORs. Previously the code was locking the new child process (p2) after it had locked the parent process (p1). However, when locking two processes, the safe order is to lock the child first, then the parent. - Fix various places that were checking p_state against PRS_NEW without having the process locked to use PROC_LOCK(). Every place was already locking the process, just after the PRS_NEW check. - Remove or reduce the use of PROC_SLOCK() for places that were checking p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for reading the current state. - Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once.
MFC after: 1 week
|
#
218592 |
|
12-Feb-2011 |
jmallett |
Use CPU_FOREACH rather than expecting CPUs 0 through mp_ncpus-1 to be present. Don't micro-optimize the uniprocessor case; use the same loop there.
Submitted by: Bhanu Prakash Reviewed by: kib, jhb
|
#
217192 |
|
09-Jan-2011 |
kib |
Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h. Update the outdated comments describing MAXSLP and the process selection algorithm for swap out.
Comments wording and reviewed by: alc
|
#
195840 |
|
24-Jul-2009 |
jhb |
Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
#
187658 |
|
23-Jan-2009 |
jhb |
- Mark all standalone INT/LONG/QUAD sysctl's MPSAFE. This is done inside the SYSCTL() macros and thus does not need to be done for all of the nodes scattered across the source tree. - Mark the name-cache related sysctl's (including debug.hashstat.*) MPSAFE. - Mark vm.loadavg MPSAFE. - Remove GIANT_REQUIRED from vmtotal() (everything in this routine already has sufficient locking) and mark vm.vmtotal MPSAFE. - Mark the vm.stats.(sys|vm).* sysctls MPSAFE.
|
#
181887 |
|
19-Aug-2008 |
julian |
A bunch of formatting fixes brough to light by, or created by the Vimage commit a few days ago.
|
#
177368 |
|
19-Mar-2008 |
jeff |
- Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from requiring the per-process spinlock to only requiring the process lock. - Reflect these changes in the proc.h documentation and consumers throughout the kernel. This is a substantial reduction in locking cost for these fields and was made possible by recent changes to threading support.
|
#
177085 |
|
12-Mar-2008 |
jeff |
- Pass the priority argument from *sleep() into sleepq and down into sched_sleep(). This removes extra thread_lock() acquisition and allows the scheduler to decide what to do with the static boost. - Change the priority arguments to cv_* to match sleepq/msleep/etc. where 0 means no priority change. Catch -1 in cv_broadcastpri() and convert it to 0 for now. - Set a flag when sleeping in a way that is compatible with swapping since direct priority comparisons are meaningless now. - Add a sysctl to ule, kern.sched.static_boost, that defaults to on which controls the boost behavior. Turning it off gives better performance in some workloads but needs more investigation. - While we're modifying sleepq, change signal and broadcast to both return with the lock held as the lock was held on enter.
Reviewed by: jhb, peter
|
#
171633 |
|
27-Jul-2007 |
alc |
Add a counter for the total number of pages cached and support for reporting the value of this counter in the program "vmstat".
Approved by: re (rwatson)
|
#
171417 |
|
12-Jul-2007 |
alc |
Eliminate dead code.
Approved by: re (hrs)
|
#
170307 |
|
04-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
#
170170 |
|
31-May-2007 |
attilio |
Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
#
169667 |
|
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
164437 |
|
20-Nov-2006 |
ru |
Remove a redundant pointer-type variable.
|
#
164429 |
|
19-Nov-2006 |
ru |
When counting vm totals, skip unreferenced objects, including vnodes representing mounted file systems.
Reviewed by: alc MFC after: 3 days
|
#
160561 |
|
21-Jul-2006 |
alc |
Retire debug.mpsafevm. None of the architectures supported in CVS require it any longer.
|
#
160236 |
|
10-Jul-2006 |
alc |
Set debug.mpsafevm to true on PowerPC. (Now, by default, all architectures in CVS have debug.mpsafevm set to true.)
Tested by: grehan@
|
#
159475 |
|
10-Jun-2006 |
alc |
Enable debug.mpsafevm on arm by default.
Tested by: cognet@
|
#
159054 |
|
29-May-2006 |
tegge |
Close race between vmspace_exitfree() and exit1() and races between vmspace_exitfree() and vmspace_free() which could result in the same vmspace being freed twice.
Factor out part of exit1() into new function vmspace_exit(). Attach to vmspace0 to allow old vmspace to be freed earlier.
Add new function, vmspace_acquire_ref(), for obtaining a vmspace reference for a vmspace belonging to another process. Avoid changing vmspace refcount from 0 to 1 since that could also lead to the same vmspace being freed twice.
Change vmtotal() and swapout_procs() to use vmspace_acquire_ref().
Reviewed by: alc
|
#
146017 |
|
08-May-2005 |
marcel |
Enable debug_mpsafevm on ia64 due to the severe functional regression caused by recent locking changes when it's off. Revert the logic to trim down the conditional.
Clued-in by: alc@
|
#
144970 |
|
12-Apr-2005 |
jhb |
Tidy vcnt() by moving a duplicated line above #ifdef and removing a useless variable.
|
#
144635 |
|
04-Apr-2005 |
jhb |
Flip the switch and turn mpsafevm on by default for sparc64.
Approved by: alc
|
#
141696 |
|
11-Feb-2005 |
phk |
sysctl node vm.stats can not be static (for ia64 reasons).
|
#
141630 |
|
10-Feb-2005 |
phk |
Make three SYSCTL_NODEs static
|
#
139825 |
|
07-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
138981 |
|
17-Dec-2004 |
alc |
Enable debug.mpsafevm by default on alpha.
|
#
136404 |
|
11-Oct-2004 |
peter |
Put on my peril sensitive sunglasses and add a flags field to the internal sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app.
I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working.
With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go.
This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.
|
#
134747 |
|
04-Sep-2004 |
alc |
Enable debug.mpsafevm by default on amd64 and i386. This enables copy-on- write and zero-fill faults to run without holding Giant. It is still possible to disable Giant-free operation by setting debug.mpsafevm to 0 in loader.conf.
|
#
133807 |
|
16-Aug-2004 |
alc |
- Introduce and use a new tunable "debug.mpsafevm". At present, setting "debug.mpsafevm" results in (almost) Giant-free execution of zero-fill page faults. (Giant is held only briefly, just long enough to determine if there is a vnode backing the faulting address.)
Also, condition the acquisition and release of Giant around calls to pmap_remove() on "debug.mpsafevm".
The effect on performance is significant. On my dual Opteron, I see a 3.6% reduction in "buildworld" time.
- Use atomic operations to update several counters in vm_fault().
|
#
127961 |
|
06-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
Approved by: core
|
#
124083 |
|
02-Jan-2004 |
alc |
Avoid lock-order reversal between the vm object list mutex and the vm object mutex.
|
#
116226 |
|
11-Jun-2003 |
obrien |
Use __FBSDID().
|
#
113448 |
|
13-Apr-2003 |
alc |
Lock some manipulations of the vm object's flags.
|
#
109097 |
|
11-Jan-2003 |
dillon |
Make 'sysctl vm.vmtotal' work properly using updated patch from Hiten. (the patch in the PR was stale).
PR: kern/5689 Submitted by: Hiten Pandya <hiten@unixdaemons.com>
|
#
108585 |
|
03-Jan-2003 |
alc |
Add vm map and vm object locking to vmtotal().
|
#
108551 |
|
02-Jan-2003 |
alc |
Lock the vm object when performing vm_object_clear_flag().
|
#
108197 |
|
23-Dec-2002 |
alc |
Eliminate some dead code. (Any possible use for this code died with vm/vm_page.c revision 1.220.)
Submitted by: bde
|
#
108171 |
|
22-Dec-2002 |
dillon |
The UP -current was not properly counting the per-cpu VM stats in the sysctl code. This makes 'systat -vm 1's syscall count work again.
Submitted by: Michal Mertl <mime@traveller.cz> Note: also slated for 5.0
|
#
104387 |
|
02-Oct-2002 |
jhb |
Rename the mutex thread and process states to use a more generic 'LOCK' name instead. (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of TD_ON_MUTEX()) Eventually a turnstile abstraction will be added that will be shared with mutexes and other types of locks. SLOCK/TDI_LOCK will be used internally by the turnstile code and will not be specific to mutexes. Making the change now ensures that turnstiles can be dropped in at a later date without affecting the ABI of userland applications.
|
#
103216 |
|
11-Sep-2002 |
julian |
Completely redo thread states.
Reviewed by: davidxu@freebsd.org
|
#
99072 |
|
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
#
95112 |
|
20-Apr-2002 |
alc |
Reintroduce locking on accesses to vm_object_list.
|
#
93823 |
|
04-Apr-2002 |
dillon |
Embed a struct vmmeter in the per-cpu structure and add a macro, PCPU_LAZY_INC() which increments elements in it for cases where we can afford the occassional inaccuracy. Use of per-cpu stats counters avoids significant cache stalls in various critical paths that would otherwise severely limit our cpu scaleability.
Adjust all sysctl's accessing cnt.* elements to now use a procedure which aggregates the requested field for all cpus and for the global vmmeter.
The global vmmeter is retained, since some stats counters, like v_free_min, cannot be made per-cpu. Also, this allows us to convert counters from the global vmmeter to the per-cpu vmmeter in a piecemeal fashion, so have at it!
|
#
90538 |
|
11-Feb-2002 |
julian |
In a threaded world, differnt priorirites become properties of different entities. Make it so.
Reviewed by: jhb@freebsd.org (john baldwin)
|
#
85227 |
|
20-Oct-2001 |
iedowse |
Move the code that computes the system load average from vm_meter.c to kern_synch.c in preparation for adding some jitter to the inter-sample time.
Note that the "vm.loadavg" sysctl still lives in vm_meter.c which isn't the right place, but it is appropriate for the current (bad) name of that sysctl.
Suggested by: jhb (some time ago) Reviewed by: bde
|
#
84488 |
|
04-Oct-2001 |
iedowse |
Remove the SSLEEP case from the load average computation. This has been a no-op for as long as our CVS history goes back. Processes in state SSLEEP could only be counted if p_slptime == 0, but immediately before loadav() is called, schedcpu() has just incremented p_slptime on all SSLEEP processes.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
79242 |
|
04-Jul-2001 |
dillon |
whitespace / register cleanup
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
77582 |
|
01-Jun-2001 |
tmm |
Clean up the code exporting interrupt statistics via sysctl a bit: - move the sysctl code to kern_intr.c - do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine the length of the intrcnt array - move the declarations of intrnames, eintrnames, intrcnt and eintrcnt from machine-dependent include files to sys/interrupt.h - remove the hw.nintr sysctl, it is not needed. - fix various style bugs
Requested by: bde Reviewed by: bde (some time ago)
|
#
76827 |
|
18-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
#
76778 |
|
17-May-2001 |
jhb |
- Use a timeout for the tsleep in scheduler() instead of having vmmeter() wakeup proc0 by hand to enforce the timeout. - When swapping out a process, keep the process locked via the proc lock from the first checks up until we clear PS_INMEM and set PS_SWAPPING in swapout(). The swapout() function now must be called with the proc lock held and releases it before returning. - Comment out the code to attempt to lock a process' VM structures before swapping out. It is broken in that it releases the lock after obtaining it. If it does grab the lock, it needs to hand it off to swapout() instead of releasing it. This can be revisisted when the VM is locked as this is a valid test to perform. It also causes a lock order reversal for the time being, which is the immediate cause for temporarily disabling it.
|
#
76166 |
|
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
#
75523 |
|
15-Apr-2001 |
alfred |
use TAILQ_FOREACH, fix a comment's location
|
#
74927 |
|
28-Mar-2001 |
jhb |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
#
74670 |
|
23-Mar-2001 |
tmm |
Export intrnames and intrcnt as sysctls (hw.nintr, hw.intrnames and hw.intrcnt).
Approved by: rwatson
|
#
72376 |
|
11-Feb-2001 |
jake |
Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propogated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propogation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propogate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propogate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at apprioriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.
|
#
72200 |
|
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
#
71999 |
|
04-Feb-2001 |
phk |
Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details.
Created with: sed(1) Reviewed by: md5(1)
|
#
71572 |
|
24-Jan-2001 |
jhb |
- Catch up to proc flag changes. - Minimal proc locking. - Use queue macros.
|
#
71429 |
|
23-Jan-2001 |
ume |
Add mibs to hold the number of forks since boot. New mibs are:
vm.stats.vm.v_forks vm.stats.vm.v_vforks vm.stats.vm.v_rforks vm.stats.vm.v_kthreads vm.stats.vm.v_forkpages vm.stats.vm.v_vforkpages vm.stats.vm.v_rforkpages vm.stats.vm.v_kthreadpages
Submitted by: Paul Herman <pherman@frenchfries.net> Reviewed by: alfred
|
#
69947 |
|
12-Dec-2000 |
jake |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.
|
#
69022 |
|
22-Nov-2000 |
jake |
Protect the following with a lockmgr lock:
allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid
Reviewed by: jhb Obtained from: BSD/OS and netbsd
|
#
65904 |
|
15-Sep-2000 |
jhb |
- Add a new process flag P_NOLOAD that marks a process that should be ignored during load average calcuations. - Set this flag for the idle processes and the softinterrupt process.
|
#
65557 |
|
06-Sep-2000 |
jasone |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|
#
62622 |
|
05-Jul-2000 |
jhb |
Support for unsigned integer and long sysctl variables. Update the SYSCTL_LONG macro to be consistent with other integer sysctl variables and require an initial value instead of assuming 0. Update several sysctl variables to use the unsigned types.
PR: 15251 Submitted by: Kelly Yancey <kbyanc@posi.net>
|
#
62573 |
|
04-Jul-2000 |
phk |
Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by: bde
|
#
62454 |
|
03-Jul-2000 |
phk |
Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources:
-sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
|
#
58705 |
|
27-Mar-2000 |
charnier |
Revert spelling mistake I made in the previous commit Requested by: Alan and Bruce
|
#
58634 |
|
26-Mar-2000 |
charnier |
Spelling
|
#
52635 |
|
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
51337 |
|
17-Sep-1999 |
dillon |
Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
Replace various VM related page count calculations strewn over the VM code with inlines to aid in readability and to reduce fragility in the code where modules depend on the same test being performed to properly sleep and wakeup.
Split out a portion of the page deactivation code into an inline in vm_page.c to support vm_page_dontneed().
add vm_page_dontneed(), which handles the madvise MADV_DONTNEED feature in a related commit coming up for vm_map.c/vm_object.c. This code prevents degenerate cases where an essentially active page may be rotated through a subset of the paging lists, resulting in premature disposal.
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
46381 |
|
03-May-1999 |
billf |
Add sysctl descriptions to many SYSCTL_XXXs
PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)
|
#
43748 |
|
07-Feb-1999 |
dillon |
Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to attempt to optimize forks but were essentially given-up on due to problems and replaced with an explicit dup of the vm_map_entry structure. Prior to the removal, they were entirely unused.
|
#
42971 |
|
21-Jan-1999 |
dillon |
Objects associated with raw devices are no longer counted in the VM stats total because they may contain absurd numbers ( like the size of all of physical memory if you mmap() /dev/mem ).
|
#
42957 |
|
21-Jan-1999 |
dillon |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
40794 |
|
31-Oct-1998 |
peter |
Add John Dyson's SYSCTL descriptions, and an export of more stats to a sysctl hierarchy (vm.stats.*). SYSCTL descriptions are only present in source, they do not get compiled into the binaries taking up memory.
|
#
38517 |
|
24-Aug-1998 |
dfr |
Change various syscalls to use size_t arguments instead of u_int.
Add some overflow checks to read/write (from bde).
Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable.
Reviewed by: Bruce Evans <bde@zeta.org.au>
|
#
34961 |
|
30-Mar-1998 |
phk |
Eradicate the variable "time" from the kernel, using various measures. "time" wasn't a atomic variable, so splfoo() protection were needed around any access to it, unless you just wanted the seconds part.
Most uses of time.tv_sec now uses the new variable time_second instead.
gettime() changed to getmicrotime(0.
Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it).
A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random.
Add a new nfs_curusec() function.
Mark a couple of bogosities involving the now disappeard time variable.
Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passwd &time as args.
Change profiling in ncr.c to use ticks instead of time. Resolution is the same.
Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences.
Reviewed by: bde
|
#
34924 |
|
28-Mar-1998 |
bde |
Moved some #includes from <sys/param.h> nearer to where they are actually used.
|
#
31393 |
|
24-Nov-1997 |
bde |
Removed all traces of P_IDLEPROC. It was tested but never set.
|
#
28992 |
|
01-Sep-1997 |
bde |
Removed unused #includes.
|
#
25164 |
|
26-Apr-1997 |
peter |
Man the liferafts! Here comes the long awaited SMP -> -current merge!
There are various options documented in i386/conf/LINT, there is more to come over the next few days.
The kernel should run pretty much "as before" without the options to activate SMP mode.
There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment.
This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
21754 |
|
16-Jan-1997 |
dyson |
Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
18169 |
|
08-Sep-1996 |
dyson |
Addition of page coloring support. Various levels of coloring are afforded. The default level works with minimal overhead, but one can also enable full, efficient use of a 512K cache. (Parameters can be generated to support arbitrary cache sizes also.)
|
#
15809 |
|
18-May-1996 |
dyson |
This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:
More usage of the TAILQ macros. Additional minor fix to queue.h. Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm. Significant revamp of the pmap/fork code: 1) PTE's and UPAGES's are NO LONGER in the process's map. 2) PTE's and UPAGES's reside in their own objects. 3) TOTAL elimination of recursive page table pagefaults. 4) The page directory now resides in the PTE object. 5) Implemented pmap_copy, thereby speeding up fork time. 6) Changed the pv entries so that the head is a pointer and not an entire entry. 7) Significant cleanup of pmap_protect, and pmap_remove. 8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module. 9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed. Performance and code cleanups in vm_map: 1) Improved and simplified allocation of map entries. 2) Improved vm_map_copy code. 3) Corrected some minor problems in the simplify code. Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh. Improved the speed of and simplified kmem_malloc. Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existant condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.) Minor reorg of source to perhaps improve locality of ref.
|
#
14531 |
|
11-Mar-1996 |
hsu |
For Lite2: proc LIST changes. Reviewed by: davidg & bde
|
#
12820 |
|
14-Dec-1995 |
phk |
Another mega commit to staticize things.
|
#
12726 |
|
10-Dec-1995 |
bde |
Restored used includes of <vm/vm_extern.h>.
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12623 |
|
04-Dec-1995 |
phk |
A major sweep over the sysctl stuff.
Move a lot of variables home to their own code (In good time before xmas :-)
Introduce the string descrition of format.
Add a couple more functions to poke into these marvels, while I try to decide what the correct interface should look like.
Next is adding vars on the fly, and sysctl looking at them too.
Removed a tine bit of defunct and #ifdefed notused code in swapgeneric.
|
#
12286 |
|
14-Nov-1995 |
phk |
Move all the VM sysctl stuff home where it belongs.
|
#
9759 |
|
29-Jul-1995 |
bde |
Eliminate sloppy common-style declarations. There should be none left for the LINT configuation.
|
#
9507 |
|
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has escentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items.
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure escentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non- standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. It's marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintain- ability. (As were most all of these changes)
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
5464 |
|
10-Jan-1995 |
dg |
Fixed some formatting weirdness that I overlooked in the previous commit.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c vm_map.c Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c Implemented a seperate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
2689 |
|
12-Sep-1994 |
dg |
Eliminated a whole pile of ancient (we're taking 4.3BSD) VM system related #define constants. Corrected incorrect VM_MAX_KERNEL_ADDRESS.
Reviewed by: John Dyson
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|