#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
265554 |
|
07-May-2014 |
alc |
MFC r262338 When the kernel is running in a virtual machine, it cannot rely upon the processor family to determine if the workaround for AMD Family 10h Erratum 383 should be enabled. To enable virtual machine migration among a heterogeneous collection of physical machines, the hypervisor may have been configured to report an older processor family with a reduced feature set. Effectively, the reported processor family and its features are like a "least common denominator" for the collection of machines.
Therefore, when the kernel is running in a virtual machine, instead of relying upon the processor family, we now test for features that prove that the underlying processor is not affected by the erratum. (The features that we test for are unlikely to ever be emulated in software on an affected physical processor.)
PR: 186061
|
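The decision described above can be sketched as follows. This is a minimal illustration only: the struct, the field names, and the feature flag are hypothetical stand-ins, not the exact CPUID bits the kernel tests.

```c
/* Hedged sketch of the r262338 logic: in a VM, trust a feature bit that an
 * Erratum-383-affected Family 10h part cannot have, instead of trusting the
 * (possibly down-leveled) reported family. All names are illustrative. */
#include <stdbool.h>

struct cpu_info {
    bool in_vm;              /* running under a hypervisor */
    int  family;             /* reported processor family */
    bool has_newer_feature;  /* a feature absent on affected parts */
};

static bool
need_erratum383_workaround(const struct cpu_info *ci)
{
    if (!ci->in_vm)
        return (ci->family == 0x10);    /* bare metal: family is reliable */
    /* In a VM the family may be a "least common denominator"; rely on a
     * feature that proves the underlying CPU is not an affected part. */
    return (!ci->has_newer_feature);
}
```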
#
256207 |
|
09-Oct-2013 |
mav |
MFC r251703 (by attilio): - Add a BIT_FFS() macro and use it to replace cpusetffs_obj()
|
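A BIT_FFS()-style operation searches a multi-word bit set for its first set bit. The sketch below is modeled on the idea behind sys/bitset.h but uses illustrative names, not the kernel macros; it returns a 1-based index like ffs(3), or 0 for an empty set.

```c
/* Illustrative multi-word first-find-set, in the spirit of BIT_FFS(). */
#define BITSET_WORDS 4      /* e.g. enough for 256 CPUs with 64-bit longs */

struct demo_bitset {
    unsigned long bits[BITSET_WORDS];
};

/* 1-based ffs over a single word; 0 if the word is zero. */
static int
word_ffs(unsigned long w)
{
    int bit = 1;

    if (w == 0)
        return (0);
    while ((w & 1UL) == 0) {
        w >>= 1;
        bit++;
    }
    return (bit);
}

static int
demo_bitset_ffs(const struct demo_bitset *s)
{
    for (int w = 0; w < BITSET_WORDS; w++) {
        if (s->bits[w] != 0)
            return (w * (int)(sizeof(unsigned long) * 8) +
                word_ffs(s->bits[w]));
    }
    return (0);                 /* empty set */
}
```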
#
255811 |
|
23-Sep-2013 |
kib |
MFC r255607: In pmap_copy(), when the copied region is mapped with superpage but does not cover entire superpage, avoid copying.
MFC r255620: Merge the change r255607 from amd64 to i386.
|
#
251897 |
|
18-Jun-2013 |
scottl |
Merge the second part of the unmapped I/O changes. This enables the infrastructure in the block layer and UFS filesystem as well as a few drivers. The list of MFC revisions is long, so I won't quote changelogs.
r248508,248510,248511,248512,248514,248515,248516,248517,248518, 248519,248520,248521,248550,248568,248789,248790,249032,250936
Submitted by: kib Approved by: kib Obtained from: Netflix
|
#
248814 |
|
28-Mar-2013 |
kib |
MFC r248280: Add pmap function pmap_copy_pages().
|
#
248085 |
|
09-Mar-2013 |
marius |
MFC: r227309 (partial)
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
#
247547 |
|
01-Mar-2013 |
jhb |
MFC 245577,245640: Don't attempt to use clflush on the local APIC register window. Various CPUs exhibit bad behavior if this is done (Intel Errata AAJ3, hangs on Pentium-M, and trashing of the local APIC registers on a VIA C7). The local APIC is implicitly mapped UC already via MTRRs, so the clflush isn't necessary anyway.
|
#
242172 |
|
27-Oct-2012 |
kib |
MFC r242010: Add missed sched_pin().
|
#
241518 |
|
13-Oct-2012 |
alc |
MFC r241353, r241356, r241400 To avoid page table page corruption, change pmap_pv_reclaim()'s method of mapping page table pages.
Add some assertions that were helpful in debugging the page table page corruption.
|
#
240754 |
|
20-Sep-2012 |
alc |
MFC r240317 Simplify pmap_unmapdev().
Don't set PTE_W on the page table entry in pmap_kenter{,_attr}() on MIPS.
|
#
240151 |
|
05-Sep-2012 |
kib |
MFC r233122,r237086,r237228,r237264,r237290,r237404,r237414,r237513,r237551, r237592,r237604,r237623,r237684,r237733,r237813,r237855,r238124,r238126, r238163,r238414,r238610,r238889,r238970,r239072,r239137,r240126 (all by alc):
Add fine-grained PV chunk and list locking to the amd64 pmap, enabling concurrent execution of the following functions on different pmaps:
pmap_change_wiring()
pmap_copy()
pmap_enter()
pmap_enter_object()
pmap_enter_quick()
pmap_page_exists_quick()
pmap_page_is_mapped()
pmap_protect()
pmap_remove()
pmap_remove_pages()
Requested and approved by: alc
|
#
239996 |
|
01-Sep-2012 |
kib |
MFC r238792: Introduce curpcb magic variable.
|
#
238005 |
|
02-Jul-2012 |
alc |
MFC r236534 Various small changes to PV entry management.
|
#
237966 |
|
02-Jul-2012 |
alc |
MFC r235695, r236158, r236190, r236494 Replace all uses of the vm page queues lock by a r/w lock that is private to this pmap.c.
|
#
237952 |
|
02-Jul-2012 |
alc |
MFC r235912 There is no need for pmap_protect() to acquire the page queues lock unless it is going to access the pv lists or PMAP1/PADDR1.
|
#
237950 |
|
02-Jul-2012 |
alc |
MFC r233290, r235598, r235973, r236045, r236240, r236291, r236378 Change pv_entry_count to a long on amd64.
Rename pmap_collect() to pmap_pv_reclaim() and rewrite it such that it no longer uses the active and inactive paging queues.
Eliminate some purely stylistic differences among the amd64, i386 native, and i386 xen PV entry allocators.
Eliminate code duplication in free_pv_entry() and pmap_remove_pages() by introducing free_pv_chunk().
|
#
237949 |
|
02-Jul-2012 |
alc |
MFC r226843 Eliminate vestiges of page coloring in VM_ALLOC_NOOBJ calls to vm_page_alloc(). While I'm here, for the sake of consistency, always specify the allocation class, such as VM_ALLOC_NORMAL, as the first of the flags.
|
#
236123 |
|
26-May-2012 |
alc |
MFC r228513 Create large page mappings in pmap_map().
|
#
235884 |
|
24-May-2012 |
alc |
MFC r233433 Disable detailed PV entry accounting by default. Add a config option to enable it.
|
#
233775 |
|
02-Apr-2012 |
kib |
MFC r233168: If we ever allow for managed fictitious pages, the pages shall be excluded from superpage promotions. At least one of the reasons is that pv_table is sized for non-fictitious pages only.
Consistently check for the page to be non-fictitious before accessing the superpage pv list.
MFC note: erroneous chunks from r233168 which were fixed in r233185 are not included in the merge.
|
#
230435 |
|
21-Jan-2012 |
alc |
MFC r228923, r228935, and r229007 Eliminate many of the unnecessary differences between the native and paravirtualized pmap implementations for i386.
Fix a bug in the Xen pmap's implementation of pmap_extract_and_hold(): If the page lock acquisition is retried, then the underlying thread is not unpinned.
Wrap nearby lines that exceed 80 columns.
Merge r216333 and r216555 from the native pmap When r207410 eliminated the acquisition and release of the page queues lock from pmap_extract_and_hold(), it didn't take into account that pmap_pte_quick() sometimes requires the page queues lock to be held. This change reimplements pmap_extract_and_hold() such that it no longer uses pmap_pte_quick(), and thus never requires the page queues lock.
Merge r177525 from the native pmap Prevent the overflow in the calculation of the next page directory. The overflow causes the wraparound with consequent corruption of the (almost) whole address space mapping.
Strictly speaking, r177525 is not required by the Xen pmap because the hypervisor steals the uppermost region of the normal kernel address space. I am nonetheless merging it in order to reduce the number of unnecessary differences between the native and Xen pmap implementations.
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
225418 |
|
06-Sep-2011 |
kib |
Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into an atomic flags field. Updates to the atomic flags are performed using atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.
Document the changes to flags field to only require the page lock.
Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
|
#
224746 |
|
09-Aug-2011 |
kib |
- Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag to VPO_UNMANAGED (and also making the flag protected by the vm object lock, instead of the vm page queue lock).
- Mark the fake pages with both PG_FICTITIOUS (as it is now) and VPO_UNMANAGED. As a consequence, pmap code can now use just VPO_UNMANAGED to decide whether the page is unmanaged.
Reviewed by: alc Tested by: pho (x86, previous version), marius (sparc64), marcel (arm, ia64, powerpc), ray (mips) Sponsored by: The FreeBSD Foundation Approved by: re (bz)
|
#
223758 |
|
04-Jul-2011 |
attilio |
With retirement of cpumask_t and usage of cpuset_t for representing a mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient.
Remove them and replace their usage with custom pc_cpuid magic (as, atm, pc_cpumask can be easily represented by (1 << pc_cpuid) and pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).
This change is not targeted for MFC because of the struct pcpu member removal and the dependency on cpumask_t retirement.
MD review by: marcel, marius, alc Tested by: pluknet MD testing by: marcel, marius, gonzo, andreast
|
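The replacement identities mentioned in this entry can be shown in a few lines. This is a sketch with an illustrative single-word mask type; the kernel uses cpuset_t and per-CPU data, not a plain unsigned long.

```c
/* Illustration of deriving the per-CPU masks from pc_cpuid on the fly:
 * pc_cpumask == (1 << pc_cpuid), pc_other_cpus == all_cpus & ~pc_cpumask. */
typedef unsigned long cpumask_demo_t;

static cpumask_demo_t
cpu_self_mask(int cpuid)
{
    return ((cpumask_demo_t)1 << cpuid);            /* stands in for pc_cpumask */
}

static cpumask_demo_t
cpu_other_mask(cpumask_demo_t all_cpus, int cpuid)
{
    return (all_cpus & ~cpu_self_mask(cpuid));      /* stands in for pc_other_cpus */
}
```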
#
223732 |
|
02-Jul-2011 |
alc |
When iterating over a paging queue, explicitly check for PG_MARKER, instead of relying on zeroed memory being interpreted as an empty PV list.
Reviewed by: kib
|
#
223677 |
|
29-Jun-2011 |
alc |
Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages.
This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages.
Update all of the existing assertions on pmap_remove_all() to reflect this change.
Reviewed by: kib
|
#
222813 |
|
07-Jun-2011 |
attilio |
Retire the cpumask_t type and replace it with cpuset_t usage.
This is intended to fix the bug where cpu mask objects are capped to 32. MAXCPU, then, can now be arbitrarily bumped to whatever value. Anyway, as long as several structures in the kernel are statically allocated and sized by MAXCPU, it is suggested to keep it as low as possible for the time being.
Technical notes on this commit itself:
- More functions to handle cpuset_t objects are introduced. The most notable are cpusetobj_ffs() (which calculates ffs(3) for a cpuset_t object), cpusetobj_strprint() (which prepares a string representing a cpuset_t object) and cpusetobj_strscan() (which creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are targeted for removal soon. With the move from cpumask_t to cpuset_t they are now inefficient and not really useful. For the time being, please note that access to pcpu data is protected by sched_pin() in order to avoid migrating the CPU while reading more than one (possible) word.
- Please note that the size of cpuset_t objects may differ between kernel and userland. While this is not directly related to the patch itself, it is good to understand that concept and possibly use the patch as a reference on how to deal with cpuset_t objects in userland, when accessing kernel members.
- KTR_CPUMASK is changed and is now represented through a string, to be set as in the example reported in NOTES.
Additionally, note that MAXCPU is not bumped by this patch, but private testing has been done up to MAXCPU=128 on a real 8x8x2 (HTT) machine (amd64).
Please note that the FreeBSD version is not yet bumped because of the upcoming pcpu changes. However, note that this patch is not targeted for MFC.
People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested several revisions of the patches and really helped in improving the stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted reviews of some specific parts of the patch.
- marius, art, nwhitehorn and andreast reviewed the MD specific parts of the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific implementations of the patch.
- Other people have made contributions to other patches that have already been committed and have been listed separately.
Companies that should be mentioned for having participated to various degrees:
- Yahoo! for having offered the machines used for testing on big counts of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance, which has been instrumental.
- Sandvine for having offered offices and infrastructure during development.
(I really hope I didn't forget anyone, if it happened I apologize in advance).
|
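A cpusetobj_strprint()-style formatter can be sketched as below: render a multi-word set as comma-separated hex words, most significant word first. This is an assumption-laden illustration; the exact format and word count the kernel uses may differ.

```c
/* Illustrative multi-word CPU set -> string formatter, in the spirit of
 * cpusetobj_strprint(); NWORDS and the output format are assumptions. */
#include <stdio.h>
#include <string.h>     /* for callers comparing the output */

#define NWORDS 2

static char *
demo_cpuset_str(char *buf, size_t bufsz, const unsigned long *words)
{
    size_t off = 0;

    /* Most significant word first, words separated by commas. */
    for (int i = NWORDS - 1; i >= 0; i--)
        off += (size_t)snprintf(buf + off, bufsz - off, "%lx%s",
            words[i], i > 0 ? "," : "");
    return (buf);
}
```

A matching cpusetobj_strscan() would simply parse the same comma-separated hex words back into the array.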
#
220803 |
|
18-Apr-2011 |
kib |
Make pmap_invalidate_cache_range() available for consumption on amd64.
Add pmap_invalidate_cache_pages() method on x86. It flushes the CPU cache for the set of pages, which are not necessarily mapped. Since its supposed use is to prepare the move of the pages' ownership to a device that does not snoop all CPU accesses to the main memory (read: GPU in GMCH), do not rely on the CPU self-snoop feature.
amd64 implementation takes advantage of the direct map. On i386, extract the helper pmap_flush_page() from pmap_page_set_memattr(), and use it to make a temporary mapping of the flushed page.
Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
#
217688 |
|
21-Jan-2011 |
pluknet |
Make the MSGBUF_SIZE kernel option a loader tunable, kern.msgbufsize.
Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe
|
#
216555 |
|
19-Dec-2010 |
alc |
Redo some parts of r216333, specifically, the locking changes to pmap_extract_and_hold(), and undo the rest. In particular, I forgot that PG_PS and PG_PTE_PAT are the same bit.
|
#
216516 |
|
18-Dec-2010 |
kib |
In pmap_extract(), unlock pmap lock earlier. The calculation does not need the lock when operating on local variables.
Reviewed by: alc
|
#
216333 |
|
09-Dec-2010 |
alc |
When r207410 eliminated the acquisition and release of the page queues lock from pmap_extract_and_hold(), it didn't take into account that pmap_pte_quick() sometimes requires the page queues lock to be held. This change reimplements pmap_extract_and_hold() such that it no longer uses pmap_pte_quick(), and thus never requires the page queues lock.
For consistency, adopt the same idiom as used by the new implementation of pmap_extract_and_hold() in pmap_extract() and pmap_mincore(). It also happens to make these functions shorter.
Fix a style error in pmap_pte().
Reviewed by: kib@
|
#
215754 |
|
23-Nov-2010 |
jkim |
Remove a stale tunable introduced in r215703.
|
#
215703 |
|
22-Nov-2010 |
jkim |
- Disable caches and flush caches/TLBs when we update PAT, as we do for MTRRs. Flushing TLBs is required to ensure cache coherency according to the AMD64 architecture manual. Flushing caches is only required when changing from a cacheable memory type (WB, WP, or WT) to an uncacheable type (WC, UC, or UC-). Since this function is only used once per processor during startup, there is no need to take any shortcuts.
- Leave PAT indices 0-3 at the defaults of WB, WT, UC-, and UC. Program index 5 as WP (from the default WT) and index 6 as WC (from the default UC-). Leave 4 and 7 at the defaults of WB and UC. This avoids transitions from a cacheable memory type to an uncacheable type, to minimize possible cache incoherency. Since we now flush caches and TLBs, this change may no longer be necessary, but we do not want to take any chances.
- Remove the Apple hardware specific quirks. With the above changes, it seems this hack is no longer needed.
- Improve pmap_cache_bits() with an array that maps PAT memory type to index. This array is initialized early from pmap_init_pat(), so that we no longer need to handle special cases in the function. Now this function is identical on both amd64 and i386.
Reviewed by: jhb Tested by: RM (reuf_m at hotmail dot com) Ryszard Czekaj (rychoo at freeshell dot net) army.of.root (army dot of dot root at googlemail dot com) MFC after: 3 days
|
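The table-driven pmap_cache_bits() idea can be sketched as below. The enum, the function names, and the returned values are illustrative only; they mirror the PAT layout described in the entry (indices 0-3 left at defaults, 5 reprogrammed to WP, 6 to WC), not the kernel's actual encoding of the PTE bits.

```c
/* Hedged sketch: map a memory type to a PAT index via a table that is
 * initialized once, instead of special-casing in the lookup function. */
enum mem_type { MT_WB, MT_WT, MT_UC_MINUS, MT_UC, MT_WP, MT_WC, MT_NTYPES };

static int pat_index[MT_NTYPES];

static void
demo_pat_init(void)
{
    pat_index[MT_WB] = 0;
    pat_index[MT_WT] = 1;
    pat_index[MT_UC_MINUS] = 2;
    pat_index[MT_UC] = 3;
    pat_index[MT_WP] = 5;       /* reprogrammed from the default WT */
    pat_index[MT_WC] = 6;       /* reprogrammed from the default UC- */
}

static int
demo_cache_bits(enum mem_type mt)
{
    /* No special cases needed here once the table is initialized. */
    return (pat_index[mt]);
}
```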
#
214938 |
|
07-Nov-2010 |
alc |
Eliminate a possible race between pmap_pinit() and pmap_kenter_pde() on superpage promotion or demotion.
Micro-optimize pmap_kenter_pde().
Reviewed by: kib, jhb (an earlier version) MFC after: 1 week
|
#
213455 |
|
05-Oct-2010 |
alc |
Initialize KPTmap in locore so that vm86.c can call vtophys() (or really pmap_kextract()) before pmap_bootstrap() is called.
Document the set of pmap functions that may be called before pmap_bootstrap() is called.
Tested by: bde@ Reviewed by: kib@ Discussed with: jhb@ MFC after: 6 weeks
|
#
211424 |
|
17-Aug-2010 |
gahr |
- The iMac9,1 needs the PAT workaround as well
Approved by: cognet
|
#
211149 |
|
10-Aug-2010 |
attilio |
Fix some places that should use cpumask_t but still use 'int' types. While there, also fix some places that assume the CPU id type is 'int' when u_int is really meant.
Note: this will also fix some possible races in per-CPU data accesses, to be addressed in further commits.
In collaboration with: Yahoo! Incorporated (via sbruno and peter) Tested by: gianni MFC after: 1 month
|
#
209887 |
|
10-Jul-2010 |
alc |
Reduce the number of global TLB shootdowns generated by pmap_qenter(). Specifically, teach pmap_qenter() to recognize the case when it is being asked to replace a mapping with the very same mapping and not generate a shootdown. Unfortunately, the buffer cache commonly passes an entire buffer to pmap_qenter() when only a subset of the mappings are changing. For the extension of buffers in allocbuf() this was resulting in unnecessary shootdowns. The addition of new pages to the end of the buffer need not and did not trigger a shootdown, but overwriting the initial mappings with the very same mappings was seen as a change that necessitated a shootdown. With this change, that is no longer so.
For a "buildworld" on amd64, this change eliminates 14-15% of the pmap_invalidate_range() shootdowns, and about 4% of the overall shootdowns.
MFC after: 3 weeks
|
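The skip-identical-mapping idea above can be sketched in a few lines. PTEs are modeled as plain integers and the function name is illustrative; the real pmap_qenter() also builds the new PTEs from vm_page pointers and issues the actual invalidation.

```c
/* Hedged sketch of the r240317-era pmap_qenter() optimization: only
 * rewrite PTEs that actually change, and request a TLB shootdown only
 * if at least one PTE differed. */
#include <stdbool.h>
#include <stddef.h>

static bool
demo_qenter_range(unsigned long *ptes, const unsigned long *new_ptes, size_t n)
{
    bool need_shootdown = false;

    for (size_t i = 0; i < n; i++) {
        if (ptes[i] != new_ptes[i]) {   /* skip identical mappings */
            ptes[i] = new_ptes[i];
            need_shootdown = true;
        }
    }
    return (need_shootdown);
}
```

Re-entering a buffer with the very same mappings then costs no shootdown at all, which is exactly the allocbuf() extension case the entry describes.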
#
209862 |
|
09-Jul-2010 |
kib |
For both the i386 and amd64 pmap:
- change the type of pm_active to cpumask_t, which it is;
- in pmap_remove_pages(), compare with PCPU(curpmap), instead of dereferencing the long chain of pointers [1].
For the amd64 pmap, remove the unneeded checks for validity of curpmap in pmap_activate(), since curpmap should always be valid after r209789.
Submitted by: alc [1] Reviewed by: alc MFC after: 3 weeks
|
#
209048 |
|
11-Jun-2010 |
alc |
Relax one of the new assertions in pmap_enter() a little. Specifically, allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)
|
#
208990 |
|
10-Jun-2010 |
alc |
Reduce the scope of the page queues lock and the number of PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment.
Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this.
Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue.
Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.)
Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively.
ARM:
Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH().
Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases.
PowerPC/AIM:
moea*_page_exists_quick() and moea*_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated.
Push down the page queues lock inside of moea*_clear_bit(), simplifying moea*_clear_modify() and moea*_clear_reference().
The last parameter to moea*_clear_bit() is never used. Eliminate it.
PowerPC/BookE:
Simplify mmu_booke_page_exists_quick()'s control flow.
Reviewed by: kib@
|
#
208765 |
|
03-Jun-2010 |
alc |
In the unlikely event that pmap_ts_referenced() demoted five superpage mappings to the same underlying physical page, the calling thread would be left forever pinned to the same processor.
MFC after: 3 days
|
#
208667 |
|
31-May-2010 |
alc |
Eliminate a stale comment.
|
#
208657 |
|
30-May-2010 |
alc |
Simplify the inner loop of pmap_collect(): While iterating over the page's pv list, there is no point in checking whether or not the pv list is empty. Instead, wait until the loop completes.
|
#
208645 |
|
29-May-2010 |
alc |
When I pushed down the page queues lock into pmap_is_modified(), I created an ordering dependence: A pmap operation that clears PG_WRITEABLE and calls vm_page_dirty() must perform the call first. Otherwise, pmap_is_modified() could return FALSE without acquiring the page queues lock because the page is not (currently) writeable, and the caller to pmap_is_modified() might believe that the page's dirty field is clear because it has not seen the effect of the vm_page_dirty() call.
When I pushed down the page queues lock into pmap_is_modified(), I overlooked one place where this ordering dependence is violated: pmap_enter(). In a rare situation pmap_enter() can be called to replace a dirty mapping to one page with a mapping to another page. (I say rare because replacements generally occur as a result of a copy-on-write fault, and so the old page is not dirty.) This change delays clearing PG_WRITEABLE until after vm_page_dirty() has been called.
Fixing the ordering dependency also makes it easy to introduce a small optimization: When pmap_enter() used to replace a mapping to one page with a mapping to another page, it freed the pv entry for the first mapping and later called the pv entry allocator for the new mapping. Now, pmap_enter() attempts to recycle the old pv entry, saving two calls to the pv entry allocator.
There is no point in setting PG_WRITEABLE on unmanaged pages, so don't. Update a comment to reflect this.
Tidy up the variable declarations at the start of pmap_enter().
|
#
208609 |
|
28-May-2010 |
alc |
Defer freeing any page table pages in pmap_remove_all() until after the page queues lock is released. This may reduce the amount of time that the page queues lock is held by pmap_remove_all().
|
#
208574 |
|
26-May-2010 |
alc |
Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock.
Assert that the page is managed in pmap_is_referenced().
On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock.
Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment.
Correct a spelling error in vm_page_dontneed().
Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable.
Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked.
Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty().
Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions.
Reviewed by: kib
|
#
208504 |
|
24-May-2010 |
alc |
Roughly half of a typical pmap_mincore() implementation is machine-independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore().
Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page.
Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information.
Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock.
Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed().
Reviewed by: kib (an earlier version)
|
#
208175 |
|
16-May-2010 |
alc |
On entry to pmap_enter(), assert that the page is busy. While I'm here, make the style of assertion used by pmap_enter() consistent across all architectures.
On entry to pmap_remove_write(), assert that the page is neither unmanaged nor fictitious, since we cannot remove write access to either kind of page.
With the push down of the page queues lock, pmap_remove_write() cannot condition its behavior on the state of the PG_WRITEABLE flag if the page is busy. Assert that the object containing the page is locked. This allows us to know that the page will neither become busy nor will PG_WRITEABLE be set on it while pmap_remove_write() is running.
Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly do copy-on-write-based zero-copy transmit on unmanaged or fictitious pages, so don't even try. Previously, the call to pmap_remove_write() would have failed silently.
|
#
207796 |
|
08-May-2010 |
alc |
Push down the page queues lock into vm_page_cache(), vm_page_try_to_cache(), and vm_page_try_to_free(). Consequently, push down the page queues lock into pmap_enter_quick(), pmap_page_wired_mappings(), pmap_remove_all(), and pmap_remove_write().
Push down the page queues lock into Xen's pmap_page_is_mapped(). (I overlooked the Xen pmap in r207702.)
Switch to a per-processor counter for the total number of pages cached.
|
#
207702 |
|
06-May-2010 |
alc |
Push down the page queues lock inside of vm_page_free_toq() and pmap_page_is_mapped() in preparation for removing page queues locking around calls to vm_page_free(). Setting aside the assertion that calls pmap_page_is_mapped(), vm_page_free_toq() now acquires and holds the page queues lock just long enough to actually add or remove the page from the paging queues.
Update vm_page_unhold() to reflect the above change.
|
#
207410 |
|
29-Apr-2010 |
kmacy |
On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps.
Supported by: Bitgravity Inc.
Discussed with: alc, jeffr, and kib
|
#
207205 |
|
25-Apr-2010 |
alc |
Clearing a page table entry's accessed bit (PG_A) and setting the page's PG_REFERENCED flag in pmap_protect() can't really be justified. In contrast to pmap_remove() or pmap_remove_all(), the mapping is not being destroyed, so the notion that the page was accessed is not lost. Moreover, clearing the page table entry's accessed bit and setting the page's PG_REFERENCED flag can throw off the page daemon's activity count calculation. Finally, in my tests, I found that 15% of the atomic memory operations being performed by pmap_protect() were only to clear PG_A, and not change protection. This could, by itself, be fixed, but I don't see the point given the above argument.
Remove a comment from pmap_protect_pde() that is no longer meaningful after the above change.
|
#
207163 |
|
24-Apr-2010 |
kmacy |
- fix style issues on i386 as well
requested by: alc@
|
#
207155 |
|
24-Apr-2010 |
alc |
Resurrect pmap_is_referenced() and use it in mincore(). Essentially, pmap_ts_referenced() is not always appropriate for checking whether or not pages have been referenced because it clears any reference bits that it encounters. For example, in mincore(), clearing the reference bits has two negative consequences. First, it throws off the activity count calculations performed by the page daemon. Specifically, a page on which mincore() has called pmap_ts_referenced() looks less active to the page daemon than it should. Consequently, the page could be deactivated prematurely by the page daemon. Arguably, this problem could be fixed by having mincore() duplicate the activity count calculation on the page. However, there is a second problem for which that is not a solution. In order to clear a reference on a 4KB page, it may be necessary to demote a 2/4MB page mapping. Thus, a mincore() by one process can have the side effect of demoting a superpage mapping within another process!
|
#
205778 |
|
27-Mar-2010 |
alc |
Correctly handle preemption of pmap_update_pde_invalidate().
X-MFC after: r205573
|
#
205773 |
|
27-Mar-2010 |
alc |
Simplify pmap_growkernel(), making the i386 version more like the amd64 version.
MFC after: 3 weeks
|
#
205652 |
|
25-Mar-2010 |
alc |
A ptrace(2) by one process may trigger a promotion in the address space of another process. Modify pmap_promote_pde() to handle this. (This is not a problem on amd64 due to implementation differences.)
Reported by: jh@ MFC after: 1 week
|
#
205573 |
|
24-Mar-2010 |
alc |
Adapt r204907 and r205402, the amd64 implementation of the workaround for AMD Family 10h Erratum 383, to i386.
Enable machine check exceptions by default, just like r204913 for amd64.
Enable superpage promotion only if the processor actually supports large pages, i.e., PG_PS.
MFC after: 2 weeks
|
#
205334 |
|
19-Mar-2010 |
avg |
pmap amd64/i386: fix a typo in a comment
MFC after: 3 days
|
#
205063 |
|
12-Mar-2010 |
jhb |
Fix the previous attempt to fix kernel builds of HEAD on 7.x. Use the __gnu_inline__ attribute for PMAP_INLINE when using the 7.x compiler to match what 7.x uses for PMAP_INLINE.
|
#
204957 |
|
10-Mar-2010 |
kib |
Fall back to wbinvd when region for CLFLUSH is >= 2MB.
Submitted by: Kevin Day <toasty dragondata com> Reviewed by: jhb MFC after: 2 weeks
|
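The cutoff logic in this entry is simple enough to sketch. Only the method selection is shown; the flush operations themselves are privileged instructions and are stubbed out. The cutoff constant follows the commit message; the names are illustrative.

```c
/* Hedged sketch of the r204957 fallback: flush small ranges line by line
 * with CLFLUSH, but use a full-cache WBINVD once the region reaches 2MB,
 * where per-line flushing stops paying off. */
#include <stddef.h>

#define WBINVD_CUTOFF (2UL * 1024 * 1024)   /* 2MB, per the commit */

enum flush_method { FLUSH_CLFLUSH, FLUSH_WBINVD };

static enum flush_method
pick_flush_method(size_t len)
{
    return (len >= WBINVD_CUTOFF ? FLUSH_WBINVD : FLUSH_CLFLUSH);
}
```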
#
204420 |
|
27-Feb-2010 |
alc |
When running as a guest operating system, the FreeBSD kernel must assume that the virtual machine monitor has enabled machine check exceptions. Unfortunately, on AMD Family 10h processors the machine check hardware has a bug (Erratum 383) that can result in a false machine check exception when a superpage promotion occurs. Thus, I am disabling superpage promotion when the FreeBSD kernel is running as a guest operating system on an AMD Family 10h processor.
Reviewed by: jhb, kib MFC after: 3 days
|
#
204041 |
|
18-Feb-2010 |
ed |
Allow the pmap code to be built with GCC from FreeBSD 7 again.
This patch basically gives us the best of both worlds. Instead of forcing the compiler to emulate GNU-style inline semantics even though we're using ISO C99, it will only use GNU-style inlining when the compiler is configured that way (__GNUC_GNU_INLINE__).
Tested by: jhb
|
#
203351 |
|
01-Feb-2010 |
alc |
Change the default value for the flag enabling superpage mapping and promotion to "on".
|
#
203085 |
|
27-Jan-2010 |
alc |
Optimize pmap_demote_pde() by using the new KPTmap to access a kernel page table page instead of creating a temporary mapping to it.
Set the PG_G bit on the page table entries that implement the KPTmap.
Locore initializes the unused portions of the NKPT kernel page table pages that it allocates to zero. So, pmap_bootstrap() needn't zero the page table entries referenced by CMAP1 and CMAP3.
Simplify pmap_set_pg().
MFC after: 10 days
|
#
202894 |
|
23-Jan-2010 |
alc |
Handle a race between pmap_kextract() and pmap_promote_pde(). This race is known to cause a kernel crash in ZFS on i386 when superpage promotion is enabled.
Tested by: netchild MFC after: 1 week
|
#
202628 |
|
19-Jan-2010 |
ed |
Recommit r193732:
Remove __gnu89_inline.
Now that we use C99 almost everywhere, just use C99-style in the pmap code. Since the pmap code is the only consumer of __gnu89_inline, remove it from cdefs.h as well. Because the flag was only introduced 17 months ago, I don't expect any problems.
Reviewed by: alc
It was backed out, because it prevented us from building kernels using a 7.x compiler. Now that most people use 8.x, there is nothing that holds us back. Even if people run 7.x, they should be able to build a kernel if they run `make kernel-toolchain' or `make buildworld' first.
|
#
202085 |
|
11-Jan-2010 |
alc |
Simplify pmap_init(). Additionally, correct a harmless misbehavior on i386. Specifically, where locore had created large page mappings for the kernel, the wrong vm page array entries were being initialized. The vm page array entries for the pages containing the kernel were being initialized instead of the vm page array entries for page table pages.
MFC after: 1 week
|
#
201934 |
|
09-Jan-2010 |
alc |
Long ago, in r120654, the rounding of KERNend and physfree in locore was changed from a small page boundary to a large page boundary. As a consequence pmap_kmem_choose() became a pointless waste of address space. Eliminate it.
|
#
201751 |
|
07-Jan-2010 |
alc |
Make pmap_set_pg() static.
|
#
199184 |
|
11-Nov-2009 |
avg |
reflect that pg_ps_enabled is a tunable, not just a read-only sysctl
Nod from: jhb
|
#
198341 |
|
21-Oct-2009 |
marcel |
o Introduce vm_sync_icache() for making the I-cache coherent with the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache() that translates the vm_map_t argument to pmap_t.
o Introduce pmap_sync_icache() to all PMAP implementations. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys().
o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint.
o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked).
The key property of this change is that the I-cache is made coherent *after* writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent *before* any writes happen. The difference is key when the I-cache prefetches.
|
#
197317 |
|
18-Sep-2009 |
alc |
When superpages are enabled, add the 2 or 4MB page size to the array of supported page sizes.
Reviewed by: jhb MFC after: 3 weeks
|
#
197070 |
|
10-Sep-2009 |
jkim |
Consolidate CPUID to CPU family/model macros for amd64 and i386 to reduce unnecessary #ifdef's for shared code between them.
|
#
196771 |
|
02-Sep-2009 |
jkim |
Fix confusing comments about default PAT entries.
|
#
196769 |
|
02-Sep-2009 |
jkim |
- Work around an ACPI mode transition problem on recent NVIDIA 9400M chipset based Intel Macs. Since r189055, these platforms started freezing while ACPI is being initialized, for an unknown reason. For these platforms, we just use the old PAT layout. Note this change is not enough to boot fully on these platforms because of other problems, but it makes debugging possible. Note MacBook5,2 may be affected as well but it was not added here because of lack of hardware to test.
- Initialize the PAT MSR fully instead of reading and modifying it, for safety.
Reported by: rpaulo, hps, Eygene Ryabinkin (rea-fbsd at codelabs dot ru) Reviewed by: jhb
|
#
196707 |
|
31-Aug-2009 |
jhb |
Simplify pmap_change_attr() a bit:
- Always calculate the cache bits instead of doing it on-demand.
- Always set changed to TRUE rather than only doing it if it is false.
Discussed with: alc MFC after: 3 days
|
#
196705 |
|
31-Aug-2009 |
jhb |
Improve pmap_change_attr() so that it is able to demote a large (2/4MB) page into 4KB pages as needed. This should be fairly rare in practice on i386. This includes merging the following changes from the amd64 pmap: 180430, 180485, 180845, 181043, 181077, and 196318.
- Add basic support for changing attributes on PDEs to pmap_change_attr() similar to the support in the initial version of pmap_change_attr() on amd64, including inlines for pmap_pde_attr() and pmap_pte_attr().
- Extend pmap_demote_pde() to include the ability to instantiate a new page table page where none existed before.
- Enhance pmap_change_attr(). Use pmap_demote_pde() to demote a 2/4MB page mapping to 4KB page mappings when the specified attribute change only applies to a portion of the 2/4MB page. Previously, in such cases, pmap_change_attr() gave up and returned an error.
- Correct a critical accounting error in pmap_demote_pde().
Reviewed by: alc MFC after: 3 days
|
#
196643 |
|
29-Aug-2009 |
rnoland |
Swap the start/end virtual addresses in pmap_invalidate_cache_range().
This fixes the functionality on non-SelfSnoop hardware.
Found by: rnoland Submitted by: alc Reviewed by: kib MFC after: 3 days
|
#
195940 |
|
29-Jul-2009 |
kib |
As was done in r195820 for amd64, use clflush for flushing cache lines when memory page caching attributes changed, and CPU does not support self-snoop, but implemented clflush, for i386.
Take care of possible mappings of the page by sf buffer by utilizing the mapping for clflush, otherwise map the page transiently. Amd64 used direct map.
Proposed and reviewed by: alc Approved by: re (kensmith)
|
#
195840 |
|
24-Jul-2009 |
jhb |
Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
#
195836 |
|
23-Jul-2009 |
alc |
Eliminate unnecessary cache and TLB flushes by pmap_change_attr(). (This optimization was implemented in the amd64 version roughly 1 year ago.)
Approved by: re (kensmith)
|
#
195774 |
|
19-Jul-2009 |
alc |
Change the handling of fictitious pages by pmap_page_set_memattr() on amd64 and i386. Essentially, fictitious pages provide a mechanism for creating aliases for either normal or device-backed pages. Therefore, pmap_page_set_memattr() on a fictitious page needn't update the direct map or flush the cache. Such actions are the responsibility of the "primary" instance of the page or the device driver that "owns" the physical address. For example, these actions are already performed by pmap_mapdev().
The device pager needn't restore the memory attributes on a fictitious page before releasing it. It's now pointless.
Add pmap_page_set_memattr() to the Xen pmap.
Approved by: re (kib)
|
#
195749 |
|
17-Jul-2009 |
alc |
An addendum to r195649, "Add support to the virtual memory system for configuring machine-dependent memory attributes...":
Don't set the memory attribute for a "real" page that is allocated to a device object in vm_page_alloc(). It is a pointless act, because the device pager replaces this "real" page with a "fake" page and sets the memory attribute on that "fake" page.
Eliminate pointless code from pmap_cache_bits() on amd64.
Employ the "Self Snoop" feature supported by some x86 processors to avoid cache flushes in the pmap.
Approved by: re (kib)
|
#
195649 |
|
12-Jul-2009 |
alc |
Add support to the virtual memory system for configuring machine- dependent memory attributes:
Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior.
Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages.
Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager:
kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386.
vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes.
Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping.
Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7.
In collaboration with: jhb
Approved by: re (kib)
|
#
195385 |
|
05-Jul-2009 |
alc |
PAE adds another level to the i386 page table. This level is a small 4-entry table that must be located within the first 4GB of RAM. This requirement is met by defining an UMA zone with a custom back-end allocator function. This revision makes two changes to this back-end allocator function: (1) It replaces the use of contigmalloc() with the use of kmem_alloc_contig(). This eliminates "double accounting", i.e., accounting by both the UMA zone and malloc tags. (I made the same change for the same reason to the zones supporting jumbo frames a week ago.) (2) It passes through the "wait" parameter, i.e., M_WAITOK, M_ZERO, etc. to kmem_alloc_contig() rather than ignoring it. pmap_init() calls uma_zalloc() with both M_WAITOK and M_ZERO. At the moment, this is harmless only because the default behavior of contigmalloc()/kmem_alloc_contig() is to wait and because pmap_init() doesn't really depend on the memory being zeroed.
The back-end allocator function in the Xen pmap is dead code. I am changing it nonetheless because I don't want to leave any "bad examples" in the source tree for someone to copy at a later date.
Approved by: re (kib)
|
#
194209 |
|
14-Jun-2009 |
alc |
Long, long ago in r27464 special case code for mapping device-backed memory with 4MB pages was added to pmap_object_init_pt(). This code assumes that the pages of an OBJT_DEVICE object are always physically contiguous. Unfortunately, this is not always the case. For example, jhb@ informs me that the recently introduced /dev/ksyms driver creates an OBJT_DEVICE object that violates this assumption. Thus, this revision modifies pmap_object_init_pt() to abort the mapping if the OBJT_DEVICE object's pages are not physically contiguous. This revision also changes some inconsistent if not buggy behavior. For example, the i386 version aborts if the first 4MB virtual page that would be mapped is already valid. However, it incorrectly replaces any subsequent 4MB virtual page mappings that it encounters, potentially leaking a page table page. The amd64 version has a bug of my own creation. It potentially busies the wrong page and always busies an insufficient number of pages if it blocks allocating a page table page.
To my knowledge, there have been no reports of these bugs, hence, their persistence. I suspect that the existing restrictions that pmap_object_init_pt() placed on the OBJT_DEVICE objects that it would choose to map, for example, that the first page must be aligned on a 2 or 4MB physical boundary and that the size of the mapping must be a multiple of the large page size, were enough to avoid triggering the bug for drivers like ksyms. However, one side effect of testing the OBJT_DEVICE object's pages for physical contiguity is that a dubious difference between pmap_object_init_pt() and the standard path for mapping device pages, i.e., vm_fault(), has been eliminated. Previously, pmap_object_init_pt() would only instantiate the first PG_FICTITIOUS page being mapped because it never examined the rest. Now, however, pmap_object_init_pt() uses the new function vm_object_populate() to instantiate them all (in order to support testing their physical contiguity). These pages need to be instantiated for the mechanism that I have prototyped for automatically maintaining the consistency of the PAT settings across multiple mappings, particularly, amd64's direct mapping, to work. (Translation: This change is also being made to support jhb@'s work on the Nvidia feature requests.)
Discussed with: jhb@
|
#
193734 |
|
08-Jun-2009 |
ed |
Revert my change; reintroduce __gnu89_inline.
It turns out our compiler in stable/7 can't build this code anymore. Even though my opinion is that those people should just run `make kernel-toolchain' before building a kernel, I am willing to wait and commit this after we've branched stable/8.
Requested by: rwatson
|
#
193732 |
|
08-Jun-2009 |
ed |
Remove __gnu89_inline.
Now that we use C99 almost everywhere, just use C99-style in the pmap code. Since the pmap code is the only consumer of __gnu89_inline, remove it from cdefs.h as well. Because the flag was only introduced 17 months ago, I don't expect any problems.
Reviewed by: alc
|
#
192114 |
|
14-May-2009 |
attilio |
FreeBSD currently supports 32 CPUs on all architectures. With the arrival of 128+ cores it is necessary to handle more than that. One of the first things to change is the support for cpumask_t, which needs to handle more than 32-bit masking (which happens now). Some places, however, still assume that cpumask_t is a 32-bit mask. Fix that situation by always using cpumask_t correctly where needed.
While here, remove the part under STOP_NMI for the Xen support as it is broken in any case.
Additionally, make ipi_nmi_pending static.
Reviewed by: jhb, kmacy Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
192035 |
|
13-May-2009 |
alc |
Correct a rare use-after-free error in pmap_copy(). This error was introduced in amd64 revision 1.540 and i386 revision 1.547. However, it had no harmful effects until after a recent change, r189698, on amd64. (In other words, the error is harmless in RELENG_7.)
The error is triggered by the failure to allocate a pv entry for the one and only mapping in a page table page. I am addressing the error by changing pmap_copy() to abort if either pv entry allocation or page table page allocation fails. This is appropriate because the creation of mappings by pmap_copy() is optional. They are a (possible) optimization, and not a requirement.
Correct a nearby whitespace error in the i386 pmap_copy().
Crash reported by: jeff@ MFC after: 6 weeks
|
#
189795 |
|
14-Mar-2009 |
alc |
MFamd64 r189785 Update the pmap's resident page count when a page table page is freed in pmap_remove_pde() and pmap_remove_pages().
MFC after: 6 weeks
|
#
189055 |
|
25-Feb-2009 |
jkim |
Enable support for PAT_WRITE_PROTECTED and PAT_UNCACHED cache modes unconditionally on amd64. On i386, we assume PAT is usable if the CPU vendor is not Intel or CPU model is newer than Pentium IV.
Reviewed by: alc, jhb
|
#
188932 |
|
23-Feb-2009 |
alc |
Optimize free_pv_entry(); specifically, avoid repeated TAILQ_REMOVE()s.
MFC after: 1 week
|
#
188608 |
|
14-Feb-2009 |
alc |
Remove unnecessary page queues locking around vm_page_busy() and vm_page_wakeup(). (This change is applicable to RELENG_7 but not RELENG_6.)
MFC after: 1 week
|
#
183207 |
|
20-Sep-2008 |
alc |
MFamd64 SVN rev 179749 CVS rev 1.620 Reverse the direction of pmap_promote_pde()'s traversal over the specified page table page. The direction of the traversal can matter if pmap_promote_pde() has to remove write access (PG_RW) from a PTE that hasn't been modified (PG_M). In general, if there are two or more such PTEs to choose among, it is better to write protect the one nearer the high end of the page table page rather than the low end. This is because most programs access memory in an ascending direction. The net result of this change is a sometimes significant reduction in the number of failed promotion attempts and the number of pages that are write protected by pmap_promote_pde().
MFamd64 SVN rev 179777 CVS rev 1.621 Tweak the promotion test in pmap_promote_pde(). Specifically, test PG_A before PG_M. This sometimes prevents unnecessary removal of write access from a PTE. Overall, the net result is fewer demotions and promotion failures.
|
#
183169 |
|
19-Sep-2008 |
alc |
MFamd64 SVN rev 179471 CVS rev 1.619 Correct an error in pmap_promote_pde() that may result in an errant promotion within the kernel's address space.
|
#
181284 |
|
04-Aug-2008 |
alc |
Make pmap_kenter_attr() static.
|
#
180873 |
|
28-Jul-2008 |
alc |
Correct an off-by-one error in the previous change to pmap_change_attr(). Change the nearby comment to mention the recursive map.
|
#
180871 |
|
28-Jul-2008 |
alc |
Don't allow pmap_change_attr() to be applied to the recursive mapping.
|
#
180846 |
|
27-Jul-2008 |
alc |
Style fixes to several function definitions.
|
#
180601 |
|
18-Jul-2008 |
alc |
Correct an error in pmap_change_attr()'s initial loop that verifies that the given range of addresses are mapped. Previously, the loop was testing the same address every time.
Submitted by: Magesh Dhasayyan
|
#
180600 |
|
18-Jul-2008 |
alc |
Simplify pmap_extract()'s control flow, making it more like the related functions pmap_extract_and_hold() and pmap_kextract().
|
#
180352 |
|
07-Jul-2008 |
alc |
In FreeBSD 7.0 and beyond, pmap_growkernel() should pass VM_ALLOC_INTERRUPT to vm_page_alloc() instead of VM_ALLOC_SYSTEM. VM_ALLOC_SYSTEM was the logical choice before FreeBSD 7.0 because VM_ALLOC_INTERRUPT could not reclaim a cached page. Simply put, there was no ordering between VM_ALLOC_INTERRUPT and VM_ALLOC_SYSTEM as to which "dug deeper" into the cache and free queues. Now, there is; VM_ALLOC_INTERRUPT dominates VM_ALLOC_SYSTEM.
While I'm here, teach pmap_growkernel() to request a prezeroed page.
MFC after: 1 week
|
#
179081 |
|
18-May-2008 |
alc |
Retire pmap_addr_hint(). It is no longer used.
|
#
178947 |
|
11-May-2008 |
alc |
Correct an error in pmap_align_superpage(). Specifically, correctly handle the case where the mapping is greater than a superpage in size but the alignment of the physical pages spans a superpage boundary.
|
#
178875 |
|
09-May-2008 |
alc |
Introduce pmap_align_superpage(). It increases the starting virtual address of the given mapping if a different alignment might result in more superpage mappings.
|
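The alignment idea above can be sketched as follows. This is a simplified model based on the entry's description, not the committed function: the helper name and argument list are ours, the constants are the non-PAE i386 values, and an extra small-size guard is added for clarity.

```c
#include <stdint.h>

/*
 * Bump the starting VA so that it shares the physical address's offset
 * within a 2/4MB superpage; later 4KB mappings can then line up for
 * promotion.  Only worthwhile when the mapping is large enough to
 * still contain a whole superpage after the shift.
 */
#define NBPDR   (1u << 22)          /* bytes mapped by one PDE (4MB) */
#define PDRMASK (NBPDR - 1)

static uint32_t
align_superpage(uint32_t va, uint32_t pa_offset, uint32_t size)
{
    uint32_t sp_offset = pa_offset & PDRMASK;

    /* Too small to ever contain a whole superpage: leave va alone. */
    if (size < NBPDR ||
        size - ((NBPDR - sp_offset) & PDRMASK) < NBPDR)
        return (va);
    if ((va & PDRMASK) <= sp_offset)
        return ((va & ~PDRMASK) + sp_offset);
    return (((va + PDRMASK) & ~PDRMASK) + sp_offset);
}
```

For example, a 32MB mapping whose physical pages start 0x2000 into a superpage gets its VA adjusted to the same 0x2000 offset within the next superpage boundary.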
#
178493 |
|
25-Apr-2008 |
alc |
Always use PG_PS_FRAME to extract the physical address of a 2/4MB page from a PDE.
|
#
178070 |
|
10-Apr-2008 |
alc |
Correct pmap_copy()'s method for extracting the physical address of a 2/4MB page from a PDE. Specifically, change it to use PG_PS_FRAME, not PG_FRAME, to extract the physical address of a 2/4MB page from a PDE.
Change the last argument passed to pmap_pv_insert_pde() from a vm_page_t representing the first 4KB page of a 2/4MB page to the vm_paddr_t of the 2/4MB page. This avoids an otherwise unnecessary conversion from a vm_paddr_t to a vm_page_t in pmap_copy().
|
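Why PG_FRAME is the wrong mask here can be shown with a small sketch. The mask constants match the non-PAE i386 layout; the helper name and sample PDE value are illustrative.

```c
#include <stdint.h>

/*
 * On non-PAE i386, a PDE for a 4MB page keeps its physical address in
 * bits 31..22; bits 21..12 hold the PAT bit and reserved bits, not
 * address bits.  Masking a superpage PDE with PG_FRAME (the 4KB mask)
 * wrongly keeps those bits in the "physical address".
 */
#define PG_FRAME    0xfffff000u     /* 4KB page frame mask */
#define PG_PS_FRAME 0xffc00000u     /* 2/4MB page frame mask */
#define PG_PS       0x080u          /* PDE maps a superpage */
#define PG_PDE_PAT  0x1000u         /* PAT bit in a superpage PDE */

static uint32_t
pde_to_pa(uint32_t pde)
{
    /* A superpage PDE must use the wider-aligned mask. */
    return ((pde & PG_PS) ? (pde & PG_PS_FRAME) : (pde & PG_FRAME));
}
```

With a 4MB PDE of 0x00c010e3 (frame 0x00c00000 plus PG_PS, the PAT bit, and the usual flag bits), masking with PG_FRAME yields 0x00c01000, off by the PAT bit, while PG_PS_FRAME correctly yields 0x00c00000.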
#
177967 |
|
07-Apr-2008 |
alc |
Update pmap_page_wired_mappings() so that it counts 2/4MB page mappings.
|
#
177921 |
|
04-Apr-2008 |
alc |
Reintroduce UMA_SLAB_KMAP; however, change its spelling to UMA_SLAB_KERNEL for consistency with its sibling UMA_SLAB_KMEM. (UMA_SLAB_KMAP met its original demise in revision 1.30 of vm/uma_core.c.) UMA_SLAB_KERNEL is now required by the jumbo frame allocators. Without it, UMA cannot correctly return pages from the jumbo frame zones to the VM system because it resets the pages' object field to NULL instead of the kernel object. In more detail, the jumbo frame zones are created with the option UMA_ZONE_REFCNT. This causes UMA to overwrite the pages' object field with the address of the slab. However, when UMA wants to release these pages, it doesn't know how to restore the object field, so it sets it to NULL. This change teaches UMA how to reset the object field to the kernel object.
Crashes reported by: kris Fix tested by: kris Fix discussed with: jeff MFC after: 6 weeks
|
#
177702 |
|
29-Mar-2008 |
alc |
Eliminate an #if 0/#endif that was unintentionally introduced by the previous revision.
|
#
177684 |
|
28-Mar-2008 |
brooks |
Use ; instead of : to end a line.
Submitted by: Niclas Zeising <niclas dot zeising at gmail dot com>
|
#
177680 |
|
28-Mar-2008 |
ps |
Add support to mincore for detecting whether a page is part of a "super" page or not.
Reviewed by: alc, ups
|
#
177659 |
|
27-Mar-2008 |
alc |
MFamd64 with few changes:
1. Add support for automatic promotion of 4KB page mappings to 2MB page mappings. Automatic promotion can be enabled by setting the tunable "vm.pmap.pg_ps_enabled" to a non-zero value. By default, automatic promotion is disabled. Tested by: kris
2. To date, we have assumed that the TLB will only set the PG_M bit in a PTE if that PTE has the PG_RW bit set. However, this assumption does not hold on recent processors from Intel. For example, consider a PTE that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE is cached in the TLB and later the PG_RW bit is cleared in the PTE, but the corresponding TLB entry is not (yet) invalidated. Historically, upon a write access using this (stale) TLB entry, the TLB would observe that the PG_RW bit had been cleared and initiate a page fault, aborting the setting of the PG_M bit in the PTE. Now, however, P4- and Core2-family processors will set the PG_M bit before observing that the PG_RW bit is clear and initiating a page fault. In other words, the write does not occur but the PG_M bit is still set.
The real impact of this difference is not that great. Specifically, we should no longer assert that any PTE with the PG_M bit set must also have the PG_RW bit set, and we should ignore the state of the PG_M bit unless the PG_RW bit is set.
|
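The rule stated in the last paragraph, that PG_M is meaningful only when PG_RW is also set, can be sketched as a predicate. The bit values are the standard x86 PTE bits; the helper name is ours.

```c
#include <stdint.h>

/*
 * Per the entry above, recent Intel CPUs may set PG_M through a stale
 * TLB entry even though the write itself faults.  A PTE is therefore
 * considered dirty only when both PG_M and PG_RW are set.
 */
#define PG_RW 0x002u    /* writable */
#define PG_M  0x040u    /* modified */

static int
pte_is_dirty(uint32_t pte)
{
    return ((pte & (PG_RW | PG_M)) == (PG_RW | PG_M));
}
```

A PTE with PG_M set but PG_RW clear (e.g. 0x061) is treated as clean, since the set PG_M bit may be an artifact of the behavior described above.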
#
177525 |
|
23-Mar-2008 |
kib |
Prevent an overflow in the calculation of the next page directory address. The overflow causes a wraparound, with consequent corruption of (almost) the whole address space mapping.
As Alan noted, pmap_copy() does not require the wrap-around checks because it cannot be applied to the kernel's pmap. The checks there are included for consistency.
Reported and tested by: kris (i386/pmap.c:pmap_remove() part) Reviewed by: alc MFC after: 1 week
|
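The overflow and its guard can be sketched with 32-bit arithmetic. The constants are the non-PAE i386 values; the helper name is ours.

```c
#include <stdint.h>

/*
 * Advancing to the next 2/4MB page-directory boundary can wrap past
 * the top of a 32-bit address space; without the check, a removal
 * loop would restart at VA 0 and walk (and corrupt) nearly the whole
 * address space mapping.
 */
#define NBPDR   (1u << 22)          /* bytes mapped by one PDE (4MB) */
#define PDRMASK (NBPDR - 1)

static uint32_t
next_pde_boundary(uint32_t sva, uint32_t eva)
{
    uint32_t va_next;

    va_next = (sva + NBPDR) & ~PDRMASK;
    if (va_next < sva)          /* wrapped past 2^32: clamp to the end */
        va_next = eva;
    return (va_next);
}
```

Starting at 0xffdff000, the unchecked computation wraps to 0x00000000; the guard clamps it to eva so the loop terminates.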
#
175404 |
|
17-Jan-2008 |
alc |
Retire PMAP_DIAGNOSTIC. Any useful diagnostics that were conditionally compiled under PMAP_DIAGNOSTIC are now KASSERT()s. (Note: The kernel option DIAGNOSTIC still disables inlining of certain pmap functions.)
Eliminate dead code from pmap_enter(). This code implemented an assertion. On i386, an equivalent check is already implemented. However, on amd64, a small change is required to implement an equivalent check.
Eliminate \n from a nearby panic string.
Use KASSERT() to reimplement pmap_copy()'s two assertions.
|
#
175328 |
|
14-Jan-2008 |
peter |
Add a CTASSERT that KERNBASE is valid. This is usually messed up by an invalid KVA_PAGES, so add a pointer to there.
|
#
175155 |
|
08-Jan-2008 |
alc |
Convert a PMAP_DIAGNOSTIC to a KASSERT.
|
#
175119 |
|
06-Jan-2008 |
alc |
Shrink the size of struct vm_page on amd64 and i386 by eliminating pv_list_count from struct md_page. Ever since Peter rewrote the pv entry allocator for amd64 and i386, pv_list_count has been correctly maintained but otherwise unused.
|
#
175067 |
|
03-Jan-2008 |
alc |
Add an access type parameter to pmap_enter(). It will be used to implement superpage promotion.
Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is a Boolean.
|
#
175056 |
|
02-Jan-2008 |
alc |
Provide a legitimate pindex to vm_page_alloc() in pmap_growkernel() instead of writing apologetic comments. As it turns out, I need every kernel page table page to have a legitimate pindex to support superpage promotion on kernel memory.
Correct a nearby style error: Pointers should be compared to NULL.
|
#
174496 |
|
09-Dec-2007 |
alc |
Eliminate compilation warnings due to the use of non-static inlines through the introduction and use of the __gnu89_inline attribute.
Submitted by: bde (i386) MFC after: 3 days
|
#
174250 |
|
04-Dec-2007 |
alc |
Correct an error under COUNT_IPIS within pmap_lazyfix_action(): Increment the counter that the pointer refers to, not the pointer.
MFC after: 3 days
|
#
174104 |
|
30-Nov-2007 |
alc |
Improve get_pv_entry()'s handling of low-memory conditions. After page allocation fails and pv entries are reclaimed, there may be an unused pv entry in a pv chunk that survived the reclamation. However, previously, after reclamation, get_pv_entry() did not look for an unused pv entry in a surviving pv chunk; it simply retried the page allocation. Now, it does look for an unused pv entry before retrying the page allocation.
Note: This only applies to RELENG_7. Earlier branches use a different pv entry allocator.
MFC after: 6 weeks
|
#
173708 |
|
17-Nov-2007 |
alc |
Prevent the leakage of wired pages in the following circumstances: First, a file is mmap(2)ed and then mlock(2)ed. Later, it is truncated. Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the pages beyond the EOF are unmapped and freed. However, when the file is mlock(2)ed, the pages beyond the EOF are unmapped but not freed because they have a non-zero wire count. This can be a mistake. Specifically, it is a mistake if the sole reason why the pages are wired is because of wired, managed mappings. Previously, unmapping the pages destroys these wired, managed mappings, but does not reduce the pages' wire count. Consequently, when the file is unmapped, the pages are not unwired because the wired mapping has been destroyed. Moreover, when the vm object is finally destroyed, the pages are leaked because they are still wired. The fix is to reduce the pages' wired count by the number of wired, managed mappings destroyed. To do this, I introduce a new pmap function pmap_page_wired_mappings() that returns the number of managed mappings to the given physical page that are wired, and I use this function in vm_object_page_remove().
Reviewed by: tegge MFC after: 6 weeks
|
#
173592 |
|
13-Nov-2007 |
peter |
Drastically simplify the i386 pcpu backend by merging parts of the amd64 mechanism over. Instead of page table hackery that isn't actually needed, just use 'struct pcpu __pcpu[MAXCPU]' for backing like all the other platforms do. Get rid of 'struct privatespace' and a whole mess of #ifdef SMP garbage that set it up. As a bonus, this returns the 4MB of KVA that we stole to implement it the old way. This also allows you to read the pcpu data for each cpu when reading a minidump.
Background information: Originally, pcpu stuff was implemented as having per-cpu page tables and magic to make different data structures appear at the same actual address. In order to share page tables, we switched to using the GDT and %fs/%gs to access it. But we still did the evil magic to set it up for the old way. The "idle stacks" are not used for the idle process anymore and are just used for a few functions during bootup, then ignored. (Exercise for the reader: free these afterwards.)
|
#
173370 |
|
05-Nov-2007 |
alc |
Add comments explaining why all stores updating a non-kernel page table must be globally performed before calling any of the TLB invalidation functions.
With one exception, on amd64, this requirement was already met. Fix this one case. Also, as a clarification, change an existing atomic op into a release. (Suggested by: jhb)
Reported and reviewed by: ups MFC after: 3 days
|
#
173361 |
|
05-Nov-2007 |
kib |
Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when kmem_alloc_nofault() failed to allocate address space. Both functions now return an error instead of panicking or dereferencing NULL.
As a consequence, vmspace_exec() and vmspace_unshare() return an errno int. A struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper).
In collaboration with: Peter Holm Reviewed by: jhb
|
#
173296 |
|
03-Nov-2007 |
alc |
Eliminate spurious "Approaching the limit on PV entries, ..." warnings. Specifically, whenever vm_page_alloc(9) returned NULL to get_pv_entry(), we issued a warning regardless of the number of pv entries in use. (Note: The older pv entry allocator in RELENG_6 does not have this problem.)
Reported by: Jeremy Chadwick
Eliminate the direct call to pagedaemon_wakeup() by get_pv_entry(). This was a holdover from earlier times when the page daemon was responsible for the reclamation of pv entries.
MFC after: 5 days
|
#
171907 |
|
21-Aug-2007 |
alc |
In general, when we map a page into the kernel's address space, we no longer create a pv entry for that mapping. (The two exceptions are mappings into the kernel's exec and pipe submaps.) Consequently, there is no reason for get_pv_entry() to dig deep into the free page queues, i.e., use VM_ALLOC_SYSTEM, by default. This revision changes get_pv_entry() to use VM_ALLOC_NORMAL by default, i.e., before calling pmap_collect() to reclaim pv entries.
Approved by: re (kensmith)
|
#
171128 |
|
01-Jul-2007 |
alc |
Pages that do belong to an object and page queue can now be freed without holding the page queues lock. Thus, the page table pages released by pmap_remove() and pmap_remove_pages() can be freed after the page queues lock is released.
Approved by: re (kensmith)
|
#
170170 |
|
31-May-2007 |
attilio |
Revert the VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
#
169805 |
|
20-May-2007 |
jeff |
- rename VMCNT_DEC to VMCNT_SUB to reflect the count argument.
Suggested by: julian@ Contributed by: attilio@
|
#
169667 |
|
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
169040 |
|
25-Apr-2007 |
ups |
Forced commit to add more comment to:
1.583 src/sys/amd64/amd64/pmap.c 1.588 src/sys/i386/i386/pmap.c
Invalidate all TLBs and page structure caches that may reference a page for page walk acceleration before releasing it to the free list.
This is required for CPUs implementing page structure caches as described in the Intel application note: "TLBs, Paging-Structure Caches, and Their Invalidation" (Document Number: 317080-001)
Reviewed by: Siddha Suresh (Intel), alc@, peter@
|
#
168930 |
|
21-Apr-2007 |
ups |
Modify TLB invalidation handling.
Reviewed by: alc@, peter@ MFC after: 1 week
|
#
168691 |
|
13-Apr-2007 |
alc |
Eliminate the misuse of PG_FRAME to truncate a virtual address to a virtual page boundary.
Reviewed by: ru@
|
#
168439 |
|
06-Apr-2007 |
ru |
Add the PG_NX support for i386/PAE.
Reviewed by: alc
|
#
167869 |
|
24-Mar-2007 |
alc |
In order to satisfy ACPI's need for an identity mapping, modify the temporary mapping created by locore so that the lowest two to four megabytes can become a permanent identity mapping. This implementation avoids any use of a large page mapping.
|
#
167668 |
|
17-Mar-2007 |
alc |
Eliminate an unused parameter.
|
#
167578 |
|
14-Mar-2007 |
njl |
Create an identity mapping (V=P) super page for the low memory region on boot. Then, just switch to the kernel pmap when suspending instead of allocating/freeing our own mapping every time. This should solve a panic of pmap_remove() being called with interrupts disabled. Thanks to Alan Cox for developing this patch.
Note: this means that ACPI requires super page (PG_PS) support in the CPU. This has been present since the Pentium and first documented in the Pentium Pro. However, it may need to be revisited later.
Submitted by: alc MFC after: 1 month
|
#
167250 |
|
05-Mar-2007 |
alc |
Acquiring smp_ipi_mtx on every call to pmap_invalidate_*() is wasteful. For example, during a buildworld more than half of the calls do not generate an IPI because the only TLB entry invalidated is on the calling processor. This revision pushes down the acquisition and release of smp_ipi_mtx into smp_tlb_shootdown() and smp_targeted_tlb_shootdown() and instead uses sched_pin() and sched_unpin() in pmap_invalidate_*() so that thread migration doesn't lead to a missed TLB invalidation.
Reviewed by: jhb MFC after: 3 weeks
|
#
166810 |
|
18-Feb-2007 |
alc |
Eliminate some acquisitions and releases of the page queues lock that are no longer necessary.
|
#
166074 |
|
17-Jan-2007 |
delphij |
Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.
|
#
164413 |
|
19-Nov-2006 |
alc |
The global variable avail_end is redundant and only used once. Eliminate it. Make avail_start static to the pmap on amd64. (It no longer exists on other architectures.)
|
#
164332 |
|
16-Nov-2006 |
maxim |
o Make pv_maxchunks no less than maxproc. This helps to survive a forkbomb explosion.
Reviewed by: alc Security: local DoS X-MFC after: RELENG_6 is not affected due to different pv_entry allocation code.
|
#
164229 |
|
12-Nov-2006 |
alc |
Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)
|
#
163603 |
|
22-Oct-2006 |
alc |
Eliminate unnecessary PG_BUSY tests.
|
#
161285 |
|
14-Aug-2006 |
jhb |
Don't try to preserve PAT bits in pmap_enter(). We currently only use PAT bits on pages that aren't mapped via pmap_enter() (KVA). We will eventually support PAT bits on user pages, but those will require some sort of MI caching mode stored in the vm_page.
Reviewed by: alc
|
#
161223 |
|
11-Aug-2006 |
jhb |
First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include:
- A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping.
- pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example.
- A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before.
- A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA.
Reviewed by: alc
|
#
161016 |
|
06-Aug-2006 |
alc |
Eliminate the acquisition and release of the page queues lock around a call to vm_page_sleep_if_busy().
|
#
160889 |
|
01-Aug-2006 |
alc |
Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change.
The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access.
Reviewed by: marcel@ (ia64)
|
#
160525 |
|
20-Jul-2006 |
alc |
Add pmap_clear_write() to the interface between the virtual memory system's machine-dependent and machine-independent layers. Once pmap_clear_write() is implemented on all of our supported architectures, I intend to replace all calls to pmap_page_protect() by calls to pmap_clear_write(). Why? Both the use and implementation of pmap_page_protect() in our virtual memory system has subtle errors, specifically, the management of execute permission is broken on some architectures. The "prot" argument to pmap_page_protect() should behave differently from the "prot" argument to other pmap functions. Instead of meaning, "give the specified access rights to all of the physical page's mappings," it means "don't take away the specified access rights from all of the physical page's mappings, but do take away the ones that aren't specified." However, owing to our i386 legacy, i.e., no support for no-execute rights, all but one invocation of pmap_page_protect() specifies VM_PROT_READ only, when the intent is, in fact, to remove only write permission. Consequently, a faithful implementation of pmap_page_protect(), e.g., ia64, would remove execute permission as well as write permission. On the other hand, some architectures that support execute permission have basically ignored whether or not VM_PROT_EXECUTE is passed to pmap_page_protect(), e.g., amd64 and sparc64. This change represents the first step in replacing pmap_page_protect() by the less subtle pmap_clear_write() that is already implemented on amd64, i386, and sparc64.
Discussed with: grehan@ and marcel@
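The subtle semantics described above can be sketched in a few lines. This is an illustrative toy, not kernel code; the flag values are the conventional vm_prot_t bits, and page_protect() is a hypothetical stand-in for what a faithful pmap_page_protect() implementation does to a page's mappings:

```c
#include <assert.h>

/* Conventional vm_prot_t bits, shown here for illustration. */
#define VM_PROT_READ    0x1
#define VM_PROT_WRITE   0x2
#define VM_PROT_EXECUTE 0x4

/*
 * pmap_page_protect(m, prot) semantics in a nutshell: keep only the
 * rights named in prot and strip everything else from every mapping
 * of the page.
 */
static int page_protect(int cur_rights, int prot) {
    return cur_rights & prot;
}
```

Calling it with VM_PROT_READ therefore removes execute permission along with write permission, which is the subtlety the commit message describes: callers that only meant "remove write" get more than they asked for, whereas pmap_remove_write()/pmap_clear_write() names the one right it takes away.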
|
#
160461 |
|
18-Jul-2006 |
alc |
MFamd64 pmap_clear_ptes() is already convoluted. This will worsen with the implementation of superpages. Eliminate it and add pmap_clear_write().
There are no functional changes. Checked by: md5
|
#
160419 |
|
17-Jul-2006 |
alc |
Now that free_pv_entry() accesses the pmap, call free_pv_entry() in pmap_remove_all() before rather than after the pmap is unlocked. At present, the page queues lock provides sufficient synchronization. In the future, the page queues lock may not always be held when free_pv_entry() is called.
|
#
160412 |
|
16-Jul-2006 |
alc |
MFamd64 Make three simplifications to pmap_ts_referenced(): Eliminate an initialized but otherwise unused variable. Eliminate an unnecessary test. Exit the loop in a shorter way.
|
#
160408 |
|
16-Jul-2006 |
alc |
Eliminate the remaining uses of "register".
Convert the remaining K&R-style function declarations to ANSI-style.
Eliminate excessive white space from pmap_ts_referenced().
|
#
160379 |
|
15-Jul-2006 |
alc |
Make pc_freemask an array of uint32_t, rather than uint64_t. (I believe that the use of the latter is simply an oversight in porting the new pv entry code from amd64.)
|
#
160073 |
|
02-Jul-2006 |
alc |
Correct an error in the new pmap_collect(), thus only affecting HEAD. Specifically, the pv entry was always being freed to the caller's pmap instead of the pmap to which the pv entry belongs.
|
#
159970 |
|
27-Jun-2006 |
alc |
Correct a very old and very obscure bug: vmspace_fork() calls pmap_copy() if the mapping is VM_INHERIT_SHARE. Suppose the mapping is also wired. vmspace_fork() clears the wiring attributes in the vm map entry but pmap_copy() copies the PG_W attribute in the PTE. I don't think this is catastrophic. It blocks pmap_remove_pages() from destroying the mapping and corrupts the pmap's wiring count.
This revision fixes the problem by changing pmap_copy() to clear the PG_W attribute.
Reviewed by: tegge@
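The essence of the fix is a one-line mask when the PTE is duplicated for the child. A minimal sketch, using the i386 PG_W value (0x200, a software-defined bit); copy_pte() is an illustrative helper, not the actual pmap_copy() code:

```c
#include <assert.h>
#include <stdint.h>

#define PG_W 0x200u  /* software "wired" bit in an i386 PTE */

/*
 * Sketch of the fix: when pmap_copy() duplicates a PTE into the
 * child pmap, strip PG_W so the parent's wiring attribute is not
 * silently inherited, keeping the pmap's wiring count honest.
 */
static uint32_t copy_pte(uint32_t src_pte) {
    return src_pte & ~PG_W;
}
```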
|
#
159933 |
|
25-Jun-2006 |
alc |
Eliminate a comment that became stale after revision 1.547.
|
#
159803 |
|
20-Jun-2006 |
alc |
Change get_pv_entry() such that the call to vm_page_alloc() specifies VM_ALLOC_NORMAL instead of VM_ALLOC_SYSTEM when try is TRUE. In other words, when get_pv_entry() is permitted to fail, it no longer tries as hard to allocate a page.
Change pmap_enter_quick_locked() to fail rather than wait if it is unable to allocate a page table page. This prevents a race between pmap_enter_object() and the page daemon. Specifically, an inactive page that is a successor to the page that was given to pmap_enter_quick_locked() might become a cache page while pmap_enter_quick_locked() waits and later pmap_enter_object() maps the cache page violating the invariant that cache pages are never mapped. Similarly, change pmap_enter_quick_locked() to call pmap_try_insert_pv_entry() rather than pmap_insert_entry(). Generally speaking, pmap_enter_quick_locked() is used to create speculative mappings. So, it should not try hard to allocate memory if free memory is scarce.
Add an assertion that the object containing m_start is locked in pmap_enter_object(). Remove a similar assertion from pmap_enter_quick_locked() because that function no longer accesses the containing object.
Remove a stale comment.
Reviewed by: ups@
|
#
159627 |
|
14-Jun-2006 |
ups |
Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386.
Reviewed by: alc,jhb,peter,ps
|
#
159546 |
|
12-Jun-2006 |
alc |
Don't invalidate the TLB in pmap_qenter() unless the old mapping was valid. Most often, it isn't.
Reviewed by: tegge@
|
#
159303 |
|
05-Jun-2006 |
alc |
Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects.
Reviewed by: tegge@
|
#
159244 |
|
05-Jun-2006 |
alc |
MFamd64 Eliminate unnecessary, recursive acquisitions and releases of the page queues lock by free_pv_entry() and pmap_remove_pages().
Reduce the scope of the page queues lock in pmap_remove_pages().
|
#
158238 |
|
01-May-2006 |
jhb |
Add various constants for the PAT MSR and the PAT PTE and PDE flags. Initialize the PAT MSR during boot to map PAT type 2 to Write-Combining (WC) instead of Uncached (UC-).
MFC after: 1 month
|
#
158236 |
|
01-May-2006 |
jhb |
Add a new 'pmap_invalidate_cache()' to flush the CPU caches via the wbinvd() instruction. This includes a new IPI so that all CPU caches on all CPUs are flushed for the SMP case.
MFC after: 1 month
|
#
158235 |
|
01-May-2006 |
peter |
Using an idea from Stephan Uphoff, use the empty pte's that correspond to the unused kva in the pv memory block to thread a freelist through. This allows us to free pages that used to be used for pv entry chunks since we can now track holes in the kva memory block.
Idea from: ups
|
#
158226 |
|
01-May-2006 |
peter |
Fix missing changes required for the amd64->i386 conversion. Add the missing VM_ALLOC_WIRED flags to vm_page_alloc() calls I added.
Submitted by: alc
|
#
158120 |
|
28-Apr-2006 |
peter |
Interim fix for pmap problems I introduced with my last commit. Remove the code to dynamically change the pv_entry limits. Go back to a single fixed kva reservation for pv entries, like was done before when using the uma zone. Go back to never freeing pages back to the free pool after they are no longer used, just like before.
This stops the lock order reversal due to acquiring the kernel map lock while pmap was locked.
This fixes the recursive panic if invariants are enabled.
The problem was that allocating/freeing kva causes vm_map_entry nodes to be allocated/freed. That can recurse back into pmap as new pages are hooked up to kvm, and hence all the problems. Allocating/freeing kva indirectly allocates/frees memory.
So, by going back to a single fixed size kva block and an index, we avoid the recursion panics and the LOR.
The problem is that now with a linear block of kva, we have no mechanism to track holes once pages are freed. UMA has the same problem when using a custom object for a zone and a fixed reservation of kva. Simple solutions like having a bitmap would work, but would be very inefficient when there are hundreds of thousands of bits in the map. A first-free pointer is similarly flawed because pages can be freed at random and the first-free pointer would be rewinding huge amounts. If we could allocate memory for tree structures or an external freelist, that would work. Except we cannot allocate/free memory here because we cannot allocate/free address space to use it in. Anyway, my change here reverts back to the UMA behavior of not freeing pages for now, thereby avoiding holes in the map.
ups@ had a truly evil idea that I'll investigate. It should allow freeing unused pages again by giving us a no-cost way to track the holes in the kva block. But in the meantime, this should get people booting with witness and/or invariants again.
Footnote: amd64 doesn't have this problem because of the direct map access method. I'd done all my witness/invariants testing there. I'd never considered that the harmless-looking kmem_alloc/kmem_free calls would cause such a problem and it didn't show up on the boot test.
|
#
158088 |
|
27-Apr-2006 |
alc |
In general, bits in the page directory entry (PDE) and the page table entry (PTE) have the same meaning. The exception to this rule is the eighth bit (0x080). It is the PS bit in a PDE and the PAT bit in a PTE. This change avoids the possibility that pmap_enter() confuses a PAT bit with a PS bit, avoiding a panic().
Eliminate a diagnostic printf() from the i386 pmap_enter() that serves no current purpose, i.e., I've seen no bug reports in the last two years that are helped by this printf().
Reviewed by: jhb
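The bit collision is easy to state in code. A toy sketch: bit 0x080 really is PS in a PDE and PAT in a PTE on x86, and is_superpage() is a hypothetical helper illustrating that the interpretation depends on the paging level, not on the bit itself:

```c
#include <assert.h>
#include <stdint.h>

/* Bit 7 (0x080) of a paging entry: PS in a PDE, PAT in a PTE. */
#define PG_PS_OR_PAT 0x080u

/*
 * Whether bit 7 marks a 2/4MB superpage depends on the paging level.
 * Testing it on a PTE would mistake the PAT bit for PS -- exactly the
 * confusion the commit guards pmap_enter() against.
 */
static int is_superpage(uint32_t entry, int entry_is_pde) {
    return entry_is_pde && (entry & PG_PS_OR_PAT) != 0;
}
```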
|
#
158067 |
|
27-Apr-2006 |
delphij |
Fix build on i386
|
#
158060 |
|
26-Apr-2006 |
peter |
MFamd64: shrink pv entries from 24 bytes to about 12 bytes. (336 pv entries per page = effectively 12.19 bytes per pv entry after overheads). Instead of using a shared UMA zone for 24 byte pv entries (two 8-byte tailq nodes, a 4 byte pointer, and a 4 byte address), we allocate a page at a time per process. This provides 336 pv entries per process (actually, per pmap address space) and eliminates one of the 8-byte tailq entries since we now can track per-process pv entries implicitly. The pointer to the pmap can be eliminated by doing address arithmetic to find the metadata on the page headers to find a single pointer shared by all 336 entries. There is an 11-int bitmap for the freelist of those 336 entries.
This is mostly a mechanical conversion from amd64, except:
* i386 has to allocate kvm and map the pages, amd64 has them outside of kvm
* native word size is smaller, so bitmaps etc become 32 bit instead of 64
* no dump_add_page() etc stuff because they are in kvm always.
* various pmap internals tweaks because pmap uses direct map on amd64 but on i386 it has to use sched_pin and temporary mappings.
Also, sysctl vm.pmap.pv_entry_max and vm.pmap.shpgperproc are now dynamic sysctls. Like on amd64, i386 can now tune the pv entry limits without a recompile or reboot.
This is important because of the following scenario. If you have a 1GB file (262144 pages) mmap()ed into 50 processes, that requires 13 million pv entries. At 24 bytes per pv entry, that is 314MB of ram and kvm, while at 12 bytes it is 157MB. A 157MB saving is significant.
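The "11-int bitmap for the freelist of those 336 entries" can be sketched in userland C. This is a simplified model, not the kernel's pc_freemask code; the entries-per-page count and the bit convention (1 = free) follow the description above:

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() */

#define NPVPP   336                  /* pv entries per chunk page */
#define NWORDS  ((NPVPP + 31) / 32)  /* 11 x 32-bit freemask words */

/* One bit per pv entry: 1 = free, 0 = allocated. */
static uint32_t freemask[NWORDS];

static void chunk_init(void) {
    for (int i = 0; i < NWORDS; i++)
        freemask[i] = 0xffffffffu;
    /* Mask off the bits past entry 335 in the last, partial word. */
    freemask[NWORDS - 1] &= (1u << (NPVPP % 32)) - 1;
}

/* Return the index of a free pv entry, or -1 if the chunk is full. */
static int pv_alloc(void) {
    for (int i = 0; i < NWORDS; i++) {
        int bit = ffs((int)freemask[i]);
        if (bit != 0) {
            freemask[i] &= ~(1u << (bit - 1));
            return i * 32 + (bit - 1);
        }
    }
    return -1;
}

static void pv_free(int idx) {
    freemask[idx / 32] |= 1u << (idx % 32);
}
```

Because the chunk page itself records which slots are free, the per-entry pointer and one tailq linkage disappear, which is where the 24-to-12-byte shrink comes from.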
Test-run by: scottl (Thanks!)
|
#
157680 |
|
12-Apr-2006 |
alc |
Retire pmap_track_modified(). We no longer need it because we do not create managed mappings within the clean submap. To prevent regressions, add assertions blocking the creation of managed mappings within the clean submap.
Reviewed by: tegge
|
#
157443 |
|
03-Apr-2006 |
peter |
Remove the unused sva and eva arguments from pmap_remove_pages().
|
#
157394 |
|
02-Apr-2006 |
alc |
Introduce pmap_try_insert_pv_entry(), a function that conditionally creates a pv entry if the number of entries is below the high water mark for pv entries.
Use pmap_try_insert_pv_entry() in pmap_copy() instead of pmap_insert_entry(). This avoids possible recursion on a pmap lock in get_pv_entry().
Eliminate the explicit low-memory checks in pmap_copy(). The check that the number of pv entries was below the high water mark was largely ineffective because it was located in the outer loop rather than the inner loop where pv entries were allocated. Instead of checking, we attempt the allocation and handle the failure.
Reviewed by: tegge Reported by: kris MFC after: 5 days
|
#
156963 |
|
21-Mar-2006 |
alc |
Eliminate unnecessary invalidations of the entire TLB by pmap_remove(). Specifically, on mappings with PG_G set pmap_remove() not only performs the necessary per-page invlpg invalidations but also performs an unnecessary invalidation of the entire set of non-PG_G entries.
Reviewed by: tegge
|
#
156930 |
|
21-Mar-2006 |
davidxu |
Remove stale KSE code.
Reviewed by: alc
|
#
155771 |
|
16-Feb-2006 |
tegge |
Rounding addr upwards to next 4M or 2M boundary in pmap_growkernel() could cause addr to become 0, resulting in an early return without populating the last PDE.
Reviewed by: alc
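The failure mode is plain 32-bit wraparound in the round-up expression. A minimal demonstration (NBPDR here is the non-PAE 4M page directory span; the helper is illustrative, not the kernel's code):

```c
#include <assert.h>
#include <stdint.h>

#define NBPDR (1u << 22)  /* 4M page directory span, non-PAE i386 */

/*
 * Naive round-up to the next 4M boundary. When addr lies in the last
 * 4M of the 32-bit address space, the addition wraps and the result
 * becomes 0 -- the early-return bug that left the last PDE
 * unpopulated in pmap_growkernel().
 */
static uint32_t roundup_4m(uint32_t addr) {
    return (addr + NBPDR - 1) & ~(NBPDR - 1);
}
```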
|
#
153141 |
|
05-Dec-2005 |
jhb |
- Move the code to deal with handling an IPI_STOP IPI out of ipi_nmi_handler() and into a new cpustop_handler() function. Change the Xcpustop IPI_STOP handler to call this function instead of duplicating all the same logic in assembly.
- EOI the local APIC for the lapic timer interrupt in C rather than assembly.
- Bump the lazypmap IPI counter if COUNT_IPIS is defined in C rather than assembly.
|
#
152630 |
|
20-Nov-2005 |
alc |
Eliminate pmap_init2(). It's no longer used.
|
#
152404 |
|
13-Nov-2005 |
imp |
Provide a dummy NO_XBOX option that lives in opt_xbox.h for pc98. This allows us to eliminate three ifdef PC98 instances.
|
#
152359 |
|
13-Nov-2005 |
alc |
In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock cannot possibly occur.
|
#
152238 |
|
09-Nov-2005 |
nyan |
Fix pc98 build.
|
#
152224 |
|
09-Nov-2005 |
alc |
Reimplement the reclamation of PV entries. Specifically, perform reclamation synchronously from get_pv_entry() instead of asynchronously as part of the page daemon. Additionally, limit the reclamation to inactive pages unless allocation from the PV entry zone or reclamation from the inactive queue fails. Previously, reclamation destroyed mappings to both inactive and active pages. get_pv_entry() still, however, wakes up the page daemon when reclamation occurs. The reason being that the page daemon may move some pages from the active queue to the inactive queue, making some new pages available to future reclamations.
Print the "reclaiming PV entries" message at most once per minute, but don't stop printing it after the fifth time. This way, we do not give the impression that the problem has gone away.
Reviewed by: tegge
|
#
152219 |
|
09-Nov-2005 |
imp |
Add support for XBOX to the FreeBSD port. The xbox architecture is nearly identical to wintel/ia32, with a couple of tweaks. Since it is so similar to ia32, it is optionally added to an i386 kernel. This port is preliminary, but seems to work well. Further work will improve the interaction with syscons(4), port the Linux nforce driver and support future versions of the xbox.
This supports the 64MB and 128MB boxes. You'll need the most recent CVS version of Cromwell (the Linux BIOS for the XBOX) to boot.
Rink will be maintaining this port, and is interested in feedback. He's setup a website http://xbox-bsd.nl to report the latest developments.
Any silly mistakes are my fault.
Submitted by: Rink P.W. Springer rink at stack dot nl and Ed Schouten ed at fxq dot nl
|
#
152042 |
|
04-Nov-2005 |
alc |
Begin and end the initialization of pvzone in pmap_init(). Previously, pvzone's initialization was split between pmap_init() and pmap_init2(). This split initialization was the underlying cause of some UMA panics during initialization. Specifically, if the UMA boot pages were exhausted before the pvzone was fully initialized, then UMA, through no fault of its own, would use an inappropriate back-end allocator leading to a panic. (Previously, as a workaround, we have increased the UMA boot pages.) Fortunately, there is no longer any reason that pvzone's initialization cannot be completed in pmap_init().
Eliminate a check for whether pv_entry_high_water has been initialized or not from get_pv_entry(). Since pvzone's initialization is completed in pmap_init(), this check is no longer needed.
Use cnt.v_page_count, the actual count of available physical pages, instead of vm_page_array_size to compute the maximum number of pv entries.
Introduce the vm.pmap.pv_entries tunable on alpha and ia64.
Eliminate some unnecessary white space.
Discussed with: tegge (item #1) Tested by: marcel (ia64)
|
#
151910 |
|
31-Oct-2005 |
alc |
Instead of a panic()ing in pmap_insert_entry() if get_pv_entry() fails, reclaim a pv entry by destroying a mapping to an inactive page.
Change the format strings in many of the assertions that were recently converted from PMAP_DIAGNOSTIC printf()s so that they are compatible with PAE. Avoid unnecessary differences between the amd64 and i386 format strings.
|
#
151889 |
|
30-Oct-2005 |
alc |
Replace diagnostic printf()s by assertions. Use consistent style for similar assertions.
|
#
151543 |
|
21-Oct-2005 |
ade |
Specifically panic() in the case where pmap_insert_entry() fails to get a new pv under high system load where the available pv entries have been exhausted before the pagedaemon has a chance to wake up to reclaim some.
Prior to this, the NULL pointer dereference ended up causing secondary panics with rather less than useful resulting tracebacks.
Reviewed by: alc, jhb MFC after: 1 week
|
#
149786 |
|
04-Sep-2005 |
alc |
Eliminate unnecessary TLB invalidations by pmap_enter(). Specifically, eliminate TLB invalidations when permissions are relaxed, such as when a read-only mapping is changed to a read/write mapping. Additionally, eliminate TLB invalidations when bits that are ignored by the hardware, such as PG_W ("wired mapping"), are changed.
Reviewed by: tegge
|
#
149768 |
|
03-Sep-2005 |
alc |
Pass a value of type vm_prot_t to pmap_enter_quick() so that it can determine whether the mapping should permit execute access.
|
#
149532 |
|
27-Aug-2005 |
alc |
MFamd64 revision 1.526 When pmap_allocpte() destroys a 2/4MB "superpage" mapping it does not reduce the pmap's resident count accordingly. It should.
|
#
149058 |
|
14-Aug-2005 |
alc |
Simplify the page table page reference counting by pmap_enter()'s change of mapping case.
Eliminate a stale comment from pmap_enter().
Reviewed by: tegge
|
#
148976 |
|
11-Aug-2005 |
alc |
Eliminate unneeded diagnostic code.
Eliminate an unused #include. (Kernel stack allocation and deallocation long ago migrated to the machine-independent code.)
|
#
148971 |
|
11-Aug-2005 |
alc |
Eliminate unneeded diagnostic code.
Reviewed by: tegge
|
#
148952 |
|
11-Aug-2005 |
alc |
Decouple the unrefing of a page table page from the removal of a pv entry. In other words, change pmap_remove_entry() such that it no longer unrefs the page table page. Now, it only removes the pv entry.
Reviewed by: tegge
|
#
148835 |
|
07-Aug-2005 |
alc |
When support for 2MB/4MB pages was added in revision 1.148 an error was made in pmap_protect(): The pmap's resident count should not be reduced unless mappings are removed.
The errant change to the pmap's resident count could result in a later pmap_remove() failing to remove any mappings if the errant change has set the pmap's resident count to zero.
|
#
148539 |
|
29-Jul-2005 |
jhb |
Fix a bug in pmap_protect() in the PAE case where it would try to look up the vm_page_t associated with a pte using only the lower 32-bits of the pte instead of the full 64-bits.
Submitted by: Greg Taleck greg at isilon dot com Reviewed by: jeffr, alc MFC after: 3 days
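The truncation is easy to reproduce outside the kernel. A sketch, using the i386 PAE frame mask (bits 12-35 for a 36-bit physical address); frame_buggy()/frame_fixed() are illustrative helpers contrasting the bug with the fix:

```c
#include <assert.h>
#include <stdint.h>

/* i386 PAE PG_FRAME: physical-frame bits 12-35 of a 64-bit PTE. */
#define PAE_PG_FRAME 0x0000000ffffff000ULL

/*
 * The bug: casting the PTE to 32 bits before masking drops physical
 * address bits 32-35, so the wrong vm_page_t is looked up on machines
 * with more than 4GB of RAM.
 */
static uint64_t frame_buggy(uint64_t pte) {
    return (uint32_t)pte & 0xfffff000u;
}

/* The fix: keep all 64 bits and apply the full frame mask. */
static uint64_t frame_fixed(uint64_t pte) {
    return pte & PAE_PG_FRAME;
}
```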
|
#
147741 |
|
02-Jul-2005 |
delphij |
Remove the CPU_ENABLE_SSE option from the i386 and pc98 architectures, as it has been the default for I686_CPU for almost 3 years, and CPU_DISABLE_SSE always disables it. On the other hand, CPU_ENABLE_SSE does not work for I486_CPU and I586_CPU.
This commit has:
- Removed the option from conf/options.*
- Removed the option and comments from MD NOTES files
- Simplified the CPU_ENABLE_SSE ifdef's so they don't deal with CPU_ENABLE_SSE from kernel configuration. (*)
For most users, this commit should be largely no-op. If you used to place CPU_ENABLE_SSE into your kernel configuration for some reason, it is time to remove it.
(*) The ifdef's of CPU_ENABLE_SSE are not removed at this point, since we need to change it to !defined(CPU_DISABLE_SSE) && defined(I686_CPU), not just !defined(CPU_DISABLE_SSE), if we really want to do so.
Discussed on: -arch Approved by: re (scottl)
|
#
147217 |
|
10-Jun-2005 |
alc |
Introduce a procedure, pmap_page_init(), that initializes the vm_page's machine-dependent fields. Use this function in vm_pageq_add_new_page() so that the vm_page's machine-dependent and machine-independent fields are initialized at the same time.
Remove code from pmap_init() for initializing the vm_page's machine-dependent fields.
Remove stale comments from pmap_init().
Eliminate the Boolean variable pmap_initialized from the alpha, amd64, i386, and ia64 pmap implementations. Its use is no longer required because of the above changes and earlier changes that result in physical memory that is being mapped at initialization time being mapped without pv entries.
Tested by: cognet, kensmith, marcel
|
#
141367 |
|
05-Feb-2005 |
alc |
Implement proper handling of PG_G mappings in pmap_protect(). (I don't believe that this omission mattered before the introduction of MemGuard.)
Reviewed by: tegge@ MFC after: 1 week
|
#
139241 |
|
23-Dec-2004 |
alc |
Modify pmap_enter_quick() so that it expects the page queues to be locked on entry and it assumes the responsibility for releasing the page queues lock if it must sleep.
Remove a bogus comment from pmap_enter_quick().
Using the first change, modify vm_map_pmap_enter() so that the page queues lock is acquired and released once, rather than each time that a page is mapped.
|
#
138897 |
|
15-Dec-2004 |
alc |
In the common case, pmap_enter_quick() completes without sleeping. In such cases, the busying of the page and the unlocking of the containing object by vm_map_pmap_enter() and vm_fault_prefault() is unnecessary overhead. To eliminate this overhead, this change modifies pmap_enter_quick() so that it expects the object to be locked on entry and it assumes the responsibility for busying the page and unlocking the object if it must sleep. Note: alpha, amd64, i386 and ia64 are the only implementations optimized by this change; arm, powerpc, and sparc64 still conservatively busy the page and unlock the object within every pmap_enter_quick() call.
Additionally, this change is the first case where we synchronize access to the page's PG_BUSY flag and busy field using the containing object's lock rather than the global page queues lock. (Modifications to the page's PG_BUSY flag and busy field have asserted both locks for several weeks, enabling an incremental transition.)
|
#
138504 |
|
07-Dec-2004 |
ups |
Move reading the current CPU mask in pmap_lazyfix() to where the thread is protected from migrating to another CPU.
Approved by: sam (mentor) MFC after: 4 weeks
|
#
138303 |
|
02-Dec-2004 |
alc |
For efficiency move the call to pmap_pte_quick() out of pmap_protect()'s and pmap_remove()'s inner loop.
Reviewed by: peter@, tegge@
|
#
138129 |
|
27-Nov-2004 |
das |
Don't include sys/user.h merely for its side-effect of recursively including other headers.
|
#
137784 |
|
16-Nov-2004 |
jhb |
Initiate deorbit burn sequence for 80386 support in FreeBSD: Remove 80386 (I386_CPU) support from the kernel.
|
#
137052 |
|
29-Oct-2004 |
alc |
Implement per-CPU SYSMAPs, i.e., CADDR* and CMAP*, to reduce lock contention within pmap_zero_page() and pmap_copy_page().
|
#
136252 |
|
08-Oct-2004 |
alc |
Make pte_load_store() an atomic operation in all cases, not just i386 PAE.
Restructure pmap_enter() to prevent the loss of a page modified (PG_M) bit in a race between processors. (This restructuring assumes the newly atomic pte_load_store() for correct operation.)
Reviewed by: tegge@ PR: i386/61852
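The property the restructuring relies on can be shown with a C11 atomic exchange. This is a userland model of the idea, not the kernel's pte_load_store() (which uses an xchg-based primitive): swapping the new PTE in and reading the old value must be one indivisible step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Model of pte_load_store(): install the new PTE and return the old
 * one atomically. With a separate load followed by a store, a PG_M
 * (modified) bit set by another processor's MMU in between would be
 * silently overwritten and lost -- the race pmap_enter() closes.
 */
static uint32_t pte_load_store(_Atomic uint32_t *ptep, uint32_t newpte) {
    return atomic_exchange(ptep, newpte);
}
```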
|
#
136101 |
|
03-Oct-2004 |
alc |
Undo revision 1.251. This change was a performance pessimizing work-around that is no longer required. (In fact, it is not clear that it was ever required in HEAD or RELENG_4, only RELENG_3 required a work-around.) Now, as before revision 1.251, if the preexisting PTE is invalid, pmap_enter() does not call pmap_invalidate_page() to update the TLB(s).
Note: Even with this change, the handling of a copy-on-write fault is inefficient, in such cases pmap_enter() calls pmap_invalidate_page() twice.
Discussed with: bde@ PR: kern/16568
|
#
136070 |
|
02-Oct-2004 |
alc |
The physical address stored in the vm_page is page aligned. There is no need to mask off the page offset bits. (This operation made some sense prior to i386/i386/pmap.c revision 1.254 when we passed a physical address rather than a vm_page pointer to pmap_enter().)
|
#
136050 |
|
02-Oct-2004 |
alc |
Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These uses predate the change in the pmap_enter() interface that replaced the page's physical address by the address of its vm_page structure. The PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page structure that was being passed in.
|
#
135939 |
|
29-Sep-2004 |
alc |
Prevent the unexpected deallocation of a page table page while performing pmap_copy(). This entails additional locking in pmap_copy() and the addition of a "flags" parameter to the page table page allocator for specifying whether it may sleep when memory is unavailable. (Already, pmap_copy() checks the availability of memory, aborting if it is scarce. In theory, another CPU could, however, allocate memory between pmap_copy()'s check and the call to the page table page allocator, causing the current thread to release its locks and sleep. This change makes this scenario impossible.)
Reviewed by: tegge@
|
#
135565 |
|
22-Sep-2004 |
alc |
Correct a long-standing error in _pmap_unwire_pte_hold() affecting multiprocessors. Specifically, the error is conditioning the call to pmap_invalidate_page() on whether the pmap is active on the current CPU. This call must be unconditional. Regardless of whether the pmap is active on the CPU performing _pmap_unwire_pte_hold(), it could be active on another CPU. For example, a call to pmap_remove_all() by the page daemon could result in a call to _pmap_unwire_pte_hold() with the pmap inactive on the current CPU and active on another CPU. In such circumstances, failing to call pmap_invalidate_page() results in a stale TLB entry on the other CPU that still maps the now deallocated page table page. What happens next is typically a mysterious panic in pmap_enter() by the other CPU, either "pmap_enter: attempted pmap_enter on 4MB page" or "pmap_enter: pte vanished, va: 0x%lx". Both occur because the former page table page has been recycled and allocated to a new purpose. Consequently, it no longer contains zeroes.
See also Peter's i386/i386/pmap.c revision 1.448 and the related e-mail thread last year.
Many thanks to the engineers at Sandvine for providing clear and concise information until all of the pieces of the puzzle fell into place and for testing an earlier patch.
MT5 Candidate
|
#
135479 |
|
19-Sep-2004 |
alc |
Simplify the reference counting of page table pages. Specifically, use the page table page's wired count rather than its hold count to contain the reference count. My rationale for this change is based on several factors:
1. The machine-independent and pmap layers used the same hold count field in subtly different ways. The machine-independent layer uses the hold count to implement a form of ephemeral wiring that is used by pipes, physio, etc. In other words, subsystems where we wish to temporarily block a page from being swapped out while it is mapped into the kernel's address space. Such pages are never removed from the page queues. Instead, the page daemon recognizes a non-zero hold count to mean "hands off this page." In contrast, page table pages are never in the page queues; they are wired from birth to death. The hold count was being used as a kind of reference count, specifically, the number of valid page table entries within the page. Not surprisingly, these two different uses imply different synchronization rules: in the machine- independent layer access to the hold count requires the page queues lock; whereas in the pmap layer the pmap lock is required. Thus, continued use by the pmap layer of vm_page_unhold(), which asserts that the page queues lock is held, made no sense.
2. _pmap_unwire_pte_hold() was too forgiving in its handling of the wired count. An unexpected wired count on a page table page was ignored and the underlying page leaked.
3. In a word, micro-optimization. Using the wired count exclusively, rather than a combination of the wired and hold counts, makes the code slightly smaller and faster.
Reviewed by: tegge@
|
#
135452 |
|
19-Sep-2004 |
alc |
Remove an outdated assertion from _pmap_allocpte(). (When vm_page_alloc() succeeds, the page's queue field is unconditionally set to PQ_NONE by vm_pageq_remove_nowakeup().)
|
#
135443 |
|
18-Sep-2004 |
alc |
Release the page queues lock earlier in pmap_protect() and pmap_remove() in order to reduce contention.
|
#
135123 |
|
12-Sep-2004 |
alc |
Use an atomic op to update the pte in pmap_protect(). This is to prevent the loss of a page modified (PG_M) bit in a race between processors.
Quoting Tor: One scenario where the old code could cause a lost PG_M bit is a multithreaded linux program (or FreeBSD program using the linuxthreads port) where one thread was starting a subprocess. The thread doing fork() would call vmspace_fork(), which would then call vm_map_copy_entry() which would call pmap_protect() on an area possibly accessed by other threads.
Additionally, make the clearing of PG_M by pmap_protect() unconditional if write permission is removed. Previously, PG_M could persist on a read-only unmanaged page. That seems inconsistent and confusing.
In collaboration with: tegge@
MT5 candidate PR: 61852
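The race described above boils down to a non-atomic read-modify-write of the pte: another CPU can set PG_M between the read and the write-back. A minimal user-space sketch of the fix, using a compare-and-set loop (flag values, names, and the use of the GCC `__atomic` builtins are illustrative here, not the kernel's actual code):

```c
#include <stdint.h>

/* Illustrative flag values only; the real PG_* constants live in pmap.h. */
#define PG_RW 0x002
#define PG_M  0x040

typedef uint32_t pt_entry_t;

/*
 * Remove write permission (and the dirty bit) from a pte.  A plain
 * "*pte = *pte & ~(PG_RW | PG_M)" could overwrite a PG_M bit set by
 * another CPU between the load and the store; the CAS loop retries
 * instead, so no concurrently-set bit is ever lost.
 */
static void
pte_remove_write(volatile pt_entry_t *pte)
{
	pt_entry_t obits, pbits;

	do {
		obits = pbits = *pte;
		pbits &= ~(PG_RW | PG_M);
	} while (!__atomic_compare_exchange_n(pte, &obits, pbits, 0,
	    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
}
```

The same pattern (loop on an atomic compare-and-set of the pte) recurs in several of the commits below that close PG_M/PG_A/PG_W races.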
|
#
135076 |
|
11-Sep-2004 |
scottl |
Revert the previous round of changes to td_pinned. The scheduler isn't fully initialized when the pmap layer tries to call sched_pin() early in the boot, which results in a quick panic. Use ke_pinned instead as was originally done with Tor's patch.
Approved by: julian
|
#
135056 |
|
10-Sep-2004 |
julian |
Make up my mind if cpu pinning is stored in the thread structure or the scheduler-specific extension to it. Put it in the extension, as the implementation details of how the pinning is done needn't be visible outside the scheduler.
Submitted by: tegge (of course!) (with changes) MFC after: 3 days
|
#
134960 |
|
08-Sep-2004 |
alc |
Use atomic ops in pmap_clear_ptes() to prevent SMP races that could result in the loss of an accessed or modified bit from the pte.
In collaboration with: tegge@
MT5 candidate
|
#
134611 |
|
01-Sep-2004 |
alc |
Correction to the previous revision: I forgot to apply the one's complement to a constant. This didn't show up in testing because the broken expression produced the same result in my tests as the correct expression.
|
#
134607 |
|
01-Sep-2004 |
alc |
Modify pmap_pte() to support its use on non-current, non-kernel pmaps without holding Giant.
|
#
134509 |
|
30-Aug-2004 |
alc |
Remove unnecessary check for curthread == NULL.
|
#
134416 |
|
27-Aug-2004 |
obrien |
s/smp_rv_mtx/smp_ipi_mtx/g
Requested by: jhb
|
#
134393 |
|
27-Aug-2004 |
alc |
The machine-independent parts of the virtual memory system always pass a valid pmap to the pmap functions that require one. Remove the checks for NULL. (These checks have their origins in the Mach pmap.c that was integrated into BSD. None of the new code written specifically for FreeBSD included them.)
|
#
134227 |
|
23-Aug-2004 |
peter |
Commit Doug White and Alan Cox's fix for the cross-IPI SMP deadlock. We were obtaining different spin mutexes (which disable interrupts after acquisition) and spin waiting for delivery. For example, KSE processes do LDT operations which use smp_rendezvous, while other parts of the system are doing things like TLB shootdowns with a different mutex.
This patch uses the common smp_rendezvous mutex for all MD home-grown IPIs that spinwait for delivery. Having the single mutex means that the spinloop to acquire it will enable interrupts periodically, thus avoiding the cross-IPI deadlock.
Obtained from: dwhite, alc Reviewed by: jhb
|
#
133292 |
|
07-Aug-2004 |
alc |
With the advent of pmap locking it makes sense for pmap_copy() to be less forgiving about inconsistencies in the source pmap. Also, remove a newline character terminating a nearby panic string.
|
#
133139 |
|
04-Aug-2004 |
jhb |
Remove a potential deadlock on i386 SMP by changing the lazypmap IPI and spin-wait code to use the same spin mutex (smp_tlb_mtx) as the TLB IPI and spin-wait code snippets. This prevents the situation of one CPU doing a TLB shootdown to another CPU that is doing a lazy pmap shootdown, each waiting on the other. With this change, only one of the CPUs would do an IPI and spin-wait at a time.
|
#
133124 |
|
04-Aug-2004 |
alc |
Post-locking clean up/simplification, particularly, the elimination of vm_page_sleep_if_busy() and the page table page's busy flag as a synchronization mechanism on page table pages.
Also, relocate the inline pmap_unwire_pte_hold() so that it can be used to shorten _pmap_unwire_pte_hold() on alpha and amd64. This places pmap_unwire_pte_hold() next to a comment that more accurately describes it than _pmap_unwire_pte_hold().
|
#
132919 |
|
31-Jul-2004 |
alc |
Add pmap locking to pmap_object_init_pt().
|
#
132852 |
|
29-Jul-2004 |
alc |
Advance the state of pmap locking on alpha, amd64, and i386.
- Enable recursion on the page queues lock. This allows calls to vm_page_alloc(VM_ALLOC_NORMAL) and UMA's obj_alloc() with the page queues lock held. Such calls are made to allocate page table pages and pv entries. - The previous change enables a partial reversion of vm/vm_page.c revision 1.216, i.e., the call to vm_page_alloc() by vm_page_cowfault() now specifies VM_ALLOC_NORMAL rather than VM_ALLOC_INTERRUPT. - Add partial locking to pmap_copy(). (As a side-effect, pmap_copy() should now be faster on i386 SMP because it no longer generates IPIs for TLB shootdown on the other processors.) - Complete the locking of pmap_enter() and pmap_enter_quick(). (As of now, all changes to a user-level pmap on alpha, amd64, and i386 are performed with appropriate locking.)
|
#
132505 |
|
21-Jul-2004 |
cognet |
Using NULL as a malloc type when calling contigmalloc() is wrong, so introduce a new malloc type, and use it.
|
#
132365 |
|
18-Jul-2004 |
alc |
Utilize pmap_pte_quick() rather than pmap_pte() in pmap_protect(). The reason being that pmap_pte_quick() requires the page queues lock, which is already held, rather than Giant.
|
#
132318 |
|
17-Jul-2004 |
alc |
Remedy my omission of one change in the previous revision: pmap_remove() must pin the current thread in order to call pmap_pte_quick().
|
#
132316 |
|
17-Jul-2004 |
alc |
- Utilize pmap_pte_quick() rather than pmap_pte() in pmap_remove() and pmap_remove_page(). The reason being that pmap_pte_quick() requires the page queues lock, which is already held, rather than Giant. - Assert that the page queues lock is held in pmap_remove_page() and pmap_remove_pte().
|
#
132220 |
|
15-Jul-2004 |
alc |
Push down the acquisition and release of the page queues lock into pmap_protect() and pmap_remove(). In general, they require the lock in order to modify a page's pv list or flags. In some cases, however, pmap_protect() can avoid acquiring the lock.
|
#
132082 |
|
13-Jul-2004 |
alc |
Push down the acquisition and release of the page queues lock into pmap_remove_pages(). (The implementation of pmap_remove_pages() is optional. If pmap_remove_pages() is unimplemented, the acquisition and release of the page queues lock is unnecessary.)
Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
|
#
131744 |
|
07-Jul-2004 |
alc |
Simplify the control flow in pmap_extract(), enabling the elimination of a PMAP_UNLOCK() call.
|
#
131730 |
|
07-Jul-2004 |
alc |
White space and style changes only.
|
#
131666 |
|
06-Jul-2004 |
alc |
Style changes to pmap_extract().
|
#
131301 |
|
29-Jun-2004 |
peter |
Fix leftover argument to pmap_unuse_pt(). I committed the wrong diff.
Submitted by: Jon Noack <noackjr@alumni.rice.edu>
|
#
131272 |
|
29-Jun-2004 |
peter |
Reduce the size of pv entries by 15%. This saves 1MB of KVA for mapping pv entries per 1GB of user virtual memory. (eg: if a 1GB file were mmapped into 30 processes, that would theoretically reduce the KVA demand by 30MB for pv entries. In reality, though, we limit pv entries so we don't have that many at once.)
We used to store the vm_page_t for the page table page. But we already have the pa of the ptp, or can calculate it fairly quickly. If we wanted to avoid the shift/mask operation in pmap_pde(), we could recover the pa, but that means we have to store it for a while.
This does not measurably change performance.
Suggested by: alc Tested by: alc
|
#
131150 |
|
26-Jun-2004 |
alc |
In case pmap_extract_and_hold() is ever performed on a different pmap than the current one, we need to pin the current thread to its CPU.
Submitted by: tegge@
|
#
130932 |
|
22-Jun-2004 |
alc |
Implement the protection check required by the pmap_extract_and_hold() specification. This enables the elimination of Giant from that function.
Reviewed by: tegge@
|
#
130814 |
|
20-Jun-2004 |
alc |
- Simplify pmap_remove_pages(), eliminating unnecessary indirection. - Simplify the locking of pmap_is_modified() by converting control flow to data flow.
|
#
130765 |
|
20-Jun-2004 |
alc |
Add pmap locking to pmap_is_prefaultable().
|
#
130738 |
|
19-Jun-2004 |
alc |
Remove unused pt_entry_ts. Remove an unneeded semicolon.
|
#
130626 |
|
17-Jun-2004 |
alc |
Do not preset PG_BUSY on VM_ALLOC_NOOBJ pages. Such pages are not accessible through an object. Thus, PG_BUSY serves no purpose.
|
#
130573 |
|
16-Jun-2004 |
alc |
MFamd64 Introduce pmap locking to many of the pmap functions.
|
#
130560 |
|
16-Jun-2004 |
alc |
MFamd64 Remove dead or unneeded code, e.g., spl calls.
|
#
130539 |
|
15-Jun-2004 |
alc |
Remove a stale comment.
|
#
130433 |
|
13-Jun-2004 |
alc |
Prevent the loss of a PG_M bit through an SMP race in pmap_ts_referenced().
|
#
130386 |
|
12-Jun-2004 |
alc |
In a multiprocessor, the PG_W bit in the pte must be changed atomically. Otherwise, the setting of the PG_M bit by one processor could be lost if another processor is simultaneously changing the PG_W bit.
Reviewed by: tegge@
|
#
129817 |
|
28-May-2004 |
alc |
Remove a broken micro-optimization from pmap_enter(). The ill effect of this micro-optimization occurs when we call pmap_enter() to wire an already mapped page. Because of the micro-optimization, we fail to mark the PTE as wired. Later, on teardown of the address space, pmap_remove_pages() destroys the PTE before vm_fault_unwire() has unwired the page. (pmap_remove_pages() is not supposed to destroy wired PTEs. They are destroyed by a later call to pmap_remove().) Thus, the page becomes lost.
Note: The page is not lost if the application called munlock(2), only if it relies on teardown of the address space to unwire its pages.
For the historically inclined, this bug was introduced by a megacommit, revision 1.182, roughly six years ago.
Leak observed by: green@ and dillon independently Patch submitted by: dillon at backplane dot com Reviewed by: tegge@ MFC after: 1 week
|
#
128098 |
|
10-Apr-2004 |
alc |
- pmap_kenter_temporary()'s first parameter, which is a physical address, should be declared as vm_paddr_t not vm_offset_t.
|
#
127875 |
|
05-Apr-2004 |
alc |
Remove avail_start on those platforms that no longer use it. (Only amd64 does anything with it beyond simple initialization.)
|
#
127869 |
|
04-Apr-2004 |
alc |
Remove unused arguments from pmap_init().
|
#
126728 |
|
07-Mar-2004 |
alc |
Retire pmap_pinit2(). Alpha was the last platform that used it. However, ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps, there has been no reason for having this function on Alpha. Briefly, when pmap_growkernel() relied upon the list of all processes to find and update the various pmaps to reflect a growth in the kernel's valid address space, pmap_init2() served to avoid a race between pmap initialization and pmap_growkernel(). Specifically, pmap_pinit2() was responsible for initializing the kernel portions of the pmap and pmap_pinit2() was called after the process structure contained a pointer to the new pmap for use by pmap_growkernel(). Thus, an update to the kernel's address space might be applied to the new pmap unnecessarily, but an update would never be lost.
|
#
125308 |
|
01-Feb-2004 |
alc |
Eliminate all TLB shootdowns by pmap_pte_quick(): By temporarily pinning the thread that calls pmap_pte_quick() and by virtue of the page queues lock being held, we can manage PADDR1/PMAP1 as a CPU private mapping.
The most common effect of this change is to reduce the overhead of the page daemon on multiprocessors.
In collaboration with: tegge
|
#
124956 |
|
25-Jan-2004 |
jeff |
- Now that both schedulers support temporary cpu pinning use this rather than the switchin functions to guarantee that we're operating with the correct tlb entry. - Remove the post copy/zero tlb invalidations. It is faster to invalidate an entry that is known to exist and so it is faster to invalidate after use. However, some architectures implement speculative page table prefetching so we can not be guaranteed that the invalidated entry is still invalid when we re-enter any of these functions. As a result of this we must always invalidate before use to be safe.
|
#
124360 |
|
11-Jan-2004 |
alc |
Include "opt_cpu.h" and related #ifdef's for SSE so that pagezero() actually includes the call to sse2_pagezero().
|
#
123945 |
|
28-Dec-2003 |
alc |
Don't bother clearing PG_ZERO on the page table page in _pmap_allocpte(); it serves no purpose.
|
#
123925 |
|
28-Dec-2003 |
alc |
Don't bother clearing and setting PG_BUSY on page table directory pages.
|
#
123710 |
|
21-Dec-2003 |
alc |
- Significantly reduce the number of preallocated pv entries in pmap_init(). Such a large preallocation is unnecessary and wastes nearly eight megabytes of kernel virtual address space per gigabyte of managed physical memory. - Increase UMA_BOOT_PAGES by two. This enables the removal of pmap_pv_allocf(). (Note: this function was only used during initialization, specifically, after pmap_init() but before pmap_init2(). During pmap_init2(), a new allocator is installed.)
|
#
123650 |
|
18-Dec-2003 |
jhb |
MFamd64: Remove i386_protection_init() and the protection_codes[] array and replace them with a simple if test to turn on PG_RW. i386 != vax.
|
#
122284 |
|
08-Nov-2003 |
alc |
- Similar to post-PAE RELENG_4, split pmap_pte_quick() into two cases, pmap_pte() and pmap_pte_quick(), the distinction being based upon the locks that are held by the caller. When the given pmap is not the current pmap, pmap_pte() should be used when Giant is held and pmap_pte_quick() should be used when the vm page queues lock is held. - When assigning to PMAP1 or PMAP2, include PG_A and PG_M. - Reenable the inlining of pmap_is_current().
In collaboration with: tegge
|
#
121996 |
|
03-Nov-2003 |
jhb |
New i386 SMP code:
- The MP code no longer knows anything specific about an MP Table. Instead, the local APIC code adds CPUs via the cpu_add() function when a local APIC is enumerated by an APIC enumerator. - Don't divide the argument to mp_bootaddress() by 1024 just so that we can turn around and multiply it by 1024 again. - We no longer panic if SMP is enabled but we are booted on a UP machine. - init_secondary(), the asm code between init_secondary() and ap_init() in mpboot.s and ap_init() have all been merged together in C into init_secondary(). - We now use the cpuid feature bits to determine if we should enable PSE, PGE, or VME on each AP. - Due to the change in the implementation of critical sections, acquire the SMP TLB mutex around a slightly larger chunk of code for TLB shootdowns. - Remove some of the debug code from the original SMP implementation that is no longer used or no longer applies to the new APIC code. - Use a temporary hack to disable the ACPI module until the SMP code has been further reorganized to allow ACPI to work as a module again. - Add a DDB command to dump the interesting contents of the IDT.
|
#
121823 |
|
31-Oct-2003 |
jhb |
For physical address regions between 0 and KERNLOAD, allow pmap_mapdev() to use the direct mapped KVA at KERNBASE to service the request. This also allows pmap_mapdev() to be used for such addresses very early during the boot process and might provide some small savings on KVA.
Reviewed by: peter
|
#
121758 |
|
30-Oct-2003 |
peter |
Change the pmap_invalidate_xxx() functions so they test against pmap == kernel_pmap rather than pmap->pm_active == -1. gcc's inliner can remove more code that way. Only kernel_pmap has a pm_active of -1.
|
#
121621 |
|
27-Oct-2003 |
jhb |
Fix pmap_unmapdev() to call pmap_kremove() instead of implementing it directly so that it more closely mirrors pmap_mapdev() which calls pmap_kenter().
|
#
121512 |
|
25-Oct-2003 |
peter |
For the SMP case, flush the TLB at the beginning of the page zero/copy routines. Otherwise we run into trouble with speculative tlb preloads on SMP systems. This effectively defeats Jeff's revision 1.438 optimization (for his pentium4-M laptop) in the SMP case. It breaks other systems, particularly athlon-MP's.
|
#
121494 |
|
25-Oct-2003 |
peter |
GC workaround code for detecting pentium4's and disabling PSE and PG_G. It's been ifdef'ed out for ages.
|
#
121102 |
|
14-Oct-2003 |
peter |
Get some more data if we hit the pmap_enter() thing.
|
#
121055 |
|
13-Oct-2003 |
alc |
- Modify pmap_is_current() to return FALSE when a pmap's page table is in use because a kernel thread is borrowing it. The borrowed page table can change spontaneously, making any dependence on its continued use subject to a race condition. - _pmap_unwire_pte_hold() cannot use pmap_is_current(): If a change is made to a page table page mapping for a borrowed page table, the TLB must be updated.
In collaboration with: tegge
|
#
121024 |
|
12-Oct-2003 |
phk |
Initialize CMAP3 to 0
|
#
120772 |
|
04-Oct-2003 |
alc |
Don't bother setting a page table page's valid field. It is unused and not setting it is consistent with other uses of VM_ALLOC_NOOBJ pages.
|
#
120734 |
|
04-Oct-2003 |
jeff |
- The proper test is CPU_ENABLE_SSE and not CPU_ENABLED_SSE. This effectively disabled the sse2_pagezero() code.
Spotted by: bde
|
#
120722 |
|
03-Oct-2003 |
alc |
Migrate pmap_prefault() into the machine-independent virtual memory layer.
A small helper function pmap_is_prefaultable() is added. This function encapsulates the few lines of pmap_prefault() that actually vary from machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have much in common. Going forward, it's worth considering their merger.
|
#
120654 |
|
01-Oct-2003 |
peter |
Commit Bosko's patch to clean up the PSE/PG_G initialization and avoid problems with some Pentium 4 cpus and some older PPro/Pentium2 cpus. There are several problems, some documented in Intel errata. This patch: 1) moves the kernel to the second page in the PSE case. There is an erratum that says you Must Not point a 4MB page at physical address zero on older cpus. We avoided bugs here due to sheer luck. 2) sets up PSE page tables right from the start in locore, rather than trying to switch from 4K to 4M (or 2M) pages part way through the boot sequence at the same time that we're messing with PG_G.
For some reason, the pmap work over the last 18 months seems to tickle the problems, and the PAE infrastructure changes disturb the cpu bugs even more.
A couple of people have reported a problem with APM bios calls during boot. I'll work with people to get this resolved.
Obtained from: bmilekic
|
#
120623 |
|
01-Oct-2003 |
jeff |
- Hide more #ifdef logic in a new invlcaddr inline. This function flushes the full TLB if you're on an i386, or does an invlpg otherwise.
Glanced at by: peter
|
#
120621 |
|
01-Oct-2003 |
jeff |
- Define an inline pagezero() to select the appropriate full-page zeroing function from one of bzero, i686_pagezero, or sse2_pagezero. - Use pagezero() in the three pmap functions that need to zero full pages.
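That dispatch can be sketched as follows in a runnable user-space model. The feature flag and the SSE2 routine are stand-ins here (in the kernel the choice among bzero, i686_pagezero, and sse2_pagezero is driven by CPU class and the cpuid feature bits):

```c
#include <string.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/*
 * Stand-in for the SSE2 non-temporal-store routine; memset keeps this
 * sketch runnable outside the kernel.
 */
static void
sse2_pagezero(void *page)
{
	memset(page, 0, PAGE_SIZE);
}

static int cpu_has_sse2;	/* would come from cpu_feature & CPUID_SSE2 */

/* One inline selects the best full-page zeroing routine per call site. */
static inline void
pagezero(void *page)
{
	if (cpu_has_sse2)
		sse2_pagezero(page);
	else
		memset(page, 0, PAGE_SIZE);	/* generic bzero path */
}
```

Centralizing the selection in one inline lets the three pmap callers stay oblivious to which routine actually runs.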
|
#
120613 |
|
30-Sep-2003 |
jeff |
- Correct a problem with the last commit. The CMAP ptes need to be zeroed prior to invalidating the TLB to be certain that the processor doesn't keep a cached copy.
Discussed with: pete Paniced: tegge Pointy Hat: The usual spot
|
#
120602 |
|
30-Sep-2003 |
jeff |
- On my Pentium4-M laptop, invlpg takes ~1100 cycles if the page is found in the TLB and ~1600 if it is not. Therefore, it is more efficient to invalidate the TLB after operations that use CMAP rather than before. - So that the TLB is invalidated prior to switching off of a processor, we must change the switchin functions to switchout functions. - Remove td_switchout from the thread and move it to the x86 pcb. - Move the code that calls switchout into swtch.s. These changes make this optimization truly x86 specific.
|
#
120499 |
|
27-Sep-2003 |
alc |
Addendum to the previous revision: If vm_page_alloc() for the page table page fails, perform a VM_WAIT; update some comments in _pmap_allocpte().
|
#
120424 |
|
25-Sep-2003 |
alc |
- Eliminate the pte object. - Use kmem_alloc_nofault() rather than kmem_alloc_pageable() to allocate KVA space for the page directory page(s). Submitted by: tegge
|
#
120322 |
|
21-Sep-2003 |
alc |
Allocate the page table directory page(s) as "no object" pages. (This leaves one explicit use of the pte object.)
|
#
120307 |
|
20-Sep-2003 |
alc |
Reimplement pmap_release() such that it uses the page table rather than the pte object to locate the page table directory pages. (This is another step toward the elimination of the pte object.)
|
#
120040 |
|
13-Sep-2003 |
alc |
Simplify (and micro-optimize) pmap_unuse_pt(): Only one caller, pmap_remove_pte(), passed NULL instead of the required page table page to pmap_unuse_pt(). Compute the necessary page table page in pmap_remove_pte(). Also, remove some unreachable code from pmap_remove_pte().
|
#
119999 |
|
12-Sep-2003 |
alc |
Add a new parameter to pmap_extract_and_hold() that is needed to eliminate Giant from vmapbuf().
Idea from: tegge
|
#
119869 |
|
08-Sep-2003 |
alc |
Introduce a new pmap function, pmap_extract_and_hold(). This function atomically extracts and holds the physical page that is associated with the given pmap and virtual address. Such a function is needed to make the memory mapping optimizations used by, for example, pipes and raw disk I/O MP-safe.
Reviewed by: tegge
|
#
119452 |
|
25-Aug-2003 |
obrien |
Fix copyright comment & FBSDID style nits.
Requested by: bde
|
#
119276 |
|
22-Aug-2003 |
alc |
Eliminate the last (direct) use of vm_page_lookup() on the pte object.
|
#
119139 |
|
19-Aug-2003 |
alc |
Eliminate a possible race condition for multithreaded applications in _pmap_allocpte(): Guarantee that the page table page is zero filled before adding it to the directory. Otherwise, a 2nd, 3rd, etc. thread could access a nearby virtual address and use garbage for the address translation.
Discussed with: peter, tegge
|
#
119054 |
|
17-Aug-2003 |
alc |
Acquire the pte object's mutex when performing vm_page_grab(). Note: It is my long-term objective to eliminate the pte object. In the near term, this does, however, enable the addition of some vm object locking assertions.
|
#
119006 |
|
17-Aug-2003 |
alc |
In pmap_copy(), since we have the page table page's physical address in hand, use PHYS_TO_VM_PAGE() rather than vm_page_lookup().
|
#
118894 |
|
14-Aug-2003 |
alc |
Eliminate pmap_page_lookup() and its uses. Instead, use PHYS_TO_VM_PAGE() to convert the pte's physical address into a vm page.
Reviewed by: peter
|
#
118741 |
|
10-Aug-2003 |
alc |
Rename pmap_changebit() to pmap_clear_ptes() and remove the last parameter. The new name better reflects what the function does and how it is used. The last parameter was always FALSE.
Note: In theory, gcc would perform constant propagation and dead code elimination to achieve the same effect as removing the last parameter, which is always FALSE. In practice, recent versions do not. So, there is little point in letting unused code pessimize execution.
|
#
118562 |
|
06-Aug-2003 |
alc |
Correct a mistake in the previous revision: Reduce the scope of the page queues lock such that it isn't held around the call to get_pv_entry(), which calls uma_zalloc(). At the point of the call to get_pv_entry(), the lock isn't necessary and holding it could lead to recursive acquisition, which isn't allowed.
|
#
118561 |
|
06-Aug-2003 |
alc |
Acquire the page queues lock in pmap_insert_entry(). (I used to believe that the page's busy flag could be relied upon to synchronize access to the pv list. I don't any longer. See, for example, the call to pmap_insert_entry() from pmap_copy().)
|
#
118344 |
|
02-Aug-2003 |
alc |
- Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in pmap_mapdev(). See revision 1.140 of kern/sys_pipe.c for a detailed rationale. Submitted by: tegge - Remove GIANT_REQUIRED from pmap_mapdev().
|
#
118244 |
|
31-Jul-2003 |
bmilekic |
Make sure that when the PV ENTRY zone is created in pmap, that it's created not only with UMA_ZONE_VM but also with UMA_ZONE_NOFREE. In the i386 case in particular, the pmap code would hook a special page allocation routine that allocated from kernel_map and not kmem_map, and so when/if the pageout daemon drained the zones, it could actually push out slabs from the PV ENTRY zone but call UMA's default page_free, which resulted in pages allocated from kernel_map being freed to kmem_map; bad. kmem_free() ignores the return value of the vm_map_delete and just returns. I'm not sure what the exact repercussions could be, but it doesn't look good.
In the PAE case on i386, we also set up a zone in pmap, so be conservative for now and make that zone also ZONE_NOFREE and ZONE_VM. Do this for the pmap zones for the other archs too, although in some cases it may not be entirely necessary. We'd rather be safe than sorry at this point.
Perhaps all UMA_ZONE_VM zones should by default be also UMA_ZONE_NOFREE?
May fix some of silby's crashes on the PV ENTRY zone.
|
#
117929 |
|
23-Jul-2003 |
alc |
Annotate pmap_changebit() as __always_inline. This function was written as a template that when inlined is specialized for the caller through constant value propagation and dead code elimination. Thus, the specialized code that is generated for pmap_clear_reference() et al. avoids several conditional branches inside of a loop.
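The "template" idiom the message describes can be shown with a small user-space sketch (names and flag values are illustrative, not pmap_changebit() itself): one always-inline body, specialized at each call site by constant propagation so the branch folds away in the generated code.

```c
#include <stdint.h>

/* Illustrative flag values only. */
#define PG_A 0x020
#define PG_M 0x040

/*
 * One inline body serves as a template: when 'setem' is a compile-time
 * constant at the call site, the compiler propagates it and deletes the
 * dead branch, producing a specialized routine with no test in the loop.
 */
static inline __attribute__((always_inline)) uint32_t
change_bit(uint32_t pte, uint32_t bit, int setem)
{
	if (setem)
		return (pte | bit);
	return (pte & ~bit);
}

/* Specialization: compiles down to a single and-not, branch folded away. */
static uint32_t
clear_modify(uint32_t pte)
{
	return (change_bit(pte, PG_M, 0));
}
```

The commit's point is that recent gcc versions only perform this folding reliably when the function is forced inline, hence the __always_inline annotation.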
|
#
117372 |
|
09-Jul-2003 |
peter |
unifdef -DLAZY_SWITCH and start to tidy up the associated glue.
|
#
117340 |
|
08-Jul-2003 |
alc |
In pmap_object_init_pt(), the pmap_invalidate_all() should be performed on the caller-provided pmap, not the kernel_pmap. Using the kernel_pmap results in an unnecessary IPI for TLB shootdown on SMPs.
Reviewed by: jake, peter
|
#
117242 |
|
04-Jul-2003 |
alc |
Add vm object locking to pmap_prefault().
|
#
117206 |
|
03-Jul-2003 |
alc |
Background: pmap_object_init_pt() premaps the pages of an object in order to avoid the overhead of later page faults. In general, it implements two cases: one for vnode-backed objects and one for device-backed objects. Only the device-backed case is really machine-dependent, belonging in the pmap.
This commit moves the vnode-backed case into the (relatively) new function vm_map_pmap_enter(). On amd64 and i386, this commit only amounts to code rearrangement. On alpha and ia64, the new machine independent (MI) implementation of the vnode case is smaller and more efficient than their pmap-based implementations. (The MI implementation takes advantage of the fact that objects in -CURRENT are ordered collections of pages.) On sparc64, pmap_object_init_pt() hadn't (yet) been implemented.
|
#
117045 |
|
29-Jun-2003 |
alc |
- Export pmap_enter_quick() to the MI VM. This will permit the implementation of a largely MI pmap_object_init_pt() for vnode-backed objects. pmap_enter_quick() is implemented via pmap_enter() on sparc64 and powerpc. - Correct a mismatch between pmap_object_init_pt()'s prototype and its various implementations. (I plan to keep pmap_object_init_pt() as the MD hook for device-backed objects on i386 and amd64.) - Correct an error in ia64's pmap_enter_quick() and adjust its interface to match the other versions. Discussed with: marcel
|
#
116930 |
|
27-Jun-2003 |
peter |
Tidy up leftover lazy_switch instrumentation that is no longer needed. This cleans up some #ifdef hell.
|
#
116926 |
|
27-Jun-2003 |
peter |
Fix the false IPIs on smp when using LAZY_SWITCH caused by pmap_activate() not releasing the pm_active bit in the old pmap.
|
#
116361 |
|
14-Jun-2003 |
davidxu |
Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
|
#
116355 |
|
14-Jun-2003 |
alc |
Migrate the thread stack management functions from the machine-dependent to the machine-independent parts of the VM. At the same time, this introduces vm object locking for the non-i386 platforms.
Two details:
1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The different machine-dependent implementations used various combinations of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable the guard page, set KSTACK_GUARD_PAGES to 0.
2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In 5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed to vm_page_alloc() or vm_page_grab().
|
#
116328 |
|
14-Jun-2003 |
alc |
Move the *_new_altkstack() and *_dispose_altkstack() functions out of the various pmap implementations into the machine-independent vm. They were all identical.
|
#
116304 |
|
13-Jun-2003 |
alc |
Add vm object locking to pmap_object_init_pt().
|
#
116279 |
|
13-Jun-2003 |
alc |
Add vm object locking to various pagers' "get pages" methods, i386 stack management functions, and a u area management function.
|
#
115683 |
|
02-Jun-2003 |
obrien |
Use __FBSDID().
|
#
114177 |
|
28-Apr-2003 |
jake |
Use inlines for loading and storing page table entries. Use cmpxchg8b for the PAE case to ensure idempotent 64 bit loads and stores.
Sponsored by: DARPA, Network Associates Laboratories
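The problem those inlines solve: with PAE, a pte is 64 bits wide, but a 32-bit CPU performs a plain 64-bit load or store as two 32-bit accesses, so another CPU could observe a half-updated entry. The commit uses cmpxchg8b for this; the sketch below is a user-space model of the same guarantee via the GCC `__atomic` builtins (an assumption for portability here, not the kernel's code):

```c
#include <stdint.h>

typedef uint64_t pt_entry_t;	/* PAE ptes are 64 bits wide */

/*
 * Atomic 64-bit pte load: the whole entry is read in one indivisible
 * operation, so a concurrent pte_store() can never be seen half-done.
 */
static inline pt_entry_t
pte_load(volatile pt_entry_t *ptep)
{
	return (__atomic_load_n(ptep, __ATOMIC_SEQ_CST));
}

/* Atomic 64-bit pte store, the counterpart guarantee for writers. */
static inline void
pte_store(volatile pt_entry_t *ptep, pt_entry_t v)
{
	__atomic_store_n(ptep, v, __ATOMIC_SEQ_CST);
}
```

On i386 with PAE the compiler lowers such 64-bit atomics to a cmpxchg8b loop, which is exactly the idempotent load/store behavior the commit message describes.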
|
#
114013 |
|
25-Apr-2003 |
jake |
Remove harmless invalid cast.
Sponsored by: DARPA, Network Associates Laboratories
|
#
113040 |
|
03-Apr-2003 |
jake |
- Removed APTD and associated macros, it is no longer used.
BANG BANG BANG etc.
Sponsored by: DARPA, Network Associates Laboratories
|
#
112993 |
|
02-Apr-2003 |
peter |
Commit a partial lazy thread switch mechanism for i386. It isn't as lazy as it could be and could do with some more cleanup. Currently it's under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process, ie: we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be: the interrupt overhead is still high and too much still blocks on Giant. There are some debug sysctls, for stats and for an on/off switch.
The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap.
It's not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default.
This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more.
The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it.
One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually separate from the lazy switch stuff.
Glanced at by: jake, jhb
|
#
112841 |
|
30-Mar-2003 |
jake |
- Add support for PAE and more than 4 gigs of ram on x86, dependent on the kernel option 'options PAE'. This will only work with device drivers which either use busdma, or are able to handle 64 bit physical addresses.
Thanks to Lanny Baron from FreeBSD Systems for the loan of a test machine with 6 gigs of ram.
Sponsored by: DARPA, Network Associates Laboratories, FreeBSD Systems
|
#
112837 |
|
29-Mar-2003 |
jake |
- Remove invalid casts.
Sponsored by: DARPA, Network Associates Laboratories
|
#
112836 |
|
29-Mar-2003 |
jake |
- Convert all uses of pmap_pte and get_ptbase to pmap_pte_quick. When accessing an alternate address space this causes 1 page table page at a time to be mapped in, rather than using the recursive mapping technique to map in an entire alternate address space. The recursive mapping technique changes large portions of the address space and requires global tlb flushes, which seem to cause problems when PAE is enabled. This will also allow IPIs to be avoided when mapping in new page table pages using the same technique as is used for pmap_copy_page and pmap_zero_page.
Sponsored by: DARPA, Network Associates Laboratories
|
#
112569 |
|
24-Mar-2003 |
jake |
- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses are larger than virtual addresses, such as i386s with PAE.
- Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long.
Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms.
Sponsored by: DARPA, Network Associates Laboratories
Discussed with: re, phk (cdevsw change)
|
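A minimal sketch of the type split described in the entry above. The widths and the formatting helper are illustrative, not the kernel's actual definitions: with 'options PAE', physical addresses outgrow the 32-bit virtual address space, so they get a distinct, wider type, and printf formats must cast through uintmax_t so addresses above 4G print correctly.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical widths matching the i386-with-PAE case. */
typedef uint64_t vm_paddr_t;	/* physical: may exceed 4G under PAE */
typedef uint32_t vm_offset_t;	/* virtual: stays 32-bit on i386 */

/*
 * The kind of printf-format fix the commit mentions: cast through
 * uintmax_t so a physical address above 4G is printed in full.
 */
static void
format_paddr(char *buf, size_t len, vm_paddr_t pa)
{
	snprintf(buf, len, "%#jx", (uintmax_t)pa);
}
```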
#
112133 |
|
12-Mar-2003 |
jake |
- Added support for multiple page directory pages to pmap_pinit and pmap_release.
- Merged pmap_release and pmap_release_free_page. When pmap_release is called only the page directory page(s) can be left in the pmap pte object, since all page table pages will have been freed by pmap_remove_pages and pmap_remove. In addition, there can only be one reference to the pmap and the page directory is wired, so the page(s) can never be busy. So all there is to do is clear the magic mappings from the page directory and free the page(s).
Sponsored by: DARPA, Network Associates Laboratories
|
#
111585 |
|
27-Feb-2003 |
julian |
Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.
|
#
111493 |
|
25-Feb-2003 |
jake |
- Added inlines pmap_is_current, pmap_is_alternate and pmap_set_alternate for testing and setting the current and alternate address spaces.
- Changed PTDpde and APTDpde to arrays to support multiple page directory pages.
Sponsored by: DARPA, Network Associates Laboratories
|
#
111462 |
|
25-Feb-2003 |
mux |
Cleanup of the d_mmap_t interface.
- Get rid of the useless atop() / pmap_phys_address() detour. The device mmap handlers must now give back the physical address without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned int. Now we properly pass a vm_offset_t * and expect it to be filled by the mmap handler when the mapping was successful. The mmap handler must now return 0 when successful, any other value is considered as an error. Previously, returning -1 was the only way to fail. This change thus accidentally fixes some devices which were bogusly returning errno constants which would have been considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's no longer used.
- Convert all the d_mmap_t consumers to the new API.
I'm still not sure whether we need a __FreeBSD_version bump for this, since we didn't guarantee API/ABI stability until 5.1-RELEASE.
Discussed with: alc, phk, jake
Reviewed by: peter
Compile-tested on: LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on: i386
|
#
111385 |
|
23-Feb-2003 |
jake |
Use the direct mapping of IdlePTD setup in locore for proc0's page directory, instead of allocating another page of kva and mapping it in again. This was likely an oversight in revision 1.174 (cut and paste from pmap_pinit).
Discussed with: peter, tegge
Sponsored by: DARPA, Network Associates Laboratories
|
#
111363 |
|
23-Feb-2003 |
jake |
- Added macros NPGPTD, NBPTD, and NPDEPTD, for dealing with the size of the page directory.
- Use these instead of the magic constants 1 or PAGE_SIZE where appropriate. There are still numerous assumptions that the page directory is exactly 1 page.
Sponsored by: DARPA, Network Associates Laboratories
|
#
111299 |
|
23-Feb-2003 |
jake |
- Added macros PDESHIFT and PTESHIFT, use these instead of magic constants in locore.
- Removed the macros PTESIZE and PDESIZE, use sizeof instead in C.
Sponsored by: DARPA, Network Associates Laboratories
|
#
111272 |
|
22-Feb-2003 |
alc |
The root of the splay tree maintained within the pm_pteobj always refers to the last accessed pte page. Thus, the pm_ptphint is redundant and can be removed.
|
#
110955 |
|
15-Feb-2003 |
alc |
Assert that the kernel map's system mutex is held in pmap_growkernel().
|
#
110845 |
|
14-Feb-2003 |
alc |
- Add a mutex for synchronizing the use of CMAP/CADDR 1 and 2.
- Eliminate small style differences between pmap_zero_page(), pmap_copy_page(), etc.
|
#
110781 |
|
13-Feb-2003 |
peter |
Oops. I mis-remembered about the P4 problems. It was 5.0-DP2 that was shipped with DISABLE_PG_G and DISABLE_PSE, not 5.0-REL. *blush* Disable the code - but still leave it there in case it's still lurking.
|
#
110780 |
|
12-Feb-2003 |
peter |
Turn off PG_PS and PG_G for Pentium-4 cpus at boot time. This is so that we can stop turning off PG_G and PG_PS globally for releases.
|
#
110747 |
|
12-Feb-2003 |
alc |
Remove kptobj. Instead, use VM_ALLOC_NOOBJ.
|
#
110532 |
|
08-Feb-2003 |
alc |
MF alpha - Synchronize access to the allpmaps list with a mutex.
|
#
110480 |
|
06-Feb-2003 |
peter |
Commit some cosmetic changes I had laying around and almost included with another commit. Unwrap a line. Unexpand a pmap_kenter().
|
#
110254 |
|
02-Feb-2003 |
alc |
- Make allpmaps static.
- Use atomic subtract to update the global wired pages count. (See also vm/vm_page.c revision 1.233.)
- Assert that the page queue lock is held in pmap_remove_entry().
|
#
109964 |
|
28-Jan-2003 |
alc |
Merge pmap_testbit() and pmap_is_modified(). The latter is the only caller of the former.
|
#
108533 |
|
01-Jan-2003 |
schweikh |
Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.
|
#
108337 |
|
27-Dec-2002 |
alc |
Assert that the page queues lock is held in pmap_testbit().
|
#
108253 |
|
24-Dec-2002 |
alc |
- Hold the page queues lock around calls to vm_page_wakeup() and vm_page_flag_clear().
|
#
107853 |
|
14-Dec-2002 |
alc |
Add page locking to pmap_mincore().
Submitted (in part) by: tjr@
|
#
107538 |
|
03-Dec-2002 |
alc |
Avoid recursive acquisition of the page queues lock in pmap_unuse_pt().
Approved by: re
|
#
107490 |
|
02-Dec-2002 |
alc |
Hold the page queues lock when calling pmap_unwire_pte_hold() or pmap_remove_pte(). Use vm_page_sleep_if_busy() in _pmap_unwire_pte_hold() so that the page queues lock is released when sleeping.
Approved by: re (blanket)
|
#
107434 |
|
30-Nov-2002 |
alc |
Assert that the page queues lock is held in pmap_changebit() and pmap_ts_referenced().
Approved by: re (blanket)
|
#
107410 |
|
30-Nov-2002 |
alc |
Assert that the page queues lock is held in pmap_page_exists_quick().
Approved by: re (blanket)
|
#
107217 |
|
25-Nov-2002 |
alc |
Assert that the page queues lock is held in pmap_remove_pages().
Approved by: re (blanket)
|
#
107184 |
|
23-Nov-2002 |
alc |
- Assert that the page queues lock is held in pmap_remove_all().
- Fix a diagnostic message and comment in pmap_remove_all().
- Eliminate excessive white space from pmap_remove_all().
Approved by: re
|
#
106838 |
|
13-Nov-2002 |
alc |
Move pmap_collect() out of the machine-dependent code, rename it to reflect its new location, and add page queue and flag locking.
Notes: (1) alpha, i386, and ia64 had identical implementations of pmap_collect() in terms of machine-independent interfaces; (2) sparc64 doesn't require it; (3) powerpc had it as a TODO.
|
#
106753 |
|
11-Nov-2002 |
alc |
- Clear the page's PG_WRITEABLE flag in the i386's pmap_changebit() if we're removing write access from the page's PTEs.
- Export pmap_remove_all() on alpha, i386, and ia64. (It's already exported on sparc64.)
|
#
106567 |
|
07-Nov-2002 |
alc |
Simplify and optimize pmap_object_init_pt(). More specifically, take advantage of the fact that the vm object's list of pages is now ordered to reduce the overhead of finding the desired set of pages to be mapped. (See revision 1.215 of vm/vm_page.c.)
|
#
104354 |
|
02-Oct-2002 |
scottl |
Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates.
Reviewed by: jake, peter, jhb
|
#
104319 |
|
01-Oct-2002 |
phk |
The pmap_prefault_pageorder[] array was initialized with wrong values due to a missing comma.
I have no idea what trouble, if any, this may have caused.
Pointed out by: FlexeLint
|
#
104094 |
|
28-Sep-2002 |
phk |
Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too.
Inspired by: FlexeLint warning #512
|
#
102399 |
|
25-Aug-2002 |
alc |
o Retire pmap_pageable(). It's an advisory routine that none of our platforms implements.
|
#
102241 |
|
21-Aug-2002 |
archie |
Don't use "NULL" when "0" is really meant.
|
#
102041 |
|
18-Aug-2002 |
alc |
o Simplify the ptphint test in pmap_release_free_page(). In other words, make it just like the test in _pmap_unwire_pte_hold().
|
#
101770 |
|
13-Aug-2002 |
alc |
o Remove an unnecessary vm_page_flash() from _pmap_unwire_pte_hold().
Reviewed by: peter
|
#
101751 |
|
12-Aug-2002 |
alc |
o Convert three instances of vm_page_sleep_busy() into vm_page_sleep_if_busy() with page queue locking.
|
#
101721 |
|
12-Aug-2002 |
iedowse |
Use roundup2() to avoid a problem where pmap_growkernel was unable to extend the kernel VM to the maximum possible address of 4G-4M.
PR: i386/22441
Submitted by: Bill Carpenter <carp@world.std.com>
Reviewed by: alc
|
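For reference, roundup2() as found in FreeBSD's <sys/param.h> rounds a value up to the next multiple of a power of two using a mask, so an already-aligned value is left unchanged and the topmost aligned address in a 32-bit space, 4G-4M, is reachable without wrapping past zero. The NBPDR constant and the wrapper function below are just for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* roundup2() from FreeBSD's <sys/param.h>; y must be a power of two.
 * Adding (y - 1) rather than y means an aligned value is unchanged,
 * and 4G-4M can be produced within 32-bit arithmetic. */
#define roundup2(x, y)	(((x) + ((y) - 1)) & ~((y) - 1))

#define NBPDR	0x400000u	/* 4MB: one i386 page-directory entry's span */

/* Round a kernel VA up to the next page-directory boundary. */
static inline uint32_t
pdir_roundup(uint32_t va)
{
	return (roundup2(va, NBPDR));
}
```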
#
101635 |
|
10-Aug-2002 |
alc |
o Remove the setting and clearing of the PG_MAPPED flag. (This flag is obsolete.)
|
#
101352 |
|
05-Aug-2002 |
peter |
Revert rev 1.356 and 1.352 (pmap_mapdev hacks). It wasn't worth the pain.
|
#
101322 |
|
04-Aug-2002 |
peter |
Fix a mistake in 1.352 - I was returning a pointer to the rounded down address. I expect this will fix acpica.
|
#
101294 |
|
04-Aug-2002 |
alc |
o Request a wired page from vm_page_grab() in _pmap_allocpte().
|
#
101280 |
|
03-Aug-2002 |
alc |
o Ask for a prezeroed page in pmap_pinit() for the page directory page.
|
#
101254 |
|
03-Aug-2002 |
alc |
o Don't set PG_MAPPED on the page allocated and mapped in _pmap_allocpte(). (Only set this flag if the mapping has a corresponding pv list entry, which this mapping doesn't.)
|
#
101249 |
|
02-Aug-2002 |
peter |
Take advantage of the fact that there is a small 1MB direct mapped region on x86 in between KERNBASE and the kernel load address. pmap_mapdev() can return pointers to this for devices operating in the isa "hole".
|
#
101197 |
|
02-Aug-2002 |
alc |
o Lock page queue accesses by vm_page_deactivate().
|
#
101105 |
|
31-Jul-2002 |
alc |
o Setting PG_MAPPED and PG_WRITEABLE on pages that are mapped and unmapped by pmap_qenter() and pmap_qremove() is pointless. In fact, it probably leads to unnecessary pmap_page_protect() calls if one of these pages is paged out after unwiring.
Note: setting PG_MAPPED asserts that the page's pv list may be non-empty. Since checking the status of the page's pv list isn't any harder than checking this flag, the flag should probably be eliminated. Alternatively, PG_MAPPED could be set by pmap_enter() exclusively rather than various places throughout the kernel.
|
#
100912 |
|
30-Jul-2002 |
alc |
o Lock page queue accesses by pmap_release_free_page().
|
#
100862 |
|
29-Jul-2002 |
alc |
o Pass VM_ALLOC_WIRED to vm_page_grab() rather than calling vm_page_wire() in pmap_new_thread(), pmap_pinit(), and vm_proc_new().
o Lock page queue accesses by vm_page_free() in pmap_object_init_pt().
|
#
100432 |
|
21-Jul-2002 |
peter |
Move SWTCH_OPTIM_STATS related code out of cpufunc.h. (This sort of stat gathering is not an x86 cpu feature)
|
#
100378 |
|
19-Jul-2002 |
alc |
o Use vm_page_alloc(... | VM_ALLOC_WIRED) in place of vm_page_wire().
|
#
100264 |
|
17-Jul-2002 |
peter |
Avoid trying to set PG_G on the first 4MB when we set up the 4MB page. This solves the SMP panic for at least one system. I'd still like to know why my xeon works though.
Tested by: bmilekic
|
#
99987 |
|
14-Jul-2002 |
alc |
o Lock page queue accesses by vm_page_wire().
|
#
99931 |
|
13-Jul-2002 |
peter |
Two invlpg's slipped through that were not protected from I386_CPU
Pointed out by: dillon
|
#
99930 |
|
13-Jul-2002 |
peter |
invlpg() does not work too well on i386 cpus. Add token i386 support back in to the pmap_zero_page* stuff.
|
#
99929 |
|
13-Jul-2002 |
peter |
Do global shootdowns when switching to/from 4MB pages. I believe we could do a targeted shootdown on a 4MB "page", but this should be safer for now.
Noticed by: tegge
|
#
99928 |
|
13-Jul-2002 |
peter |
Bandaid for SMP. Changing APTDpde without a global shootdown is not safe yet. We used to do a global shootdown here anyway so another day or so shouldn't hurt.
|
#
99925 |
|
13-Jul-2002 |
alc |
o Lock some page queue accesses, in particular, those by vm_page_unwire().
|
#
99890 |
|
12-Jul-2002 |
dillon |
Re-enable the idle page-zeroing code. Remove all IPIs from the idle page-zeroing code as well as from the general page-zeroing code and use a lazy tlb page invalidation scheme based on a callback made at the end of mi_switch.
A number of people came up with this idea at the same time so credit belongs to Peter, John, and Jake as well.
Two-way SMP buildworld -j 5 tests (second run, after stabilization):
    2282.76 real  2515.17 user  704.22 sys    before peter's IPI commit
    2266.69 real  2467.50 user  633.77 sys    after peter's commit
    2232.80 real  2468.99 user  615.89 sys    after this commit
Reviewed by: peter, jhb
Approved by: peter
|
#
99862 |
|
12-Jul-2002 |
peter |
Revive backed out pmap related changes from Feb 2002. The highlights are:
- It actually works this time, honest!
- Fine grained TLB shootdowns for SMP on i386. IPI's are very expensive, so try and optimize things where possible.
- Introduce ranged shootdowns that can be done as a single IPI.
- PG_G support for i386
- Specific-cpu targeted shootdowns. For example, there is no sense in globally purging the TLB cache for where we are stealing a page from the local unshared process on the local cpu. Use pm_active to track this.
- Add some instrumentation for the tlb shootdown code.
- Rip out SMP code from <machine/cpufunc.h>
- Try and fix some very bogus PG_G and PG_PS interactions that were bad enough to cause vm86 bios calls to break. vm86 depended on our existing bugs and this was the cause of the VESA panics last time.
- Fix the silly one-line error that caused the 'panic: bad pte' last time.
- Fix a couple of other silly one-line errors that should have caused more pain than they did.
Some more work is needed:
- pmap_{zero,copy}_page[_idle]. These can be done without IPI's if we have a hook in cpu_switch.
- The IPI handlers need some cleanup. I have a bogus %ds load that can be avoided.
- APTD handling is rather bogus and appears to be a large source of global TLB IPI shootdowns for no really good reason.
I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop. I expect to see a bigger difference when there is significant pageout activity or the system otherwise has memory shortages.
I have backed out a few optimizations that I had been using over the last few days in order to be a little more conservative. I'll revisit these again over the next few days as the dust settles.
New option: DISABLE_PG_G - In case I missed something.
|
#
99852 |
|
12-Jul-2002 |
peter |
Unexpand a couple of 8-space indents that I added in rev 1.285.
|
#
99571 |
|
08-Jul-2002 |
peter |
Add a special page zero entry point intended to be called via the single threaded VM pagezero kthread outside of Giant. For some platforms, this is really easy since it can just use the direct mapped region. For others, IPI sending is involved or there are other issues, so grab Giant when needed.
We still have preemption issues to deal with, but Alan Cox has an interesting suggestion on how to minimize the problem on x86.
Use Luigi's hack for preserving the (lack of) priority.
Turn the idle zeroing back on since it can now actually do something useful outside of Giant in many cases.
|
#
99561 |
|
07-Jul-2002 |
peter |
Fix a hideous TLB bug. pmap_unmapdev neglected to remove the device mappings from the page tables, which were mapped with PG_G! We could reuse the page table entry for another mapping (pmap_mapdev) but it would never have cleared any remaining PG_G TLB entries.
|
#
99559 |
|
07-Jul-2002 |
peter |
Collect all the (now equivalent) pmap_new_proc/pmap_dispose_proc/ pmap_swapin_proc/pmap_swapout_proc functions from the MD pmap code and use a single equivalent MI version. There are other cleanups needed still.
While here, use the UMA zone hooks to keep a cache of preinitialized proc structures handy, just like the thread system does. This eliminates one dependency on 'struct proc' being persistent even after being freed. There are some comments about things that can be factored out into ctor/dtor functions if it is worth it. For now they are mostly just doing statistics to get a feel of how it is working.
|
#
99415 |
|
04-Jul-2002 |
peter |
Diff reduction (microoptimization) with another WIP. Move the frame calculation in get_ptbase() to a little later on.
|
#
99398 |
|
03-Jul-2002 |
julian |
Don't free pages we never allocated..
My eyes opened by: Matt
|
#
99397 |
|
03-Jul-2002 |
julian |
Slight restatement of the code and remove some unused variables.
|
#
99381 |
|
03-Jul-2002 |
julian |
Add comments and slightly rearrange the thread stack assignment code to try to make it less obscure.
|
#
99380 |
|
03-Jul-2002 |
julian |
Remove vestiges of old code... These functions are always called on new memory so they can not already be set up, so don't bother testing for that. (This was left over from before we used UMA (which is cool))
|
#
99072 |
|
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous.
To come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
#
98902 |
|
27-Jun-2002 |
arr |
Fix for the problem stated below by Tor Egge: (from: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=832566+0+current/freebsd-current)
"Too many pages were prefaulted in pmap_object_init_pt, thus the wrong physical page was entered in the pmap for the virtual address where the .dynamic section was supposed to be."
Submitted by: tegge
Approved by: tegge's patches never fail
|
#
98892 |
|
26-Jun-2002 |
iedowse |
Avoid using the 64-bit vm_pindex_t in a few places where 64-bit types are not required, as the overhead is unnecessary:
o In the i386 pmap_protect(), `sindex' and `eindex' represent page indices within the 32-bit virtual address space.
o In swp_pager_meta_build() and swp_pager_meta_ctl(), use a temporary variable to store the low few bits of a vm_pindex_t that gets used as an array index.
o vm_uiomove() uses `osize' and `idx' for page offsets within a map entry.
o In vm_object_split(), `idx' is a page offset within a map entry.
|
#
98824 |
|
25-Jun-2002 |
iedowse |
Complete the initial set of VM changes required to support full 64-bit file sizes. This step simply addresses the remaining overflows, and does not attempt to optimise performance. The details are:
o Use a 64-bit type for the vm_object `size' and the size argument to vm_object_allocate().
o Use the correct type for index variables in dev_pager_getpages(), vm_object_page_clean() and vm_object_page_remove().
o Avoid an overflow in the i386 pmap_object_init_pt().
|
#
98361 |
|
17-Jun-2002 |
jeff |
- Introduce the new M_NOVM option which tells uma to only check the currently allocated slabs and bucket caches for free items. It will not go ask the vm for pages. This differs from M_NOWAIT in that it not only doesn't block, it doesn't even ask.
- Add a new zcreate option ZONE_VM, that sets the BUCKETCACHE zflag. This tells uma that it should only allocate buckets out of the bucket cache, and not from the VM. It does this by using the M_NOVM option to zalloc when getting a new bucket. This is so that the VM doesn't recursively enter itself while trying to allocate buckets for vm_map_entry zones. If there are already allocated buckets when we get here we'll still use them but otherwise we'll skip it.
- Use the ZONE_VM flag on vm map entries and pv entries on x86.
|
#
95710 |
|
29-Apr-2002 |
peter |
Tidy up some loose ends.
i386/ia64/alpha - catch up to sparc64/ppc:
- replace pmap_kernel() with refs to kernel_pmap
- change kernel_pmap pointer to (&kernel_pmap_store) (this is a speedup since ld can set these at compile/link time)
all platforms (as suggested by jake):
- gc unused pmap_reference
- gc unused pmap_destroy
- gc unused struct pmap.pm_count (we never used pm_count - we track address space sharing at the vmspace)
|
#
94777 |
|
15-Apr-2002 |
peter |
Pass vm_page_t instead of physical addresses to pmap_zero_page[_area]() and pmap_copy_page(). This gets rid of a couple more physical addresses in upper layers, with the eventual aim of supporting PAE and dealing with the physical addressing mostly within pmap. (We will need either 64 bit physical addresses or page indexes, possibly both depending on the circumstances. Leaving this to pmap itself gives more flexibility.)
Reviewed by: jake
Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)
|
#
92846 |
|
20-Mar-2002 |
jeff |
Remove references to vm_zone.h and switch over to the new uma API.
|
#
92770 |
|
20-Mar-2002 |
alfred |
Remove __P.
|
#
92654 |
|
19-Mar-2002 |
jeff |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator.
Reviewed by: arch@
|
#
91471 |
|
28-Feb-2002 |
silby |
Fix a minor swap leak.
Previously, the UPAGES/KSTACK area of processes/threads would leak memory at the time that a previously swapped process was terminated. Luckily, the leak was only 12K/proc, so it was unlikely to be a major problem unless you had an undersized swap partition.
Submitted by: dillon
Reviewed by: silby
MFC after: 1 week
|
#
91403 |
|
27-Feb-2002 |
silby |
Fix a horribly suboptimal algorithm in the vm_daemon.
In order to determine what to page out, the vm_daemon checks reference bits on all pages belonging to all processes. Unfortunately, the algorithm used reacted badly with shared pages; each shared page would be checked once per process sharing it; this caused an O(N^2) growth of tlb invalidations. The algorithm has been changed so that each page will be checked only 16 times.
Prior to this change, a fork/sleepbomb of 1300 processes could cause the vm_daemon to take over 60 seconds to complete, effectively freezing the system for that time period. With this change in place, the vm_daemon completes in less than a second. Any system with hundreds of processes sharing pages should benefit from this change.
Note that the vm_daemon is only run when the system is under extreme memory pressure. It is likely that many people with loaded systems saw no symptoms of this problem until they reached the point where swapping began.
Special thanks go to dillon, peter, and Chuck Cranor, who helped me get up to speed with vm internals.
PR: 33542, 20393
Reviewed by: dillon
MFC after: 1 week
|
#
91367 |
|
27-Feb-2002 |
peter |
Back out all the pmap related stuff I've touched over the last few days. There is some unresolved badness that has been eluding me, particularly affecting uniprocessor kernels. Turning off PG_G helped (which is a bad sign) but didn't solve it entirely. Userland programs still crashed.
|
#
91358 |
|
27-Feb-2002 |
peter |
Bandaid for the Uniprocessor kernel exploding. This makes a UP kernel boot and run (and indeed I am committing from it) instead of exploding during the int 0x15 call from inside the atkbd driver to get the keyboard repeat rates.
|
#
91353 |
|
27-Feb-2002 |
alfred |
clarify panic message
|
#
91344 |
|
27-Feb-2002 |
peter |
Jake further reduced IPI shootdowns on sparc64 in loops by using ranged shootdowns in a couple of key places. Do the same for i386. This also hides some physical addresses from higher levels and has it use the generic vm_page_t's instead. This will help for PAE down the road.
Obtained from: jake (MI code, suggestions for MD part)
|
#
91341 |
|
26-Feb-2002 |
dillon |
didn't quite undo the last reversion. This gets it.
|
#
91329 |
|
26-Feb-2002 |
dillon |
revert compatibility fix temporarily (thought it would not break anything leaving it in).
|
#
91322 |
|
26-Feb-2002 |
dillon |
Make peter's commit compatible with interrupt-enabled critical_enter() and exit(), which has already solved the problem in regards to deadlocked IPI's.
|
#
91260 |
|
25-Feb-2002 |
peter |
Work-in-progress commit syncing up pmap cleanups that I have been working on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with a single IPI, since the IPI is very expensive. Adjust some callers that used to trigger this inside tight loops to do a ranged shootdown at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with PSE and the 4MB pages at the start of the kernel. This should solve some rumored strangeness about stale PG_G entries getting stuck underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines. gcc seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers. I will fix this again shortly.
This has been working fairly well for me for a while, but I have tweaked it again prior to commit since my last major testing round. The only outstanding problem that I know of is PG_G related, which is why there is an option for it (not on by default for SMP). I have seen a world speedups by a few percent (as much as 4 or 5% in one case) but I have *not* accurately measured this - I am a bit sceptical of these numbers.
|
#
90998 |
|
20-Feb-2002 |
peter |
Pass me the pointy hat please. Be sure to return a value in a non-void function. I've been running with this buried in the mountains of compiler output for about a month on my desktop.
|
#
90947 |
|
19-Feb-2002 |
peter |
Some more tidy-up of stray "unsigned" variables instead of p[dt]_entry_t etc.
|
#
88903 |
|
05-Jan-2002 |
peter |
Convert a bunch of 1 << PCPU_GET(cpuid) to PCPU_GET(cpumask).
|
#
88838 |
|
02-Jan-2002 |
peter |
Allow a specific setting for pv entries. This avoids the need to guess (or calculate by hand) the effect of interactions between shpgperproc, physical ram size, maxproc, maxdsiz, etc.
|
#
88744 |
|
31-Dec-2001 |
dillon |
Grrr. The tlb code is strewn over 3 files and I misread it. Revert the last change (it was a NOP), and remove the XXX comments that no longer apply.
|
#
88742 |
|
31-Dec-2001 |
dillon |
You know those 'XXX what about SMP' comments in pmap_kenter()? Well, they were right. Fix both kenter() and kremove() for SMP by ensuring that the tlb is flushed on other cpu's. This will directly solve random-corruption panic issues in -stable when it is MFC'd. Better to be safe than sorry, we can optimize this later.
Original Suspicion by: peter
Maybe MFC: immediately on re's permission
|
#
88245 |
|
20-Dec-2001 |
peter |
Replace a bunch of:
	for (pv = TAILQ_FIRST(&m->md.pv_list); pv; pv = TAILQ_NEXT(pv, pv_list)) {
with:
	TAILQ_FOREACH(pv, &m->md.pv_list, pv_list) {
|
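A small self-contained demonstration of the equivalence behind the conversion above. The struct contents and list values here are invented for the example; only the iteration idioms match the pmap code. Both loops visit the same entries, TAILQ_FOREACH just says so more directly.

```c
#include <assert.h>
#include <sys/queue.h>

/* Minimal model of the pv list; pv_va and the values are invented. */
struct pv_entry {
	int pv_va;
	TAILQ_ENTRY(pv_entry) pv_list;
};
TAILQ_HEAD(pv_head, pv_entry);

/* Hand-rolled loop the commit replaces. */
static int
pv_sum_old(struct pv_head *h)
{
	struct pv_entry *pv;
	int sum = 0;

	for (pv = TAILQ_FIRST(h); pv; pv = TAILQ_NEXT(pv, pv_list))
		sum += pv->pv_va;
	return (sum);
}

/* Equivalent TAILQ_FOREACH form. */
static int
pv_sum_new(struct pv_head *h)
{
	struct pv_entry *pv;
	int sum = 0;

	TAILQ_FOREACH(pv, h, pv_list)
		sum += pv->pv_va;
	return (sum);
}
```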
#
88240 |
|
20-Dec-2001 |
peter |
Fix some whitespace nits, and a minor error that I made in some unused #ifdef DEBUG code (VM_MAXUSER_ADDRESS vs UPT_MAX_ADDRESS).
|
#
87702 |
|
11-Dec-2001 |
jhb |
Overhaul the per-CPU support a bit:
- The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD fields.
- The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list.
Tested on: alpha, i386 Reviewed by: peter, jake
|
#
87637 |
|
10-Dec-2001 |
peter |
Delete some leftover code from a bygone age. We don't have an array of IdlePTDs anymore and don't do the PTD[MPPTDI] swapping etc.
|
#
86486 |
|
16-Nov-2001 |
peter |
Fix the non-KSTACK_GUARD case. It has been broken since the KSE commit: ptek was not being initialized.
|
#
86485 |
|
16-Nov-2001 |
peter |
Start bringing i386/pmap.c into line with cleanups that were done to the alpha pmap. In particular:
- pd_entry_t and pt_entry_t are now u_int32_t instead of a pointer. This is to enable cleaner PAE and x86-64 support down the track, so that we can change the pd_entry_t/pt_entry_t types to 64-bit entities.
- Terminate "unsigned *ptep, pte" with extreme prejudice and use the correct pt_entry_t/pd_entry_t types.
- Various other cosmetic changes to match cleanups elsewhere.
- This eliminates a boatload of casts.
- Use VM_MAXUSER_ADDRESS in place of UPT_MIN_ADDRESS in a couple of places where we're testing user address space limits. Assuming the page tables start directly after the end of user space is not a safe assumption.
There is still more to go.
|
#
86443 |
|
16-Nov-2001 |
peter |
Oops, I accidentally merged a whitespace error from the original commit (whitespace at end of line in rev 1.264 pmap.c). Fix them all.
|
#
86439 |
|
16-Nov-2001 |
peter |
Converge/fix some debug code (#if 0'ed on alpha, but whatever):
- use NPTEPG/NPDEPG instead of magic 1024 (important for PAE)
- use pt_entry_t instead of unsigned (important for PAE)
- use vm_offset_t instead of unsigned for va's (important for x86-64)
|
#
85806 |
|
01-Nov-2001 |
peter |
Skip PG_UNMANAGED pages when we're shooting everything down to try and reclaim pv_entries. PG_UNMANAGED pages don't have pv_entries to reclaim.
Reported by: David Xu <davidx@viasoft.com.cn>
|
#
85762 |
|
31-Oct-2001 |
dillon |
Don't let pmap_object_init_pt() exhaust all available free pages (allocating pv entries w/ zalloci) when called in a loop due to an madvise(). It is possible to completely exhaust the free page list and cause a system panic when an expected allocation fails.
|
#
84934 |
|
14-Oct-2001 |
tegge |
Reduce the number of TLB shootdowns caused by a call to pmap_qenter() from the number of pages mapped to 1.
Reviewed by: dillon
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2. Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
82624 |
|
31-Aug-2001 |
peter |
Do a style cleanup pass for the pmap_{new,dispose,etc}_proc() functions to get them closer to the KSE tree. I will do the other $machine/pmap.c files shortly.
|
#
82310 |
|
25-Aug-2001 |
julian |
Add another comment. check for 'teh's this time..
|
#
82309 |
|
25-Aug-2001 |
peter |
Optionize UPAGES for the i386. As part of this I split some of the low level implementation stuff out of machine/globaldata.h to avoid exposing UPAGES to lots more places. The end result is that we can double the kernel stack size with 'options UPAGES=4' etc.
This is mainly being done for the benefit of a MFC to RELENG_4 at some point. -current doesn't really need this so much since each interrupt runs on its own kstack.
|
#
82127 |
|
22-Aug-2001 |
dillon |
Move most of the kernel submap initialization code, including the timeout callwheel and buffer cache, out of the platform-specific areas and into the machine-independent area. i386 and alpha adjusted here. Other cpus can be fixed piecemeal.
Reviewed by: freebsd-smp, jake
|
#
82121 |
|
21-Aug-2001 |
peter |
Introduce two new sysctl's.. vm.kvm_size and vm.kvm_free. These are purely informational and can give some advance indications of tuning problems. These are i386 only for now as it seems that the i386 is the only one suffering kvm pressure.
|
#
80431 |
|
26-Jul-2001 |
peter |
Make PMAP_SHPGPERPROC tunable. One shouldn't need to recompile a kernel for this, since it is easy to run into with large systems with lots of shared mmap space.
Obtained from: yahoo
|
#
79224 |
|
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
77082 |
|
23-May-2001 |
alfred |
pmap_mapdev needs the vm_mtx; acquire it if not already locked.
|
#
76941 |
|
21-May-2001 |
jhb |
Sort includes.
|
#
76827 |
|
18-May-2001 |
alfred |
Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level vm operations.
Faults cannot be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
|
#
76166 |
|
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
#
74927 |
|
28-Mar-2001 |
jhb |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
#
74283 |
|
15-Mar-2001 |
peter |
Kill the 4MB kernel limit dead. [I hope :-)]. For UP, we were using $tmp_stk as a stack from the data section. If the kernel text section grew beyond ~3MB, the data section would be pushed beyond the temporary 4MB P==V mapping. This would cause the trampoline up to high memory to fault. The hack workaround I did was to use all of the page table pages that we already have while preparing the initial P==V mapping, instead of just the first one. For SMP, the AP bootstrap process suffered the same sort of problem and got the same treatment.
MFC candidate - this breaks on 4.x just the same..
Thanks to: Richard Todd <rmtodd@ichotolot.servalan.com>
|
#
73936 |
|
07-Mar-2001 |
jhb |
Unrevert the pmap_map() changes. They weren't broken on x86.
Sense beaten into me by: peter
|
#
73903 |
|
06-Mar-2001 |
jhb |
Back out the pmap_map() change for now, it isn't completely stable on the i386.
|
#
73862 |
|
06-Mar-2001 |
jhb |
- Rework pmap_map() to take advantage of direct-mapped segments on supported architectures such as the alpha. This allows us to save on kernel virtual address space, TLB entries, and (on the ia64) VHPT entries. pmap_map() now modifies the passed in virtual address on architectures that do not support direct-mapped segments to point to the next available virtual address. It also returns the actual address that the request was mapped to.
- On the IA64 don't use a special zone of PV entries needed for early calls to pmap_kenter() during pmap_init(). This gets us in trouble because we end up trying to use the zone allocator before it is initialized. Instead, with the pmap_map() change, the number of needed PV entries is small enough that we can get by with a static pool that is used until pmap_init() is complete.
Submitted by: dfr Debugging help: peter Tested by: me
|
#
72930 |
|
22-Feb-2001 |
peter |
Activate USER_LDT by default. The new thread libraries are going to depend on this. The linux ABI emulator tries to use it for some linux binaries too. VM86 had a bigger cost than this and it was made default a while ago.
Reviewed by: jhb, imp
|
#
71816 |
|
29-Jan-2001 |
jhb |
Remove unnecessary locking to protect the p_upages_obj and p_addr pointers.
|
#
71526 |
|
24-Jan-2001 |
jhb |
- Proc locking.
- P_INMEM -> PS_INMEM.
|
#
71350 |
|
21-Jan-2001 |
des |
First step towards an MP-safe zone allocator:
- have zalloc() and zfree() always lock the vm_zone.
- remove zalloci() and zfreei(), which are now redundant.
Reviewed by: bmilekic, jasone
|
#
71318 |
|
21-Jan-2001 |
jake |
Remove the per-cpu pages used for copy and zero-ing pages of memory for SMP; just use the same ones as UP. These weren't used without holding Giant anyway, and the routines that use them would have to be protected from pre-emption to avoid migrating cpus.
|
#
71098 |
|
16-Jan-2001 |
peter |
Stop doing runtime checking on i386 cpus for cpu class. The cpu is slow enough as it is, without having to constantly check that it really is an i386 still. It was possible to compile out the conditionals for faster cpus by leaving out 'I386_CPU', but it was not possible to unconditionally compile for the i386. You got the runtime checking whether you wanted it or not. This makes I386_CPU mutually exclusive with the other cpu types, and tidies things up a little in the process.
Reviewed by: alfred, markm, phk, benno, jlemon, jhb, jake, grog, msmith, jasone, dcs, des (and a bunch more people who encouraged it)
|
#
70861 |
|
10-Jan-2001 |
jake |
Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other than curproc.
|
#
69947 |
|
12-Dec-2000 |
jake |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provide macros for the flags passed to specify shared, exclusive or release, which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
|
#
69022 |
|
22-Nov-2000 |
jake |
Protect the following with a lockmgr lock:
allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid
Reviewed by: jhb Obtained from: BSD/OS and netbsd
|
#
68441 |
|
07-Nov-2000 |
alfred |
Protect against an infinite loop when prefaulting pages. This can happen when the vm system maps past the end of an object or tries to map a zero length object, the pmap layer misses the fact that offsets wrap into negative numbers and we get stuck.
Found by: Joost Pol aka Nohican <nohican@marcella.niets.org> Submitted by: tegge
|
#
67247 |
|
17-Oct-2000 |
ps |
Implement write combining for crashdumps. This is useful when write caching is disabled on both SCSI and IDE disks where large memory dumps could take up to an hour to complete.
Taking an i386 scsi based system with 512MB of ram and timing (in seconds) how long it took to complete a dump, the following results were obtained:
Before:                  After:
WCE   TIME               WCE   TIME
------------------       ------------------
 1    141.820972          1    15.600111
 0    797.265072          0    65.480465
Obtained from: Yahoo! Reviewed by: peter
|
#
66696 |
|
05-Oct-2000 |
jhb |
Replace loadandclear() with atomic_readandclear_int().
|
#
66277 |
|
22-Sep-2000 |
ps |
Remove the NCPU, NAPIC, NBUS, NINTR config options. Make NAPIC, NBUS, NINTR dynamic and set NCPU to a maximum of 16 under SMP.
Reviewed by: peter
|
#
66069 |
|
19-Sep-2000 |
eivind |
Better error message when booting an SMP kernel on a UP system.
|
#
65557 |
|
06-Sep-2000 |
jasone |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|
#
64728 |
|
16-Aug-2000 |
tegge |
Prepare for a cleanup of pmap module API pollution introduced by the suggested fix in PR 12378.
Keep track of all existing pmaps independent of existing processes.
This allows for a process to temporarily connect to a different address space without the risk of missing an update of the original address space if the kernel grows.
pmap_pinit2() is no longer needed on the i386 platform but is left as a stub until the alpha pmap code is updated.
PR: 12378
|
#
61081 |
|
29-May-2000 |
dillon |
This is a cleanup patch to Peter's new OBJT_PHYS VM object type and sysv shared memory support for it. It implements a new PG_UNMANAGED flag that has slightly different characteristics from PG_FICTITIOUS.
A new sysctl, kern.ipc.shm_use_phys, has been added to enable the use of physically-backed sysv shared memory rather than swap-backed. Physically backed shm segments are not tracked with PV entries, allowing programs which use a large shm segment as a rendezvous point to operate without eating an insane amount of KVM in the PV entry management. Read: Oracle.
Peter's OBJT_PHYS object will also allow us to eventually implement page-table sharing and/or 4MB physical page support for such segments. We're half way there.
|
#
61074 |
|
29-May-2000 |
dfr |
Brucify the pmap_enter_temporary() changes.
|
#
61036 |
|
28-May-2000 |
dfr |
Add a new pmap entry point, pmap_enter_temporary() to be used during dumps to create temporary page mappings. This replaces the use of CADDR1 which is fairly x86 specific.
Reviewed by: dillon
|
#
60755 |
|
21-May-2000 |
peter |
Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or its shadow pv_table entry.
Inspired by: John Dyson's 1998 patches.
Also: Eliminate pv_table as a separate thing and build it into a machine dependent part of vm_page_t. This eliminates having a separate set of structures that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha).
Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits).
Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files.
Reviewed by: dillon
|
#
59440 |
|
20-Apr-2000 |
luoqi |
IO apics are not necessarily page aligned; they are only required to be aligned on a 1K boundary. Correct a typo that would cause problems with a second IO apic.
Pointed out by: Steve Passe <smp.csn.net>
|
#
58132 |
|
16-Mar-2000 |
phk |
Eliminate the undocumented, experimental, non-delivering and highly dangerous MAX_PERF option.
|
#
57996 |
|
13-Mar-2000 |
bde |
Disabled the optimization of not doing an invltlb_1pg() when changing pte's from zero. The TLB is supposed to be invalidated when pte's are changed _to_ zero, but this doesn't occur in all cases for global pages (PG_G stops invltlb() from working, and invltlb_1pg() is not used enough).
PR: 14141, 16568 Submitted by: dillon
|
#
53433 |
|
19-Nov-1999 |
phk |
Use LIST_FOREACH to traverse the allproc list.
Submitted by: Jake Burkholder jake@checker.org
|
#
52635 |
|
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial, borderline-silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
51183 |
|
11-Sep-1999 |
peter |
Make pmap_mapdev() deal with non-page-aligned requests. Add a corresponding pmap_unmapdev() to release the KVM back to kernel_map.
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
49635 |
|
11-Aug-1999 |
alc |
_pmap_allocpte: If the pte page isn't PQ_NONE, panic rather than silently covering up the problem.
|
#
49591 |
|
10-Aug-1999 |
alc |
pmap_remove_pages: Add KASSERT to detect out of range access to the pv_table and report the errant pte before it's overwritten.
|
#
49337 |
|
31-Jul-1999 |
alc |
pmap_object_init_pt: Verify that object != NULL.
|
#
49304 |
|
31-Jul-1999 |
alc |
Add parentheses for clarity.
Submitted by: dillon
|
#
48963 |
|
21-Jul-1999 |
alc |
Fix the following problem:
When creating new processes (or performing exec), the new page directory is initialized too early. The kernel might grow before p_vmspace is initialized for the new process. Since pmap_growkernel doesn't yet know about the new page directory, it isn't updated, and subsequent use causes a failure.
The fix is (1) to clear p_vmspace early, to stop pmap_growkernel from stomping on memory, and (2) to defer part of the initialization of new page directories until p_vmspace is initialized.
PR: kern/12378 Submitted by: tegge Reviewed by: dfr
|
#
48677 |
|
08-Jul-1999 |
mckusick |
These changes appear to give us benefits with both small (32MB) and large (1G) memory machine configurations. I was able to run 'dbench 32' on a 32MB system without bringing the machine to a grinding halt.
* buffer cache hash table now dynamically allocated. This will have no effect on memory consumption for smaller systems and will help scale the buffer cache for larger systems.
* minor enhancement to pmap_clearbit(). I noticed that all the calls to it used constant arguments. Making it an inline allows the constants to propagate to deeper inlines and should produce better code.
* removal of inherent vfs_ioopt support through the emplacement of appropriate #ifdef's, with John's permission. If we do not find a use for it by the end of the year we will remove it entirely.
* removal of getnewbufloops* counters & sysctl's - no longer necessary for debugging, getnewbuf() is now optimal.
* buffer hash table functions removed from sys/buf.h and localized to vfs_bio.c
* VFS_BIO_NEED_DIRTYFLUSH flag and support code added ( bwillwrite() ), allowing processes to block when too many dirty buffers are present in the system.
* removal of a softdep test in bdwrite() that is no longer necessary now that bdwrite() no longer attempts to flush dirty buffers.
* slight optimization added to bqrelse() - there is no reason to test for available buffer space on B_DELWRI buffers.
* addition of reverse-scanning code to vfs_bio_awrite(). vfs_bio_awrite() will attempt to locate clusterable areas in both the forward and reverse direction relative to the offset of the buffer passed to it. This will probably not make much of a difference now, but I believe we will start to rely on it heavily in the future if we decide to shift some of the burden of the clustering closer to the actual I/O initiation.
* Removal of the newbufcnt and lastnewbuf counters that Kirk added. They do not fix any race conditions that haven't already been fixed by the gbincore() test done after the only call to getnewbuf(). getnewbuf() is a static, so there is no chance of it being misused by other modules. ( Unless Kirk can think of a specific thing that this code fixes. I went through it very carefully and didn't see anything ).
* removal of VOP_ISLOCKED() check in flushbufqueues(). I do not think this check is necessary, the buffer should flush properly whether the vnode is locked or not. ( yes? ).
* removal of extra arguments passed to getnewbuf() that are not necessary.
* missed cluster_wbuild() that had to be a cluster_wbuild_wb() in vfs_cluster.c
* vn_write() now calls bwillwrite() *PRIOR* to locking the vnode, which should greatly aid flushing operations in heavy load situations - both the pageout and update daemons will be able to operate more efficiently.
* removal of b_usecount. We may add it back in later but for now it is useless. Prior implementations of the buffer cache never had enough buffers for it to be useful, and current implementations which make more buffers available might not benefit relative to the amount of sophistication required to implement a b_usecount. Straight LRU should work just as well, especially when most things are VMIO backed. I expect that (even though John will not like this assumption) directories will become VMIO backed at some point soon.
Submitted by: Matthew Dillon <dillon@backplane.com> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
|
#
48144 |
|
23-Jun-1999 |
luoqi |
Do not setup 4M pdir until all APs are up.
|
#
47842 |
|
08-Jun-1999 |
dt |
Use kmem_alloc_nofault() rather than kmem_alloc_pageable() to allocate kernel virtual address space for UPAGES.
|
#
47763 |
|
05-Jun-1999 |
luoqi |
Fix an accounting problem when prefaulting 4M pages.
PR: kern/11948
|
#
47678 |
|
01-Jun-1999 |
jlemon |
Unifdef VM86.
Reviewed by: silence on -current
|
#
47575 |
|
28-May-1999 |
alc |
pmap_object_init_pt: The size of vm_object::memq is vm_object::resident_page_count, not vm_object::size.
|
#
47292 |
|
18-May-1999 |
alc |
pmap_qremove: Eliminate unnecessary TLB shootdowns.
|
#
46129 |
|
27-Apr-1999 |
luoqi |
Enable vmspace sharing on SMP. Major changes are:
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector is added to point to per-cpu pages; per-cpu global variables are now accessed through this new selector (%fs). The selectors in the gdt table are rearranged for cache line optimization.
- fast_vfork is now on by default for both UP and SMP.
- Some aio code cleanup.
Reviewed by: Alan Cox <alc@cs.rice.edu> John Dyson <dyson@iquest.net> Julian Elischer <julian@whistel.com> Bruce Evans <bde@zeta.org.au> David Greenman <dg@root.com>
|
#
46072 |
|
25-Apr-1999 |
alc |
pmap_dispose_proc and pmap_copy_page: Conditionally compile 386-specific code.
pmap_enter: Eliminate unnecessary TLB shootdowns.
pmap_zero_page and pmap_zero_page_area: Use invltlb_1pg instead of duplicating the code.
|
#
45960 |
|
23-Apr-1999 |
dt |
Make pmap_collect() an official pmap interface.
|
#
45833 |
|
19-Apr-1999 |
alc |
_pmap_unwire_pte_hold and pmap_remove_page: Use pmap_TLB_invalidate instead of invltlb_1pg to eliminate unnecessary IPIs.
pmap_remove, pmap_protect and pmap_remove_pages: Use pmap_TLB_invalidate_all instead of invltlb to eliminate unnecessary IPIs.
pmap_copy: Use cpu_invltlb instead of invltlb when updating APTDpde.
pmap_changebit: Rather than deleting the unused "set bit" option (which may be useful later), make pmap_changebit an inline that is used by the new pmap_clearbit procedure.
Collectively, the first three changes reduce the number of TLB shootdown IPIs by 1/3 for a kernel compile.
|
#
45524 |
|
10-Apr-1999 |
alc |
pmap_remove_pte: Use "loadandclear" to update the pte.
pmap_changebit and pmap_ts_referenced: Switch to pmap_TLB_invalidate from invltlb.
|
#
45405 |
|
07-Apr-1999 |
msmith |
mem.c Split out ioctl handler a little more cleanly, add memory range attribute handling for both kernel and user-space consumers.
pmap.c Remove obsolete P6 MTRR-related code.
i686_mem.c Map generic memory-range attribute interface to the P6 MTRR model.
|
#
45370 |
|
06-Apr-1999 |
alc |
Two changes to pmap_remove_all:
1. Switch to pmap_TLB_invalidate from invltlb, eliminating a full TLB flush where a single-page flush suffices. (Also, this eliminates some unnecessary IPIs.)
2. Use "loadandclear" to update the pte, eliminating a race condition on SMPs.
Change #2 should be committed to -STABLE.
|
#
45347 |
|
05-Apr-1999 |
julian |
Catch a case spotted by Tor where mmapped files could leave garbage in the unallocated parts of the last page when the file ended on a frag but not a page boundary. Delimited by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF, in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c ufs/ufs/ufs_readwrite.c kern/vfs_bio.c
Submitted by: Matt Dillon <dillon@freebsd.org> Reviewed by: Alan Cox <alc@freebsd.org>
|
#
45252 |
|
02-Apr-1999 |
alc |
Put in place the infrastructure for improved UP and SMP TLB management.
In particular, replace the unused field pmap::pm_flag by pmap::pm_active, which is a bit mask representing which processors have the pmap activated. (Thus, it is a simple Boolean on UPs.)
Also, eliminate an unnecessary memory reference from cpu_switch() in swtch.s.
Assisted by: John S. Dyson <dyson@iquest.net> Tested by: Luoqi Chen <luoqi@watermarkgroup.com>, Poul-Henning Kamp <phk@critter.freebsd.dk>
|
#
44704 |
|
13-Mar-1999 |
alc |
pmap_qenter/pmap_qremove: Use the pmap_kenter/pmap_kremove inline functions instead of duplicating them.
pmap_remove_all: Eliminate an unused (but initialized) variable.
pmap_ts_referenced: Change the implementation. The new implementation is much smaller and simpler, but functionally identical. (Reviewed by "John S. Dyson" <dyson@iquest.net>.)
|
#
44470 |
|
05-Mar-1999 |
alc |
Fix an SMP-only TLB invalidation bug. Specifically, disable a TLB invalidation optimization that won't work given the limitations of our current SMP support.
This patch should be applied to -stable ASAP.
Thanks to John Capo <jc@irbs.com>, Steve Kargl <sgk@troutmask.apl.washington.edu>, and Chuck Robey <chuckr@mat.net> for testing.
|
#
44146 |
|
19-Feb-1999 |
luoqi |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper.
Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillon <dillon@apollo.backplane.com>
|
#
43314 |
|
27-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
43138 |
|
24-Jan-1999 |
dillon |
Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL to use the vm_page_dirty() inline.
The inline can thus do sanity checks (or not) over all cases.
|
#
42957 |
|
21-Jan-1999 |
dillon |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
42542 |
|
11-Jan-1999 |
eivind |
Silence warnings by removing unused convenience function and globalizing debugging functions.
|
#
42448 |
|
09-Jan-1999 |
dt |
Oops, replace 1.216 with a version that actually checks pv_entries (and was tested for a month or two in production).
Noticed by: Stephen McKay
Stephen also suggested removing the complication entirely. I don't do it, as it would back out a large part of 1.190 (from 1998/03/16)...
|
#
42396 |
|
08-Jan-1999 |
luoqi |
Allocate kernel page table object (kptobj) before any kmem_alloc calls. On a system with a large amount of ram (e.g. 2G), allocation of per-page data structures (512K physical pages) could easily bust the initial kernel page table (36M), and growth of kernel page table requires kptobj.
|
#
42381 |
|
07-Jan-1999 |
dt |
Make pmap_ts_referenced check more than 1 pv_entry. (One should be careful when moving elements to the tail of a list in a loop...)
|
#
41591 |
|
07-Dec-1998 |
archie |
The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.
|
#
41370 |
|
26-Nov-1998 |
tegge |
Don't forget to update the pmap associated with aio daemons when adding new page directory entries for a growing kernel virtual address space.
|
#
41318 |
|
24-Nov-1998 |
eivind |
Move the declaration of PPro_vmtrr from the header file to pmap.c, replacing the one in the header file with a definition. This makes it easier to work with tools that grok ANSI C only.
|
#
40998 |
|
08-Nov-1998 |
msmith |
Enable 686 class optimisations for all 686-class processors, not just the Pentium Pro. This resolves the "Dog slow SMP" issue for Pentium II systems.
|
#
40700 |
|
28-Oct-1998 |
dg |
Added a second argument, "activate" to the vm_page_unwire() call so that the caller can select either inactive or active queue to put the page on.
|
#
40545 |
|
21-Oct-1998 |
dg |
Decrement the now unused page table page's wire_count prior to freeing it. It will soon be required that pages have a zero wire_count when being freed.
|
#
38893 |
|
06-Sep-1998 |
tegge |
Don't go below the low water mark of free pages due to optional prefaulting of pages. PR: 2431
|
#
38807 |
|
04-Sep-1998 |
ache |
PAGE_WAKEUP -> vm_page_wakeup
|
#
38488 |
|
23-Aug-1998 |
bde |
Fixed printf format errors.
|
#
38349 |
|
15-Aug-1998 |
bde |
pmap.c: Cast pointers to (vm_offset_t) instead of to (u_long) (as before) or to (uintptr_t)(void *) (as would be more correct). Don't cast vm_offset_t's to (u_long) just to do arithmetic on them.
mp_machdep.c: Cast pointers to (uintptr_t) instead of to (u_long). Don't forget to cast pointers to (void *) first or to recover from possible integral promotions, although this is too much work for machine-dependent code.
vm code generally avoids warnings for pointer vs long size mismatches by using vm_offset_t to represent pointers; pmap.c often uses plain `unsigned int' instead of vm_offset_t and didn't use u_long elsewhere, but this style was messed up by code apparently imported from mp_machdep.c.
|
#
37561 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
|
#
37557 |
|
11-Jul-1998 |
phk |
Don't disable pmap_setdevram() which isn't called, but which could be, but instead disable pmap_setvidram() which is called, but probably shouldn't be.
PR: 7227, 7240
|
#
37555 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
|
#
36275 |
|
21-May-1998 |
dyson |
Make flushing dirty pages work correctly on filesystems that unexpectedly do not complete writes even with sync I/O requests. This should help the behavior of mmaped files when using softupdates (and perhaps in other circumstances also.)
|
#
36179 |
|
19-May-1998 |
phk |
Make the size of the msgbuf (dmesg) a "normal" option.
|
#
36169 |
|
18-May-1998 |
tegge |
Back out part of revision 1.198 commit (clearing kernel stack pages). By request from David Greenman <dg@root.com>
|
#
36125 |
|
17-May-1998 |
tegge |
For SMP, use prv_PPAGE1/prv_PMAP1 instead of PADDR1/PMAP1. get_ptbase and pmap_pte_quick no longer generate IPIs. This should reduce the number of IPIs during heavy paging.
|
#
36121 |
|
17-May-1998 |
tegge |
Clear kernel stack pages before usage. Correct panic message in pmap_zero_page (s/CMAP /CMAP2 /).
|
#
36051 |
|
15-May-1998 |
dyson |
Disable the auto-Write Combining setup for the pmap code. This worked on a couple of machines of mine, but appears to cause problems on others.
|
#
35940 |
|
11-May-1998 |
dyson |
Change some tests from CPU_CLASS686 to CPU_686 as appropriate, and also correct a serious omission that would cause process failures due to forgetting an invltlb-type operation. This was just a transcription problem.
|
#
35933 |
|
11-May-1998 |
dyson |
Support better performance with P6 architectures and in SMP mode. Unnecessary TLB flushes removed. More efficient page zeroing on P6 (modify page only if non-zero.)
|
#
35932 |
|
10-May-1998 |
dyson |
Attempt to set write combining mode for graphics devices.
|
#
35296 |
|
19-Apr-1998 |
bde |
Support compiling with gcc -pedantic (don't use a bogus, null cast).
|
#
35210 |
|
15-Apr-1998 |
bde |
Support compiling with `gcc -ansi'.
|
#
35074 |
|
06-Apr-1998 |
peter |
Bogus casts
|
#
34611 |
|
15-Mar-1998 |
dyson |
Some VM improvements, including elimination of a lot of Sig-11 problems. Tor Egge and others have helped with various VM bugs lately, but don't blame him -- blame me!!!
pmap.c: 1) Create an object for kernel page table allocations. This fixes a bogus allocation method previously used for such, by grabbing pages from the kernel object, using bogus pindexes. (This was a code cleanup, and perhaps a minor system stability issue.)
pmap.c: 2) Pre-set the modify and accessed bits when prudent. This will decrease bus traffic under certain circumstances.
vfs_bio.c, vfs_cluster.c: 3) Rather than calculating the beginning virtual byte offset multiple times, stick the offset into the buffer header, so that the calculated offset can be reused. (Long long multiplies are often expensive, and this is a probably unmeasurable performance improvement, and code cleanup.)
vfs_bio.c: 4) Handle write recursion more intelligently (but not perfectly) so that it is less likely to cause a system panic, and is also much more robust.
vfs_bio.c: 5) getblk incorrectly wrote out blocks that are incorrectly sized. The problem is fixed, and writes blocks out ONLY when B_DELWRI is true.
vfs_bio.c: 6) Check that already constituted buffers have fully valid pages. If not, then make sure that the B_CACHE bit is not set. (This was a major source of Sig-11 type problems.)
vfs_bio.c: 7) Fix a potential system deadlock due to an incorrectly specified sleep priority while waiting for a buffer write operation. The change that I made opens the system up to serious problems, and we need to examine the issue of process sleep priorities.
vfs_cluster.c, vfs_bio.c: 8) Make clustered reads work more correctly (and more completely) when buffers are already constituted, but not fully valid. (This was another system reliability issue.)
vfs_subr.c, ffs_inode.c: 9) Create a vtruncbuf function, which is used by filesystems that can truncate files. The vinvalbuf forced a file sync type operation, while vtruncbuf only invalidates the buffers past the new end of file, and also invalidates the appropriate pages. (This was a system reliability and performance issue.)
10) Modify FFS to use vtruncbuf.
vm_object.c: 11) Make the object rundown mechanism for OBJT_VNODE type objects work more correctly. Included in that fix, create pager entries for the OBJT_DEAD pager type, so that paging requests that might slip in during race conditions are properly handled. (This was a system reliability issue.)
vm_page.c: 12) Make some of the page validation routines be a little less picky about arguments passed to them. Also, support page invalidation change the object generation count so that we handle generation counts a little more robustly.
vm_pageout.c: 13) Further reduce pageout daemon activity when the system doesn't need help from it. There should be no additional performance decrease even when the pageout daemon is running. (This was a significant performance issue.)
vnode_pager.c: 14) Teach the vnode pager to handle race conditions during vnode deallocations.
|
#
34440 |
|
09-Mar-1998 |
eivind |
Turn "PMAP_SHPGPERPROC" into a new-style option, add it to LINT, and document it there.
|
#
34206 |
|
07-Mar-1998 |
dyson |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated.
1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid.
2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them.
3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state.
4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances.
5) Remove a gratuitous buffer wakeup in vfs_vmio_release.
6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version.
7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation.
8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will soon be needed.
9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay-loop waiting is not useful for filesystem locks, due to the length of the time intervals.
10) Correct and clean up spec_getpages.
11) Implement a fully functional nfs_getpages and nfs_putpages.
12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.)
13) Properly support MS_INVALIDATE on NFS.
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean.
15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (Use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads.
16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.)
17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.)
18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
#
33936 |
|
01-Mar-1998 |
dyson |
1) Use a more consistent page wait methodology.
2) Do not unnecessarily force page blocking when paging pages out.
3) Further improve swap pager performance and correctness, including fixing the paging-in-progress deadlock (except in severe I/O error conditions.)
4) Enable vfs_ioopt=1 as a default.
5) Fix and enable the page prezeroing in SMP mode.
All in all, SMP systems especially should show a significant improvement in "snappiness."
|
#
33282 |
|
12-Feb-1998 |
bde |
Fixed initialization of the 4MB page. Kernels larger than about 2.75MB (from _btext to _end) crashed in pmap_bootstrap(). Smaller kernels worked accidentally.
|
#
33221 |
|
10-Feb-1998 |
eivind |
Fix warning after previous staticization.
|
#
33181 |
|
09-Feb-1998 |
eivind |
Staticize.
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33109 |
|
05-Feb-1998 |
dyson |
1) Start using a cleaner and more consistent page allocator instead of the various ad-hoc schemes.
2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some processor errata, and to minimize redundant processor updating of page tables.
4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed.
5) Streamline read-only to read-write page mappings.
6) For pmap_copy_page, don't enable write mapping for source page.
7) Correct and clean up pmap_incore.
8) Cluster initial kern_exec paging.
9) Removal of some minor lint from kern_malloc.
10) Correct some ioopt code.
11) Remove some dead code from the MI swapout routine.
12) Correct vm_object_deallocate (to remove backing_object ref.)
13) Fix dead-object handling, which had problems under heavy memory load.
14) Add minor vm_page_lookup improvements.
15) Some pages are not in objects, and make sure that vm_page.c can properly support such pages.
16) Add some more page deficit handling.
17) Some minor code readability improvements.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
33056 |
|
03-Feb-1998 |
bde |
Converted DISABLE_PSE to a new-style option.
Fixed some formatting in options.i386.
|
#
32937 |
|
31-Jan-1998 |
dyson |
Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes."
Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.
|
#
32702 |
|
22-Jan-1998 |
dyson |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous.
2) vm_maps don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone-managed memory.
7) Keep ioopt disabled for now.
8) Remove the now-bogus "single use" map concept.
9) Use generation counts or ids for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.)
13) Significantly shrink, clean up, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
#
32585 |
|
17-Jan-1998 |
dyson |
Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management.
The system should be much more stable, but all subtle bugs aren't fixed yet.
|
#
31934 |
|
22-Dec-1997 |
dyson |
Correct my previous fix for the UPAGES problem.
|
#
31930 |
|
21-Dec-1997 |
dyson |
Hopefully fix the problem with the TLB not being updated correctly. Problem tracked down by bde@freebsd.org, but this is an attempted efficient fix.
|
#
31709 |
|
14-Dec-1997 |
dyson |
After one of my analysis passes to evaluate methods for SMP TLB mgmt, I noticed some major enhancements available for UP situations. The number of UP TLB flushes is decreased very significantly with these changes. Since a TLB flush appears to cost a minimum of approx 80 cycles, this is a "nice" enhancement, equivalent to eliminating between 40 and 160 instructions per TLB flush.
Changes include making sure that kernel threads all use the same PTD, and eliminate unneeded PTD switches at context switch time.
|
#
31321 |
|
20-Nov-1997 |
bde |
Moved some extern declarations to header files (unused ones to /dev/null).
|
#
31030 |
|
07-Nov-1997 |
tegge |
Use UPAGES when setting up private pages for SMP (which includes idle stack).
|
#
31017 |
|
07-Nov-1997 |
phk |
Rename some local variables to avoid shadowing other local variables.
Found by: -Wshadow
|
#
31016 |
|
07-Nov-1997 |
phk |
Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by: -Wunused
|
#
30813 |
|
28-Oct-1997 |
bde |
Removed unused #includes.
|
#
30754 |
|
26-Oct-1997 |
dyson |
Check that the pv_limits are initialized before checking against them.
|
#
30732 |
|
26-Oct-1997 |
dyson |
Change the initial amount of memory allocated for pv_entries to be proportional to the amount of system memory. Also, clean-up some of the new pv_entry mgmt code.
|
#
30702 |
|
25-Oct-1997 |
dyson |
Somehow an error crept in during the previous commit.
|
#
30701 |
|
25-Oct-1997 |
dyson |
Support garbage collecting the pmap pv entries. The management doesn't happen until the system would have nearly failed anyway, so no significant overhead is added. This helps large systems with lots of processes.
|
#
30700 |
|
24-Oct-1997 |
dyson |
Decrease the initial allocation for the zone allocations.
|
#
30309 |
|
11-Oct-1997 |
phk |
Distribute and staticize a lot of the malloc M_* types.
Substantial input from: bde
|
#
29655 |
|
21-Sep-1997 |
dyson |
Add support for more than 1 page of idle process stack on SMP systems.
|
#
29174 |
|
06-Sep-1997 |
dyson |
Fix an intermittent problem during SMP code operation. Not all of the idle page table directories for all of the processors were being updated during kernel grow operations. The problem appears to be gone now.
|
#
28808 |
|
26-Aug-1997 |
peter |
Clean up the SMP AP bootstrap and eliminate the wretched idle procs.
- We now have enough per-cpu idle context, the real idle loop has been revived (cpu's halt now with nothing to do).
- Some preliminary support for running some operations outside the global lock (eg: zeroing "free but not yet zeroed pages") is present but appears to cause problems. Off by default.
- the smp_active sysctl now behaves differently. It's merely a 'true/false' option. Setting smp_active to zero causes the AP's to halt in the idle loop and stop scheduling processes.
- bootstrap is a lot safer. Instead of sharing a statically compiled in stack a number of times (which has caused lots of problems) and then abandoning it, we use the idle context to boot the AP's directly. This should help >2 cpu support since the bootlock stuff was in doubt.
- print physical apic id in traps.. helps identify private pages getting out of sync. (You don't want to know how much hair I tore out with this!)
More cleanup to follow, this is more of a checkpoint than a 'finished' thing.
|
#
28747 |
|
25-Aug-1997 |
bde |
Finished (?) support for DISABLE_PSE option. 2-3MB of kernel vm was sometimes wasted.
Fixed type mismatches for functions with vm_prot_t's as args. vm_prot_t is u_char, so the prototypes should have used promoteof(u_char) to match the old-style function definitions. They use just vm_prot_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change vm_prot_t to u_int, to optimize for time instead of space.
Removed a stale comment.
|
#
27950 |
|
07-Aug-1997 |
dyson |
Fix the DDB breakpoint code when using the 4MB page support.
|
#
27947 |
|
07-Aug-1997 |
dyson |
More vm_zone cleanup. The sysctl now accounts for items better, and counts the number of allocations.
|
#
27923 |
|
05-Aug-1997 |
dyson |
Another attempt at cleaning up the new memory allocator.
|
#
27922 |
|
05-Aug-1997 |
dyson |
Fix some bugs, document vm_zone better. Add copyright to vm_zone.h. Use the new zone code in pmap.c so that we can get rid of the ugly ad-hoc allocations in pmap.c.
|
#
27904 |
|
04-Aug-1997 |
dyson |
Modify pmap to use our new memory allocator.
|
#
27903 |
|
04-Aug-1997 |
dyson |
Slightly reorder some operations so that the main processor gets global mappings early on.
|
#
27902 |
|
04-Aug-1997 |
dyson |
Remove the PMAP_PVLIST conditionals in pmap.*, and another unneeded define.
|
#
27566 |
|
20-Jul-1997 |
dyson |
Fix a crash that has manifested itself while running X after the 4MB page upgrades.
|
#
27535 |
|
20-Jul-1997 |
bde |
Removed unused #includes.
|
#
27484 |
|
17-Jul-1997 |
dyson |
Hopefully fix a few problems that could cause hangs in SMP mode.
1) Make sure that the region mapped by a 4MB page is properly aligned.
2) Don't turn on the PG_G flag in locore for SMP. I plan to do that later in startup anyway.
3) Make sure the 2nd processor has PSE enabled, so that 4MB pages don't hose it.
We don't use PG_G yet on SMP -- there is work to be done to make that work correctly. It isn't that important anyway...
|
#
27464 |
|
17-Jul-1997 |
dyson |
Add support for 4MB pages. This includes the .text, .data, .bss parts of the kernel, and also most of the dynamic parts of the kernel. Additionally, 4MB pages will be allocated for display buffers as appropriate (only.)
The 4MB support for SMP isn't complete, but doesn't interfere with operation either.
|
#
26944 |
|
25-Jun-1997 |
tegge |
Allow kernel configuration file to override PMAP_SHPGPERPROC. The default value (200) is too low in some environments, causing a fatal "panic: get_pv_entry: cannot get a pv_entry_t". The same panic might still occur due to temporary shortage of free physical memory (cf. PR i386/2431).
|
#
26812 |
|
22-Jun-1997 |
peter |
Preliminary support for per-cpu data pages.
This eliminates a lot of #ifdef SMP type code. Things like _curproc reside in a data page that is unique on each cpu, eliminating the expensive macros like: #define curproc (SMPcurproc[cpunumber()])
There are some unresolved bootstrap and address space sharing issues at present, but Steve is waiting on this for other work. There is still some strictly temporary code present that isn't exactly pretty.
This is part of a larger change that has run into some bumps, this part is standalone so it should be safe. The temporary code goes away when the full idle cpu support is finished.
Reviewed by: fsmp, dyson
|
#
26270 |
|
29-May-1997 |
fsmp |
Code such as apic_base[APIC_ID] converted to lapic__id
Changes to pmap.c for lapic_t lapic && ioapic_t ioapic pointers, currently equal to apic_base && io_apic_base, will stand alone with the private page mapping.
|
#
26267 |
|
29-May-1997 |
peter |
remove no longer needed opt_smp.h includes
|
#
25194 |
|
27-Apr-1997 |
peter |
Whoops.. We forgot to turn off the 4MB Virtual==Physical mapping at address zero from bootstrap in the non-SMP case.
Noticed by: bde
|
#
25164 |
|
26-Apr-1997 |
peter |
Man the liferafts! Here comes the long awaited SMP -> -current merge!
There are various options documented in i386/conf/LINT, there is more to come over the next few days.
The kernel should run pretty much "as before" without the options to activate SMP mode.
There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment.
This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
|
#
24851 |
|
13-Apr-1997 |
dyson |
The pmap code was too generous in the allocation of kva space for the pv entries. This problem has become obvious due to the increase in the size of the pv entries. We need to create a more intelligent policy for pv entry management eventually. Submitted by: David Greenman <dg@freebsd.org>
|
#
24848 |
|
12-Apr-1997 |
dyson |
Fully implement vfork. Vfork is now much much faster than even our fork. (On my machine, fork is about 240usecs, vfork is 78usecs.)
Implement rfork(!RFPROC !RFMEM), which allows a thread to divorce its memory from the other threads of a group.
Implement rfork(!RFPROC RFCFDG), which closes all file descriptors, eliminating possible existing shares with other threads/processes.
Implement rfork(!RFPROC RFFDG), which divorces the file descriptors for a thread from the rest of the group.
Fix the case where a thread does an exec. It is almost nonsense for a thread to modify the other threads address space by an exec, so we now automatically divorce the address space before modifying it.
|
#
24691 |
|
07-Apr-1997 |
peter |
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!)
Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit.
Context switch the tss_esp0 pointer in the common tss. This is a lot simpler than switching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss, since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset.
The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS).
Simplify the pmap code to manage process contexts; we no longer have to double map the UPAGES. This simplifies and should measurably speed up fork().
The following parts came from John Dyson:
Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out.
Move the upages object (upobj) from the vmspace to the proc structure.
Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
22129 |
|
30-Jan-1997 |
dg |
Removed unnecessary PG_N flag from device memory mappings. This is handled by the CPU/chipset already and was apparently triggering a hardware bug that causes strange parity errors.
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
21568 |
|
11-Jan-1997 |
dyson |
When we changed pmap_protect to support adding the writeable attribute to a page range, we forgot to set the PG_WRITEABLE flag in the vm_page_t. This fixes that problem.
|
#
21529 |
|
11-Jan-1997 |
dyson |
Prepare better for multi-platform by eliminating another required pmap routine (pmap_is_referenced.) Upper level recoded to use pmap_ts_referenced.
|
#
20997 |
|
29-Dec-1996 |
dyson |
Allow pmap_protect to increase permissions. This mod can eliminate the need for unnecessary vm_faults. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
19621 |
|
11-Nov-1996 |
dyson |
Support the PG_G flag on Pentium-Pro processors. This pretty much eliminates the unnecessary unmapping of the kernel during context switches and during invtlb...
|
#
19503 |
|
07-Nov-1996 |
joerg |
Fix the message buffer mapping. This actually allows to increase the message buffer size in <sys/msgbuf.h>.
Reviewed by: davidg,joerg Submitted by: bde
|
#
19346 |
|
03-Nov-1996 |
dyson |
Fix a problem with running down processes that have left wired mappings with mlock. This problem only occurred because of the quick unmap code not respecting the wired-ness of pages in the process. In the future, we need to eliminate the dependency intrinsic to the design of the code that wired pages actually be mapped. It is kind-of bogus not to have wired pages mapped, but it is also a weakness for the code to fall flat because of a missing page.
This should fix a problem that Tor Egge has been having, and also should be included into 2.2-RELEASE.
|
#
19119 |
|
23-Oct-1996 |
dyson |
Account for the UPAGES in the same way as before moving the MD code from vm_glue into pmap.c. Now RSS should appear to be the same as before.
|
#
18937 |
|
15-Oct-1996 |
dyson |
Move much of the machine dependent code from vm_glue.c into pmap.c. Along with the improved organization, small proc fork performance is now about 5%-10% faster.
|
#
18904 |
|
12-Oct-1996 |
dyson |
Minor optimization for final rundown of a pmap.
|
#
18897 |
|
12-Oct-1996 |
dyson |
Performance optimizations. One of which was meant to go in before the previous snap. Specifically, kern_exit and kern_exec now make a call into the pmap module to do a very fast removal of pages from the address space. Additionally, the pmap module now updates the PG_MAPPED and PG_WRITABLE flags. This is an optional optimization, but helpful on the X86.
|
#
18896 |
|
12-Oct-1996 |
bde |
Cleaned up:
- fixed a sloppy common-style declaration.
- removed an unused macro.
- moved once-used macros to the one file where they are used.
- removed unused forward struct declarations.
- removed __pure.
- declared inline functions as inline in their prototype as well as in their definition (gcc unfortunately allows the prototype to be inconsistent).
- staticized.
|
#
18842 |
|
09-Oct-1996 |
bde |
Put I*86_CPU defines in opt_cpu.h.
|
#
18548 |
|
28-Sep-1996 |
dyson |
Essentially rename pmap_update to be invltlb. It is a very machine dependent operation, and not really a correct name. invltlb and invlpg are more descriptive, and in the case of invlpg, a real opcode.
Additionally, fix the tlb management code for 386 machines.
|
#
18538 |
|
28-Sep-1996 |
bde |
Restored my change in rev.1.119 which was clobbered by the previous commit.
|
#
18528 |
|
28-Sep-1996 |
dyson |
Move pmap_update_1pg to cpufunc.h. Additionally, use the invlpg opcode instead of the nasty looking .byte directives. There are some other minor micro-level code improvements to pmap.c
|
#
18275 |
|
13-Sep-1996 |
bde |
Made debugging code (pmap_pvdump()) compile again so that I can test LINT. I don't know if it actually works.
|
#
18260 |
|
12-Sep-1996 |
dyson |
Primarily a fix so that pages are properly tracked for being modified. Pages that are removed by the pageout daemon were the worst affected. Additionally, numerous minor cleanups, including better handling of busy page table pages. This commit fixes the worst of the pmap problems recently introduced.
|
#
18239 |
|
11-Sep-1996 |
dyson |
A minor fix to the new pmap code. This might not fix the global problems with the last major pmap commits.
|
#
18169 |
|
08-Sep-1996 |
dyson |
Addition of page coloring support. Various levels of coloring are afforded. The default level works with minimal overhead, but one can also enable full, efficient use of a 512K cache. (Parameters can be generated to support arbitrary cache sizes also.)
|
#
18163 |
|
08-Sep-1996 |
dyson |
Improve the scalability of certain pmap operations.
|
#
17334 |
|
30-Jul-1996 |
dyson |
Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
|
#
17329 |
|
29-Jul-1996 |
dyson |
Fix a problem with a DEBUG section of code.
|
#
17325 |
|
29-Jul-1996 |
dyson |
Fix an error in statement order in pmap_remove_pages, remove the pmap pte hint (for now), and general code cleanup.
|
#
17321 |
|
28-Jul-1996 |
dyson |
Fix a problem that pmap update was not being done for kernel_pmap. Also remove some (currently) gratuitous tests for PG_V... This bug could have caused various anomalous (temporary) behavior.
|
#
17294 |
|
27-Jul-1996 |
dyson |
This commit is meant to solve a couple of VM system problems or performance issues.
1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines.
2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations.
3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)).
4) Changed pv queue manipulations/structures to be TAILQ's.
5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one.
6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation.
7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
|
#
17121 |
|
12-Jul-1996 |
bde |
Removed "optimization" using gcc's builtin memcpy instead of bcopy. There is little difference now since the amount copied is large, and bcopy will become much faster on some machines.
|
#
16747 |
|
26-Jun-1996 |
dyson |
When page table pages were removed from process address space, the resident page stats were not being decremented. This mod corrects that problem.
|
#
16680 |
|
24-Jun-1996 |
dyson |
Limit the scan for preloading pte's to the end of an object.
|
#
16471 |
|
17-Jun-1996 |
bde |
Removed unused #includes of <i386/isa/icu.h> and <i386/isa/isa.h>. icu.h is only used by the icu support modules and by a few drivers that know too much about the icu (most only use it to convert `n' to `IRQn'). isa.h is only used by ioconf.c and by a few drivers that know too much about isa addresses (a few have to, because config is deficient).
|
#
16415 |
|
17-Jun-1996 |
dyson |
Several bugfixes/improvements:
1) Make it much less likely to miss a wakeup in vm_page_free_wakeup
2) Create a new entry point into pmap: pmap_ts_referenced, eliminates the need to scan the pv lists twice in many cases. Perhaps there is a lot more to do here to work on minimizing pv list manipulation
3) Minor improvements to vm_pageout including the use of pmap_ts_ref.
4) Major changes and code improvement to pmap. This code has had several serious bugs in page table page manipulation. In order to simplify the problem, and hopefully solve it for once and all, page table pages are no longer "managed" with the pv list stuff. Page table pages are only (mapped and held/wired) or (free and unused) now. Page table pages are never inactive, active or cached. These changes have probably fixed the hold count problems, but if they haven't, then the code is simpler anyway for future bugfixing.
5) The pmap code has been sorely in need of re-organization, and I have taken a first (of probably many) steps. Please tell me if you have any ideas.
|
#
16324 |
|
12-Jun-1996 |
dyson |
Fix a very significant cnt.v_wire_count leak in vm_page.c, and some minor leaks in pmap.c. Bruce Evans made me aware of this problem.
|
#
16322 |
|
12-Jun-1996 |
gpalmer |
Clean up -Wunused warnings.
Reviewed by: bde
|
#
16197 |
|
08-Jun-1996 |
dyson |
Adjust the threshold for blocking on movement of pages from the cache queue in vm_fault.
Move the PG_BUSY in vm_fault to the correct place.
Remove redundant/unnecessary code in pmap.c.
Properly block on rundown of page table pages, if they are busy.
I think that the VM system is in pretty good shape now, and the following individuals (among others, in no particular order) have helped with this recent bunch of bugs, thanks! If I left anyone out, I apologize!
Stephen McKay, Stephen Hocking, Eric J. Chet, Dan O'Brien, James Raynard, Marc Fournier.
|
#
16166 |
|
07-Jun-1996 |
dyson |
Fix a bug in the pmap_object_init_pt routine whereby pages weren't taken off the cache queue before being mapped into the process.
|
#
16136 |
|
05-Jun-1996 |
dyson |
I missed a case of the page table page dirty-bit fix.
|
#
16122 |
|
05-Jun-1996 |
dyson |
Keep page-table pages from ever being sensed as dirty. This should fix some problems with the page-table page management code, since it can't deal with the notion of page-table pages being paged out or in transit. Also, clean up some stylistic issues per some suggestions from Stephen McKay.
|
#
16079 |
|
02-Jun-1996 |
dyson |
Don't carry the modified or referenced bits through to the child process during pmap_copy. This minimizes unnecessary swapping or creation of swap space. If there is a hold_count flaw for page-table pages, clear the page before freeing it to lessen the chance of a system crash -- this is a robustness thing only, NOT a fix.
|
#
16057 |
|
01-Jun-1996 |
dyson |
Fix the problem with pmap_copy that breaks X in small memory machines. Also close some windows that are opened up by page table allocations. The prefaulting code no longer uses hold counts, but now uses the busy flag for synchronization.
|
#
16026 |
|
30-May-1996 |
dyson |
This commit is dual-purpose, to fix more of the pageout daemon queue corruption problems, and to apply Gary Palmer's code cleanups. David Greenman helped with these problems also. There is still a hang problem using X in small memory machines.
|
#
15977 |
|
29-May-1996 |
dyson |
The wrong address (pindex) was being used for the page table directory. No negative side effects right now, but just a clean-up.
|
#
15868 |
|
22-May-1996 |
peter |
Fix harmless warning.. pmap_nw_modified was not having its arg cast to pt_entry_t like the others inside the DIAGNOSTIC code.
|
#
15858 |
|
22-May-1996 |
dyson |
A serious error in pmap.c(pmap_remove) is corrected by this. When comparing the PTD pointers, they needed to be masked by PG_FRAME, and they weren't. Also, the "improved" non-386 code wasn't really an improvement, so I simplified and fixed the code. This might have caused some of the panics caused by the VM megacommit.
|
#
15832 |
|
20-May-1996 |
dyson |
To quote Stephen McKay: pmap_copy is a complex NOP at this moment :-).
With this fix from Stephen, we are getting the target fork performance that I have been trying to attain: P5-166, before the mega-commit: 700-800usecs, after: 600usecs, with Stephen's fix: 500usecs!!! Also, this could be the solution of some strange panic problems... Reviewed by: dyson@freebsd.org Submitted by: Stephen McKay <syssgm@devetir.qld.gov.au>
|
#
15819 |
|
19-May-1996 |
dyson |
Initial support for mincore and madvise. Both are almost fully supported, except madvise does not page in with MADV_WILLNEED, and MADV_DONTNEED doesn't force dirty pages out.
|
#
15809 |
|
18-May-1996 |
dyson |
This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me:
More usage of the TAILQ macros. Additional minor fix to queue.h. Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm.
Significant revamp of the pmap/fork code:
1) PTE's and UPAGES's are NO LONGER in the process's map.
2) PTE's and UPAGES's reside in their own objects.
3) TOTAL elimination of recursive page table pagefaults.
4) The page directory now resides in the PTE object.
5) Implemented pmap_copy, thereby speeding up fork time.
6) Changed the pv entries so that the head is a pointer and not an entire entry.
7) Significant cleanup of pmap_protect, and pmap_remove.
8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module.
9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed.
Performance and code cleanups in vm_map:
1) Improved and simplified allocation of map entries.
2) Improved vm_map_copy code.
3) Corrected some minor problems in the simplify code.
Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh.
Improved the speed of and simplified kmem_malloc.
Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existent condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.)
Minor reorg of source to perhaps improve locality of ref.
|
#
15583 |
|
03-May-1996 |
phk |
Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
|
#
15565 |
|
02-May-1996 |
phk |
Move atdevbase out of locore.s and into machdep.c
Macroize locore.s' page table setup even more, now it's almost readable.
Rename PG_U to PG_A (so that I can...)
Rename PG_u to PG_U. "PG_u" was just too ugly...
Remove some unused vars in pmap.c
Remove PG_KR and PG_KW
Remove SSIZE
Remove SINCR
Remove BTOPKERNBASE
This concludes my spring cleaning, modulus any bug fixes for messes I have made on the way.
(Funny to be back here in pmap.c, that's where my first significant contribution to 386BSD was... :-)
|
#
15543 |
|
02-May-1996 |
phk |
removed: CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei() ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new: NPDEPG
Major macro cleanup.
|
#
15340 |
|
22-Apr-1996 |
dyson |
This fixes a troubling oversight in some of the pmap code enhancements. One of the manifestations of the problem is the -4 RSS problem in ps.
Reviewed by: dyson Submitted by: Stephen McKay <syssgm@devetir.qld.gov.au>
|
#
15088 |
|
07-Apr-1996 |
dyson |
Major cleanups for the pmap code.
|
#
14967 |
|
31-Mar-1996 |
dg |
Change if/goto into a while loop.
|
#
14868 |
|
28-Mar-1996 |
dyson |
Remove a now unnecessary prototype from pmap.c. Also remove now unnecessary vm_fault's of page table pages in trap.c.
|
#
14867 |
|
28-Mar-1996 |
dyson |
Significant code cleanup, and some performance improvement. Also, mlock will now work properly without killing the system.
|
#
14607 |
|
12-Mar-1996 |
dyson |
Make sure that we pmap_update AFTER modifying the page table entries. The P6 can do a serious job of reordering code, and our stuff could execute incorrectly.
|
#
14527 |
|
11-Mar-1996 |
hsu |
For Lite2: proc LIST changes. Reviewed by: david & bde
|
#
14470 |
|
10-Mar-1996 |
dyson |
Improved efficiency in pmap_remove, and also remove some of the pmap_update optimizations that were probably incorrect.
|
#
14433 |
|
09-Mar-1996 |
dyson |
Correct some new and older lurking bugs. Hold count wasn't being handled correctly. Fix some incorrect code that was included to improve performance. Significantly simplify the pmap_use_pt and pmap_unuse_pt subroutines. Add some more diagnostic code.
|
#
14245 |
|
25-Feb-1996 |
dyson |
Re-insert a missing pmap_remove operation.
|
#
14243 |
|
25-Feb-1996 |
dyson |
Fix a problem with tracking the modified bit. Eliminate the ugly inline-asm code, and speed up the page-table-page tracking.
|
#
13908 |
|
04-Feb-1996 |
dg |
Rewrote cpu_fork so that it doesn't use pmap_activate, and removed pmap_activate since it's not used anymore. Changed cpu_fork so that it uses one line of inline assembly rather than calling mvesp() to get the current stack pointer. Removed mvesp() since it is no longer being used.
|
#
13495 |
|
19-Jan-1996 |
peter |
Some trivial fixes to get it to compile again, plus some new lint:
- cpuclass should be cpu_class
- CPUCLASS_I386 should be CPUCLASS_386 (^^ those only show up if you compile for i386)
- two missing prototypes on new functions
- one missing static
|
#
13490 |
|
19-Jan-1996 |
dyson |
Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache.
Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
|
#
12978 |
|
22-Dec-1995 |
bde |
Staticized code that was hidden by `#ifdef DEBUG'.
|
#
12904 |
|
17-Dec-1995 |
bde |
Fixed 1TB filesize changes. Some pindexes had bogus names and types but worked because vm_pindex_t is indistinguishable from vm_offset_t.
|
#
12850 |
|
14-Dec-1995 |
bde |
Added a prototype. Merged prototype lists.
|
#
12767 |
|
11-Dec-1995 |
dyson |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
|
#
12722 |
|
10-Dec-1995 |
phk |
Staticize and cleanup. Remove a TON of #includes from machdep.
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12607 |
|
03-Dec-1995 |
bde |
Completed function declarations and/or added prototypes.
|
#
12417 |
|
20-Nov-1995 |
phk |
Remove unused vars.
|
#
11703 |
|
23-Oct-1995 |
dg |
Remove PG_W bit setting in some cases where it should not be set.
Submitted by: John Dyson <dyson>
|
#
11693 |
|
22-Oct-1995 |
dg |
More improvements to the logic for modify-bit checking. Removed pmap_prefault() code as we don't plan to use it at this point in time.
Submitted by: John Dyson <dyson>
|
#
11642 |
|
22-Oct-1995 |
dg |
Simplified some expressions.
|
#
10782 |
|
15-Sep-1995 |
dg |
1) Killed 'BSDVM_COMPAT'.
2) Killed i386pagesperpage as it is not used by anything.
3) Fixed benign miscalculations in pmap_bootstrap().
4) Moved allocation of ISA DMA memory to machdep.c.
5) Removed bogus vm_map_find()'s in pmap_init() - the entire range was already allocated by kmem_init().
6) Added some comments.
virtual_avail is still miscalculated (NKPT*NBPG too large), but fixing this properly requires moving the variable initialization into locore.s. Some other day.
|
#
9744 |
|
28-Jul-1995 |
dg |
Fixed bug I introduced with the memory-size code rewrite that broke floppy DMA buffers...use avail_start not "first". Removed duplicate (and wrong) declaration of phys_avail[].
Submitted by: Bruce Evans, but fixed differently by me.
|
#
9507 |
|
13-Jul-1995 |
dg |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!!
Much needed overhaul of the VM system. Included in this first round of changes:
1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers".
2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, an un_pager structure union was created in the object to contain these items.
3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed.
4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug.
5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. We now use tsleep and wakeup directly in the lock routines, for instance.
6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain.
7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance.
8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed.
9) Some almost useless debugging code removed.
10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology.
11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended.
12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course).
13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes)
TODO:
1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size.
2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness.
3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind.
4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems.
5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8456 |
|
11-May-1995 |
rgrimes |
Fix -Wformat warnings from LINT kernel.
|
#
7693 |
|
09-Apr-1995 |
dg |
Cosmetic changes.
|
#
7490 |
|
30-Mar-1995 |
dg |
Made pmap_testbit a static function.
|
#
7402 |
|
26-Mar-1995 |
dg |
Changed pmap_changebit() into a static function as it always should have been.
Submitted by: John Dyson
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
6977 |
|
10-Mar-1995 |
dg |
Removed unnecessary routines vm_get_pmap() and vm_put_pmap(). kmem_alloc() returns zero filled memory, so no need to explicitly bzero() it.
|
#
6807 |
|
01-Mar-1995 |
dg |
Various changes from John and myself that do the following:
New functions created - vm_object_pip_wakeup and pagedaemon_wakeup that are used to reduce the actual number of wakeups. New function vm_page_protect which is used in conjunction with some new page flags to reduce the number of calls to pmap_page_protect. Minor changes to reduce unnecessary spl nesting. Rewrote vm_page_alloc() to improve readability. Various other mostly cosmetic changes.
|
#
6734 |
|
26-Feb-1995 |
bde |
Replace all remaining instances of `i386/include' by `machine' and fix nearby #include inconsistencies.
|
#
6423 |
|
15-Feb-1995 |
dg |
Killed the pmap_use_pt and pmap_unuse_pt prototypes as they are now in machine/pmap.h.
|
#
6126 |
|
02-Feb-1995 |
dg |
Mostly cosmetic changes. Use KERNBASE instead of UPT_MAX_ADDRESS in some comparisons as it is more correct (we want the kernel page tables included). Reorganized some of the expressions for efficiency. Fixed the new pmap_prefault() routine - it would sometimes pick up the wrong page if the page in the shadow was present but the page in object was paged out. The routine remains unused and commented out, however. Explicitly free zero reference count page tables (rather than waiting for the pagedaemon to do it).
Submitted by: John Dyson
|
#
5943 |
|
26-Jan-1995 |
dg |
Fix from Doug Rabson for a panic related to not initializing the kernel's PTD.
Submitted by: John Dyson
|
#
5916 |
|
25-Jan-1995 |
dg |
Comment out pmap_prefault for the time being (perhaps until after 2.1). The object_init_pt routine is still enabled and used, however, and this is where most of the 'pre-faulting' performance improvement comes from.
|
#
5914 |
|
25-Jan-1995 |
dg |
Make sure that the pages being 'pre-faulted' are currently on a queue.
|
#
5910 |
|
25-Jan-1995 |
dg |
Be a bit less fast and loose about setting non-cacheablity of pages.
|
#
5837 |
|
24-Jan-1995 |
dg |
Changed buffer allocation policy (machdep.c)
Moved various pmap 'bit' test/set functions back into real functions; gcc generates better code at the expense of more of it. (pmap.c)
Fixed a deadlock problem with pv entry allocations (pmap.c)
Added a new, optional function 'pmap_prefault' that does clustered page table preloading (pmap.c)
Changed the way that page tables are held onto (trap.c).
Submitted by: John Dyson
|
#
5642 |
|
15-Jan-1995 |
dg |
Fixed some page table reference count problems; these changes may not be complete, but should be closer to correct than before.
|
#
5580 |
|
14-Jan-1995 |
dg |
Add missing object_lock/unlock.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c, vm_map.c Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
5153 |
|
18-Dec-1994 |
dg |
Move page_unhold's in pmap_object_init_pt down one line to guard against a potential race condition.
|
#
5144 |
|
18-Dec-1994 |
dg |
Check for PG_FAKE too in pmap_object_init_pt.
|
#
3436 |
|
08-Oct-1994 |
phk |
db_disasm.c: Unused var zapped.
pmap.c: tons of unused vars zapped, various other warnings silenced.
trap.c: unused vars zapped.
vm_machdep.c: A wrong argument, which by chance did the right thing, was corrected.
|
#
2826 |
|
16-Sep-1994 |
dg |
Removed inclusion of pio.h and cpufunc.h (cpufunc.h is included from systm.h). Merged functionality of pio.h into cpufunc.h. Cleaned up some related code.
|
#
2492 |
|
04-Sep-1994 |
dg |
Added pmap_mapdev() function to map device memory.
|
#
2455 |
|
02-Sep-1994 |
dg |
Removed all vestiges of tlbflush(). Replaced them with calls to pmap_update(). Made pmap_update an inline assembly function.
|
#
2112 |
|
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
2056 |
|
13-Aug-1994 |
wollman |
Change all #includes to follow the current Berkeley style. Some of these ``changes'' are actually not changes at all, but CVS sometimes has trouble telling the difference.
This also includes support for second-directory compiles. This is not quite complete yet, as `config' doesn't yet do the right thing. You can still make it work trivially, however, by doing the following:
rm /sys/compile
mkdir /usr/obj/sys/compile
ln -s M-. /sys/compile
cd /sys/i386/conf
config MYKERNEL
cd ../../compile/MYKERNEL
ln -s /sys @
rm machine
ln -s @/i386/include machine
make depend
make
|
#
1896 |
|
07-Aug-1994 |
dg |
Made pmap_kenter "TLB safe". ...and then removed all the pmap_updates that are no longer needed because of this.
|
#
1895 |
|
07-Aug-1994 |
dg |
Provide support for upcoming merged VM/buffer cache, and fixed a few bugs that haven't appeared to manifest themselves (yet).
Submitted by: John Dyson
|
#
1890 |
|
06-Aug-1994 |
dg |
Fixed various prototype problems with the pmap functions and the subsequent problems that fixing them caused.
|
#
1887 |
|
06-Aug-1994 |
dg |
Incorporated post 1.1.5 work from John Dyson. This includes performance improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/pmap_kremove. These routines allow fast mapping of pages for those architectures that have "normal" MMUs. Also included is a fix to the pageout daemon to properly check a queue end condition.
Submitted by: John Dyson
|
#
1825 |
|
03-Aug-1994 |
dg |
Merged in post-1.1.5 work done by myself and John Dyson. This includes:
me:
1) TLB flush optimization that effectively eliminates half of all of the TLB flushes. This works by only flushing the TLB when a page is "present" in memory (i.e. the valid bit is set in the page table entry). See section 5.3.5 of the Intel 386 Programmer's Reference Manual.
2) The handling of "CMAP" has been improved to catch attempts at multiple simultaneous use.
John:
1) Added pmap_qenter/pmap_qremove functions for fast mapping of pages into the kernel. This is for future optimizations and support for the upcoming merged VM/buffer cache.
Reviewed by: John Dyson
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1440 |
|
02-May-1994 |
dg |
Removed some tlbflush optimizations as some of them were bogus and led to some strange behavior.
|
#
1379 |
|
20-Apr-1994 |
dg |
Bug fixes and performance improvements from John Dyson and myself:
1) check va before clearing the page clean flag. Not doing so was causing the vnode pager error 5 messages when paging from NFS. (pmap.c)
2) put back interrupt protection in idle_loop. Bruce didn't think it was necessary, John insists that it is (and I agree). (swtch.s)
3) various improvements to the clustering code (vm_machdep.c). It's now enabled/used by default.
4) bad disk blocks are now handled properly when doing clustered IOs. (wd.c, vm_machdep.c)
5) bogus bad block handling fixed in wd.c.
6) algorithm improvements to the pageout/pagescan daemons. It's amazing how well 4MB machines work now.
|
#
1362 |
|
14-Apr-1994 |
dg |
Changes from John Dyson and myself:
1) Removed all instances of disable_intr()/enable_intr() and changed them back to splimp/splx. The previous method was done to improve the performance, but Bruce's recent changes to inline spl* have made this unnecessary.
2) Cleaned up vm_machdep.c considerably. Probably fixed a few bugs, too.
3) Added a new mechanism for collecting page statistics - now done by a new system process "pagescan". Previously this was done by the pageout daemon, but this proved to be impractical.
4) Improved the page usage statistics gathering mechanism - performance is much improved in small memory machines.
5) Modified mbuf.h to enable the support for an external free routine when using mbuf clusters. Added appropriate glue in various places to allow this to work.
6) Adapted a suggested change to the NFS code from Yuval Yurom to take advantage of #5.
7) Added fault/swap statistics support.
|
#
1312 |
|
30-Mar-1994 |
dg |
New routine "pmap_kenter", designed to take advantage of the special case of the kernel pmap.
|
#
1262 |
|
14-Mar-1994 |
dg |
Performance improvements from John Dyson.
1) A new mechanism has been added to prevent pages from being paged out called "vm_page_hold". Similar to vm_page_wire, but much lower overhead.
2) Scheduling algorithm has been changed to improve interactive performance.
3) Paging algorithm improved.
4) Some vnode and swap pager bugs fixed.
|
#
1246 |
|
07-Mar-1994 |
dg |
1) "Pre-faulting" in of pages into process address space. Eliminates vm_fault overhead on process startup and mmap referenced data for in-memory pages.
(process startup time using in-memory segments *much* faster)
2) Even more efficient pmap code. Code partially cleaned up. More comments yet to follow.
(generally more efficient pte management)
3) Pageout clustering ( in addition to the FreeBSD V1.1 pagein clustering.)
(much faster paging performance on non-write behind disk subsystems, slightly faster performance on other systems.)
4) Slightly changed vm_pageout code for more efficiency and better statistics. Also, resist swapout a little more.
(less likely to pageout a recently used page)
5) Slight improvement to the page table page trap efficiency.
(generally faster system VM fault performance)
6) Defer creation of unnamed anonymous regions pager until needed.
(speeds up shared memory bss creation)
7) Remove possible deadlock from swap_pager initialization.
8) Enhanced procfs to provide "vminfo" about vm objects and user pmaps.
9) Increased MCLSHIFT/MCLBYTES from 2K to 4K to improve net & socket performance and to prepare for things to come.
John Dyson dyson@implode.root.com
David Greenman davidg@root.com
|
#
1151 |
|
13-Feb-1994 |
dg |
Fixed bug in handling of COW - the original code was bogus and it was only accidental that it worked. Also, don't cache non-managed pages.
|
#
1139 |
|
10-Feb-1994 |
dg |
Patch from John Dyson:
a pv chain was being traversed while interrupts were fully enabled in pmap_remove_all ... this is bogus, and has been fixed in pmap.c. (sorry for adding the splimp)
|
#
1124 |
|
08-Feb-1994 |
dg |
Fixes from John Dyson to fix out-of-memory hangs and other problems (such as increased swap space usage) related to (incorrectly) paging out the page tables.
|
#
1055 |
|
31-Jan-1994 |
dg |
Added a four-pattern memory test routine that is done at startup.
|
#
1045 |
|
31-Jan-1994 |
dg |
VM system performance improvements from John Dyson and myself. The following is a summary:
1) increased object cache back up to a more reasonable value.
2) removed old & bogus cruft from machdep.c (clearseg, copyseg, physcopyseg, etc).
3) inlined many functions in pmap.c
4) changed "load_cr3(rcr3())" into tlbflush() and made tlbflush inline assembly.
5) changed the way that modified pages are tracked - now vm_page struct is kept updated directly - no more scanning page tables.
6) removed lots of unnecessary spl's
7) removed old unused functions from pmap.c
8) removed all use of page_size, page_shift, page_mask variables - replaced with PAGE_ constants.
9) moved trunc/round_page, atop, ptoa, out of vm_param.h and into i386/include/param.h, and optimized them.
10) numerous changes to sys/vm/ swap_pager, vnode_pager, pageout, fault code to improve performance. LRU algorithm modified to be more effective, read ahead/behind values tuned for better performance, etc, etc...
|
#
1028 |
|
27-Jan-1994 |
dg |
Made pmap_is_managed a static inline function.
|
#
981 |
|
17-Jan-1994 |
dg |
Improvements mostly from John Dyson, with a little bit from me.
* Removed pmap_is_wired
* added extra cli/sti protection in idle (swtch.s)
* slight code improvement in trap.c
* added lots of comments
* improved paging and other algorithms in VM system
|
#
974 |
|
14-Jan-1994 |
dg |
"New" VM system from John Dyson & myself. For a run-down of the major changes, see the log of any affected file in the sys/vm directory (swap_pager.c for instance).
|
#
879 |
|
18-Dec-1993 |
wollman |
Make everything compile with -Wtraditional. Make it easier to distribute a binary link-kit. Make all non-optional options (pagers, procfs) standard, and update LINT to reflect new symtab requirements.
NB: -Wtraditional will henceforth be forgotten. This editing pass was primarily intended to detect any constructions where the old code might have been relying on traditional C semantics or syntax. These were all fixed, and the result of fixing some of them means that -Wall is now a realistic possibility within a few weeks.
|
#
856 |
|
13-Dec-1993 |
dg |
added some panics to catch the condition where pmap_pte returns null - indicating that the page table page is non-resident.
|
#
798 |
|
24-Nov-1993 |
wollman |
Make the LINT kernel compile with -W -Wreturn-type -Wcomment -Werror, and add same (sans -Werror) to Makefile for future compilations.
|
#
757 |
|
13-Nov-1993 |
dg |
First steps in rewriting locore.s, and making info useful when the machine panics.
i386/i386/locore.s:
1) Got rid of most .set directives that were being used like #define's, and replaced them with appropriate #define's in the appropriate header files (accessed via genassym).
2) Added comments to header inclusions, global definitions, and global variables.
3) Replaced some hardcoded constants with cpp defines (such as PDESIZE and others).
4) Aligned all comments to the same column to make them easier to read.
5) Moved macro definitions for ENTRY, ALIGN, NOP, etc. to /sys/i386/include/asmacros.h.
6) Added #ifdef BDE_DEBUGGER around all of Bruce's debugger code.
7) Added new global '_KERNend' to store last location+1 of kernel.
8) Cleaned up zeroing of bss so that only bss is zeroed.
9) Fixed zeroing of page tables so that it really does zero them all - not just if they follow the bss.
10) Rewrote page table initialization code so that it 1) works correctly and 2) write protects the kernel text by default.
11) Properly initialize the kernel page directory, upages, p0stack PT, and page tables. The previous scheme was more than a bit screwy.
12) Changed allocation of the virtual area of the IO hole so that it is fixed at KERNBASE + 0xa0000. The previous scheme put it right after the kernel page tables and then later expected it to be at KERNBASE + 0xa0000.
13) Changed multiple bogus settings of user read/write of various areas of kernel VM - including the IO hole; we should never be accessing the IO hole in user mode through the kernel page tables.
14) Split kernel support routines such as bcopy, bzero, copyin, copyout, etc. into a separate file 'support.s'.
15) Split swtch and related routines into a separate 'swtch.s'.
16) Split routines related to traps, syscalls, and interrupts into a separate file 'exception.s'.
17) Removed some unused global variables from locore that got inserted by Garrett when he pulled them out of some .h files.
i386/isa/icu.s:
1) Cleaned up global variable declarations.
2) Moved in the declarations of astpending and netisr.
i386/i386/pmap.c:
1) Fixed the calculation of virtual_avail. It was previously calculated to be right in the middle of the kernel page tables - not a good place to start allocating kernel VM.
2) Properly allocate kernel page dir/tables etc. out of the kernel map - previously only took out 2 pages.
i386/i386/machdep.c:
1) Modified boot() to print a warning that the system will reboot in PANIC_REBOOT_WAIT_TIME seconds, and let the user abort with a key on the console. The machine will wait forever if a key is typed before the reboot. The default is 15 seconds, but it can be set to 0 to mean don't wait at all, -1 to mean wait forever, or any positive value to wait for that many seconds.
2) Print "Rebooting..." just before doing it.
kern/subr_prf.c:
1) Removed PANICWAIT as it is deprecated by the change to machdep.c.
i386/i386/trap.c:
1) Added a table of trap type strings and use it to print a real trap/panic message rather than just a number. Lots of work to be done here, but this is the first step. Symbolic traceback is in the TODO.
i386/i386/Makefile.i386:
1) Added support to build support.s, exception.s, and swtch.s.
...and various changes to various header files to make all of the above happen.
|
#
608 |
|
15-Oct-1993 |
rgrimes |
genassym.c: Remove NKMEMCLUSTERS, it is no longer defined or used.
locore.s: Fix comment on PTDpde and APTDpde to be pde instead of pte. Add new equation for calculating the location of Sysmap. Remove Bill's old #ifdef garbage for counting up memory; that stuff will never be made to work and was just cluttering up the file.
Add code that places the PTD, page table pages, and kernel stack below the 640k ISA hole if there is room for it, otherwise put this stuff all at 1MB. This fixes the 28K bogusity in the boot blocks, that can now go away!
Fix the calculation of where first is to be dependent on NKPDE so that we can skip over the above mentioned areas. The 28K thing is now 44K in size due to the increase in kernel virtual memory space, but since we no longer have to worry about that, this is no big deal.
Use if NNPX > 0 instead of ifdef NPX for floating point code.
machdep.c Change the calculation for the buffer cache to be 20% of all memory above 2MB, and add back the upper limit of 2/5's of VM_KMEM_SIZE so that we do not eat ALL of the kernel memory space on large memory machines; note that this will not even come into effect unless you have more than 32MB. The current buffer cache limit is 6.7MB due to this calculation.
It seems that we were erroneously allocating bufpages pages for buffer_map. buffer_map is UNUSED in this implementation of the buffer cache, but since the map is referenced in several if statements, a quick fix was to simply allocate 1 vm page (but no real memory) to it.
pmap.h Remove rcsid, don't want them in the kernel files!
Removed some cruft inside an #ifdef DEBUGx that caused compiler errors if you were compiling this for debug.
Use the #defines for PD_SHIFT and PG_SHIFT in place of constants.
trap.c: Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code.
Remove a now completely invalid check for a maximum virtual address; the virtual address now ends at 0xFFFFFFFF so there is no more MAX!! (Thanks David, I completely missed that one!)
vm_machdep.c Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code.
Replace several 0xFE00000 constants with KERNBASE
|
#
589 |
|
12-Oct-1993 |
rgrimes |
KPTDI_LAST renamed to KPTDI
|
#
587 |
|
12-Oct-1993 |
rgrimes |
Eliminate definition of I386_PAGE_SIZE and use NBPG instead. Replace 0xFE000000 constants with KERNBASE. Use new definition NKPDE in place of a first-last+1 calculation.
|
#
528 |
|
30-Sep-1993 |
rgrimes |
This is a fix for the 32K DMA buffer region that was not accounted for; it relocates it to be after the BIOS memory hole instead of right below the 640K limit. THANK YOU CHRIS!!!
From: <cgd@postgres.Berkeley.EDU> Date: Wed, 29 Sep 93 18:49:58 -0700 basically, reserve a new 32k space right after firstaddr, and put the buffer space there...
the diffs are below, and are in ~cgd/sys/i386/i386 (in machdep.c) on freefall. i obviously can't test them, so if some of you would look the diffs over and try them out...
|
#
424 |
|
08-Sep-1993 |
rgrimes |
Changed the pg("ptdi> %x") to a printf and then a panic, since we are going to panic shortly after this anyway. Destroys less state, and keeps the machine from waiting for someone to smash the return key a few times before it panics!
|
#
200 |
|
27-Jul-1993 |
dg |
* Applied fixes from Bruce Evans to fix COW bugs, >1MB kernel loading, profiling, and various protection checks that cause security holes and system crashes.
* Changed min/max/bcmp/ffs/strlen to be static inline functions - included from cpufunc.h via systm.h. This change improves performance in many parts of the kernel - up to 5% in the networking layer alone. Note that this requires systm.h to be included in any file that uses these functions, otherwise it won't be able to find them during the load.
* Fixed incorrect call to splx() in if_is.c.
* Fixed bogus variable assignment to splx() in if_ed.c.
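The static-inline conversion described above is the pattern of defining small helpers in a header so the compiler can expand them at each call site instead of emitting a function call. The sketch below illustrates that pattern only; the names (imin, imax, bit_ffs) and bodies are assumptions for illustration, not the actual 1993 cpufunc.h contents, which used i386 inline assembly (e.g. bsfl for ffs):

```c
/* Small helpers defined "static inline" in a header so every
 * translation unit that includes it gets an inlinable copy. */
static inline int imin(int a, int b) { return (a < b ? a : b); }
static inline int imax(int a, int b) { return (a > b ? a : b); }

/* ffs-style helper: 1-based index of the least-significant set bit,
 * or 0 if no bit is set. */
static inline int
bit_ffs(unsigned int mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; (mask & 1) == 0; bit++)
		mask >>= 1;
	return (bit);
}
```

Because the definitions live in the header, any file calling them must include that header (via systm.h in the commit above), which is exactly the build requirement the log message warns about.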
|
#
5 |
|
12-Jun-1993 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r4, which included commits to RCS files with non-trunk default branches.
|
#
4 |
|
12-Jun-1993 |
rgrimes |
Initial import, 0.1 + pk 0.2.4-B1
|