#
f1d73aac |
|
02-Jun-2024 |
Alan Cox <alc@FreeBSD.org> |
pmap: Skip some superpage promotion attempts that will fail Implement a simple heuristic to skip pointless promotion attempts by pmap_enter_quick_locked() and moea64_enter(). Specifically, when vm_fault() calls pmap_enter_quick() to map neighboring pages at the end of a copy-on-write fault, there is no point in attempting promotion in pmap_enter_quick_locked() and moea64_enter(). Promotion will fail because the base pages have differing protection. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D45431 MFC after: 1 week
|
#
deab5717 |
|
27-May-2024 |
Mitchell Horne <mhorne@FreeBSD.org> |
Adjust comments referencing vm_mem_init() I cannot find a time where the function was not named this. Reviewed by: kib, markj MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D45383
|
#
fd1aa5b3 |
|
02-Feb-2024 |
John Baldwin <jhb@FreeBSD.org> |
x86: Consistently pass true/false to is_pde parameter of pmap_cache_bits Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D43692
|
#
1f1b2286 |
|
31-Jan-2024 |
John Baldwin <jhb@FreeBSD.org> |
pmap: Convert boolean_t to bool. Reviewed by: kib (older version) Differential Revision: https://reviews.freebsd.org/D39921
|
#
29363fb4 |
|
23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree via automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
|
#
02320f64 |
|
19-Oct-2023 |
Zhenlei Huang <zlei@FreeBSD.org> |
pmap: Prefer consistent naming for loader tunable The sysctl knob 'vm.pmap.pv_entry_max' became a loader tunable in 7ff48af7040f (Allow a specific setting for pv entries), but it is fetched from the system environment as 'vm.pmap.pv_entries'. That is inconsistent and obscure. This reverts 36e1b9702e21 (Correct the tunable name in the message). PR: 231577 Reviewed by: jhibbits, alc, kib MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D42274
|
#
6c1d6d4c |
|
08-Oct-2023 |
Bojan Novković <bojan.novkovic@fer.hr> |
i386: Add a leaf PTP when pmap_enter(psind=1) creates a wired mapping Let pmap_enter_pde() create wired mappings. In particular, allocate a leaf PTP for use during demotion. This is a step towards reverting commit 64087fd7f372. Reviewed by: alc, kib, markj Sponsored by: Google, Inc. (GSoC 2023) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D41635
|
#
902ed64f |
|
24-Sep-2023 |
Alan Cox <alc@FreeBSD.org> |
i386 pmap: Adapt recent amd64/arm64 superpage improvements Don't recompute mpte during promotion. Optimize MADV_WILLNEED on existing superpages. Standardize promotion conditions across amd64, arm64, and i386. Stop requiring the accessed bit for superpage promotion. Tidy up pmap_promote_pde() calls. Retire PMAP_INLINE. It's no longer used. Note: Some of these changes are a prerequisite to fixing a panic that arises when attempting to create a wired superpage mapping by pmap_enter(psind=1) (as opposed to promotion). Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D41944
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
3e04ae43 |
|
14-Jul-2023 |
Doug Moore <dougm@FreeBSD.org> |
vm_radix_init: use initializer Several vm_radix tries are not initialized with vm_radix_init. That works, for now, since static initialization zeroes the root field anyway, but if initialization changes, these tries will fail. Add missing initializer calls. Reviewed by: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D40971
|
#
934bfc12 |
|
17-Oct-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
Add vm_page_any_valid() Use it and several other vm_page_*_valid() functions in more places. Suggested and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37024
|
#
4d90a5af |
|
07-Oct-2022 |
John Baldwin <jhb@FreeBSD.org> |
sys: Consolidate common implementation details of PV entries. Add a <sys/_pv_entry.h> intended for use in <machine/pmap.h> to define struct pv_entry, pv_chunk, and related macros and inline functions. Note that powerpc does not yet use this: while the mmu_radix pmap in powerpc uses the new scheme (albeit with fewer PV entries in a chunk than normal due to an unused pv_pmap field in struct pv_entry), the Book-E pmaps for powerpc use the older style PV entries without chunks (and thus require the pv_pmap field). Suggested by: kib Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D36685
|
#
f49fd63a |
|
22-Sep-2022 |
John Baldwin <jhb@FreeBSD.org> |
kmem_malloc/free: Use void * instead of vm_offset_t for kernel pointers. Reviewed by: kib, markj Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D36549
|
#
7ae99f80 |
|
22-Sep-2022 |
John Baldwin <jhb@FreeBSD.org> |
pmap_unmapdev/bios: Accept a pointer instead of a vm_offset_t. This matches the return type of pmap_mapdev/bios. Reviewed by: kib, markj Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D36548
|
#
e6639073 |
|
23-Aug-2022 |
John Baldwin <jhb@FreeBSD.org> |
Define _NPCM and the last PC_FREEn constant in terms of _NPCPV. This applies one of the changes from 5567d6b4419b02a2099527228b1a51cc55a5b47d to other architectures besides arm64. Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D36263
|
#
f5ad538d |
|
18-Jul-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
i386: fix pmap_trm_arena_last atomic load type Sponsored by: Rubicon Communications, LLC ("Netgate")
|
#
9a22f7fb |
|
09-Apr-2022 |
Gordon Bergling <gbe@FreeBSD.org> |
i386: Remove a double word in a source code comment - s/an an/an/ MFC after: 3 days
|
#
c1480b17 |
|
08-Apr-2022 |
John Baldwin <jhb@FreeBSD.org> |
i386 pmap: Re-quiet set but unused warnings. __diagused no longer covers KTR, so use explicit #ifdef KTR instead.
|
#
3c942808 |
|
05-Jan-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
Silence some warnings for the i386 kernel build Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
e2650af1 |
|
29-Dec-2021 |
Stefan Eßer <se@FreeBSD.org> |
Make CPU_SET macros compliant with other implementations

The introduction of <sched.h> improved compatibility with some 3rd-party software, but caused the configure scripts of some ports to assume that they were run in a GLIBC-compatible environment. Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being added to ports, but there still were compatibility issues due to invalid assumptions made in autoconfigure scripts.

The difference between the FreeBSD version of macros like CPU_AND, CPU_OR, etc. and the GLIBC versions was in the number of arguments: FreeBSD used a 2-address scheme (one source argument is also used as the destination of the operation), while GLIBC uses a 3-address scheme (2 source operands and a separately passed destination). The GLIBC scheme provides a super-set of the functionality of the FreeBSD macros, since it does not prevent passing the same variable as source and destination arguments. In code that wanted to preserve both source arguments, the FreeBSD macros required a temporary copy of one of the source arguments.

This patch set makes it possible to unconditionally provide the functions and macros expected by 3rd-party software written for GLIBC-based systems, but breaks builds of externally maintained sources that use any of the following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR. One contributed driver (contrib/ofed/libmlx5) has been patched to support both the old and the new CPU_OR signatures. If this commit is merged to -STABLE, the version test will have to be extended to cover more ranges. Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT no longer require that option.

The FreeBSD version has been bumped to 1400046 to reflect this incompatible change.

Reviewed by: kib
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D33451
|
#
ff93447d |
|
19-Oct-2021 |
Mark Johnston <markj@FreeBSD.org> |
Use the vm_radix_init() helper when initializing pmaps No functional change intended. Reviewed by: alc, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32527
|
#
a4667e09 |
|
19-Oct-2021 |
Mark Johnston <markj@FreeBSD.org> |
Convert vm_page_alloc() callers to use vm_page_alloc_noobj(). Remove page zeroing code from consumers and stop specifying VM_ALLOC_NOOBJ. In a few places, also convert an allocation loop to simply use VM_ALLOC_WAITOK. Similarly, convert vm_page_alloc_domain() callers. Note that callers are now responsible for assigning the pindex. Reviewed by: alc, hselasky, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31986
|
#
b092c58c |
|
14-Jul-2021 |
Mark Johnston <markj@FreeBSD.org> |
Assert that valid PTEs are not overwritten when installing a new PTP amd64 and 32-bit ARM already had assertions to this effect. Add them to other pmaps. Reviewed by: alc, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31171
|
#
67932460 |
|
17-Feb-2021 |
John Baldwin <jhb@FreeBSD.org> |
Add a VA_IS_CLEANMAP() macro. This macro returns true if a provided virtual address is contained in the kernel's clean submap. In CHERI kernels, the buffer cache and transient I/O map are allocated as separate regions. Abstracting this check reduces the diff relative to FreeBSD. It is perhaps slightly more readable as well. Reviewed by: kib Obtained from: CheriBSD Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D28710
|
#
847ab36b |
|
02-Sep-2020 |
Mark Johnston <markj@FreeBSD.org> |
Include the psind in data returned by mincore(2).

Currently we use a single bit to indicate whether the virtual page is part of a superpage. To support a forthcoming implementation of non-transparent 1GB superpages, it is useful to provide more detailed information about large page sizes.

The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind) values, indicating a mapping of size psind, where psind is an index into the pagesizes array returned by getpagesizes(3), which in turn comes from the hw.pagesizes sysctl. MINCORE_PSIND(1) is equal to the old value of MINCORE_SUPER.

For now, two bits are used to record the page size, permitting values of MAXPAGESIZES up to 4.

Reviewed by: alc, kib
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26238
|
#
ed83a561 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
i386: clean up empty lines in .c and .h files
|
#
4ae224c6 |
|
16-Jul-2020 |
Conrad Meyer <cem@FreeBSD.org> |
Revert r240317 to prevent leaking pmap entries

Subsequent to r240317, kmem_free() was replaced with kva_free() (r254025). kva_free() releases the KVA allocation for the mapped region, but no longer clears the pmap (pagetable) entries. An affected pmap_unmapdev operation would leave the still-pmap'd VA space free for allocation by other KVA consumers.

However, this bug easily avoided notice for ~7 years because most devices (1) never call pmap_unmapdev and (2) on amd64, mostly fit within the DMAP and do not need KVA allocations. The other affected architectures are less popular: i386, MIPS, and PowerPC. Arm64, arm32, and riscv are not affected.

Reported by: Don Morris <dgmorris AT earthlink.net>
Submitted by: Don Morris (amd64 part)
Reviewed by: kib, markj, Don (!amd64 parts)
MFC after: I don't intend to, but you might want to
Sponsored by: Dell Isilon
Differential Revision: https://reviews.freebsd.org/D25689
|
#
3b23ffe2 |
|
10-Jun-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
amd64 pmap: reorder IPI send and local TLB flush in TLB invalidations.

Right now the code first flushes all local TLB entries that need to be flushed, then signals IPI to remote cores, and then waits for acknowledgements while spinning idle. In the VMWare article 'Don’t shoot down TLB shootdowns!' it was noted that the time spent spinning is lost, and can be more usefully spent doing local TLB invalidation.

We could use the same invalidation handler for local TLB as for remote, but typically for pmap == curpmap we can use INVLPG for locals instead of INVPCID on remotes, since we cannot control context switches on them. Due to that, keep the local code and provide the callbacks to be called from smp_targeted_tlb_shootdown() after IPIs are fired but before the spin wait starts.

Reviewed by: alc, cem, markj, Anton Rang <rang at acm.org>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D25188
|
#
81302f1d |
|
28-May-2020 |
Mark Johnston <markj@FreeBSD.org> |
Fix boot on systems where NUMA domain 0 is unpopulated.

- Add vm_phys_early_add_seg(), complementing vm_phys_early_alloc(), to ensure that segments registered during hammer_time() are placed in the right domain. Otherwise, since the SRAT is not parsed at that point, we just add them to domain 0, which may be incorrect and results in a domain with only several MB worth of memory.
- Fix uma_startup1() to try allocating memory for zones from any domain. If domain 0 is unpopulated, the allocation will simply fail, resulting in a page fault slightly later during boot.
- Change _vm_phys_domain() to return -1 for addresses not covered by the affinity table, and change vm_phys_early_alloc() to handle wildcard domains. This is necessary on amd64, where the page array is dense and pmap_page_array_startup() may allocate page table pages for non-existent page frames.

Reported and tested by: Rafael Kitover <rkitover@gmail.com>
Reviewed by: cem (earlier version), kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25001
|
#
10c8fb47 |
|
04-Feb-2020 |
Ryan Libby <rlibby@FreeBSD.org> |
uma: convert mbuf_jumbo_alloc to UMA_ZONE_CONTIG & tag others Remove mbuf_jumbo_alloc and let large mbuf zones use the new uma default contig allocator (a copy of mbuf_jumbo_alloc). Tag other zones which require contiguous objects, even if they don't use the new default contig allocator, so that uma knows about their constraints. Reviewed by: jeff, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23238
|
#
f3e982e7 |
|
07-Jan-2020 |
Mark Johnston <markj@FreeBSD.org> |
Define a unified pmap structure for i386. The overloading of struct pmap for PAE and non-PAE pmaps results in three distinct layouts for the structure, which is embedded in struct vmspace. This causes a large number of duplicate structure definitions in the i386 kernel's CTF type graph. Since most pmap fields are the same in the two pmaps, simply provide side-by-side variants of the fields that are distinct, using fixed-size types. PR: 242689 Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22896
|
#
e8bbca1b |
|
07-Jan-2020 |
Mark Johnston <markj@FreeBSD.org> |
Consistently use pmap_t instead of struct pmap *. MFC after: 3 days Sponsored by: The FreeBSD Foundation
|
#
1c3a2410 |
|
04-Jan-2020 |
Alan Cox <alc@FreeBSD.org> |
When a copy-on-write fault occurs, pmap_enter() is called on to replace the mapping to the old read-only page with a mapping to the new read-write page. To destroy the old mapping, pmap_enter() must destroy its page table and PV entries and invalidate its TLB entry. This change simply invalidates that TLB entry a little earlier, specifically, on amd64 and arm64, before the PV list lock is held. Reviewed by: kib, markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23027
|
#
e1ab4060 |
|
28-Dec-2019 |
Alan Cox <alc@FreeBSD.org> |
Correctly implement PMAP_ENTER_NOREPLACE in pmap_enter_{l2,pde}() on kernel mappings. Reduce code duplication by defining a function, pmap_abort_ptp(), for handling a common error case. Simplify error handling in pmap_enter_quick_locked(). Reviewed by: kib Tested by: pho MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D22890
|
#
5cff1f4d |
|
10-Dec-2019 |
Mark Johnston <markj@FreeBSD.org> |
Introduce vm_page_astate. This is a 32-bit structure embedded in each vm_page, consisting mostly of page queue state. The use of a structure makes it easy to store a snapshot of a page's queue state in a stack variable and use cmpset loops to update that state without requiring the page lock. This change merely adds the structure and updates references to atomic state fields. No functional change intended. Reviewed by: alc, jeff, kib Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D22650
|
#
01cef4ca |
|
16-Oct-2019 |
Mark Johnston <markj@FreeBSD.org> |
Remove page locking from pmap_mincore(). After r352110 the page lock no longer protects a page's identity, so there is no purpose in locking the page in pmap_mincore(). Instead, if vm.mincore_mapped is set to the non-default value of 0, re-lookup the page after acquiring its object lock, which holds the page's identity stable. The change removes the last callers of vm_page_pa_tryrelock(), so remove it. Reviewed by: kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21823
|
#
638f8678 |
|
14-Oct-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
(6/6) Convert pmap to expect busy in write-related operations, now that all callers hold it. This simplifies pmap code and removes a dependency on the object lock. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21596
|
#
205be21d |
|
14-Oct-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
(3/6) Add a shared object busy synchronization mechanism that blocks new page busy acquires while held. This allows code that would need to acquire and release a very large number of page busy locks to use the old mechanism where busy is only checked and not held. This comes at the cost of false positives but never false negatives which the single consumer, vm_fault_soft_fast(), handles. Reviewed by: kib Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21592
|
#
b119329d |
|
25-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Complete the removal of the "wire_count" field from struct vm_page. Convert all remaining references to that field to "ref_count" and update comments accordingly. No functional change intended. Reviewed by: alc, kib Sponsored by: Intel, Netflix Differential Revision: https://reviews.freebsd.org/D21768
|
#
66eb1d63 |
|
22-Sep-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
i386: reduce differences in source between PAE and non-PAE pmaps ... by defining pg_nx as zero for non-PAE and correspondingly simplifying some expressions. Suggested and reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21757
|
#
b223a692 |
|
22-Sep-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
i386: implement sysctl vm.pmap.kernel_maps. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21739
|
#
e8bcf696 |
|
16-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Revert r352406, which contained changes I didn't intend to commit.
|
#
41fd4b94 |
|
16-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Fix a couple of nits in r352110. - Remove a dead variable from the amd64 pmap_extract_and_hold(). - Fix grammar in the vm_page_wire man page. Reported by: alc Reviewed by: alc, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21639
|
#
fee2a2fa |
|
09-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Change synchronization rules for vm_page reference counting.

There are several mechanisms by which a vm_page reference is held, preventing the page from being freed back to the page allocator. In particular, holding the page's object lock is sufficient to prevent the page from being freed; holding the busy lock or a wiring is sufficient as well. These references are protected by the page lock, which must therefore be acquired for many per-page operations. This results in false sharing since the page locks are external to the vm_page structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter. The object's reference is counted using a flag bit in the counter. A second flag bit is used to atomically block new references via pmap_extract_and_hold() while removing managed mappings of a page. Thus, the reference count of a page is guaranteed not to increase if the page is unbusied, unmapped, and the object's write lock is held. As a consequence of this, the page lock no longer protects a page's identity; operations which move pages between objects are now synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed. The former requires that either the object lock or the busy lock is held. The latter no longer has a return value and may free the page if it releases the last reference to that page. vm_page_unwire_noq() behaves the same as before; the caller is responsible for checking its return value and freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is introduced for use in pmap_extract_and_hold(). It fails if the page is concurrently being unmapped, typically triggering a fallback to the fault handler. vm_page_wire() no longer requires the page lock and vm_page_unwire() now internally acquires the page lock when releasing the last wiring of a page (since the page lock still protects a page's queue state). In particular, synchronization details are no longer leaked into the caller.

The change excises the page lock from several frequently executed code paths. In particular, vm_object_terminate() no longer bounces between page locks as it releases an object's pages, and direct I/O and sendfile(SF_NOCACHE) completions no longer require the page lock. In these latter cases we now get linear scalability in the common scenario where different threads are operating on different files.

__FreeBSD_version is bumped. The DRM ports have been updated to accommodate the KPI changes.

Reviewed by: jeff (earlier version)
Tested by: gallatin (earlier version), pho
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20486
|
#
c45cbc7a |
|
02-Aug-2019 |
John Baldwin <jhb@FreeBSD.org> |
Don't reset memory attributes when mapping physical addresses for ACPI.

Previously, AcpiOsMemory was using pmap_mapbios which would always map the requested address Write-Back (WB). For several AMD Ryzen laptops, the BIOS uses AcpiOsMemory to directly access the PCI MCFG region in order to access PCI config registers. This has the side effect of remapping the MCFG region in the direct map as WB instead of UC, hanging the laptops during boot.

On the one laptop I examined in detail, the _PIC global method used to switch from 8259A PICs to I/O APICs uses a pair of PCI config space registers at offset 0x84 in the device at 0:0:0 as a pair of address/data registers to access an indirect register in the chipset and clear a single bit to switch modes.

To fix, alter the semantics of pmap_mapbios() such that it does not modify the attributes of any existing mappings and instead uses the existing attributes. If a new mapping is created, this new mapping uses WB (the default memory attribute).

Special thanks to the gentleman whose name I don't have who brought two affected laptops to the hacker lounge at BSDCan. Direct access to the affected systems permitted finding the root cause within an hour or so.

PR: 231760, 236899
Reviewed by: kib, alc
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20327
|
#
43ded0a3 |
|
30-Jul-2019 |
Alan Cox <alc@FreeBSD.org> |
In pmap_advise(), when we encounter a superpage mapping, we first demote the mapping and then destroy one of the 4 KB page mappings so that there is a potential trigger for repromotion. Currently, we destroy the first 4 KB page mapping that falls within the (current) superpage mapping or the virtual address range [sva, eva). However, I have found empirically that destroying the last 4 KB mapping produces slightly better results, specifically, more promotions and fewer failed promotion attempts. Accordingly, this revision changes pmap_advise() to destroy the last 4 KB page mapping. It also replaces some nearby uses of boolean_t with bool. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21115
|
#
5d18382b |
|
25-Jul-2019 |
Alan Cox <alc@FreeBSD.org> |
Simplify the handling of superpages in pmap_clear_modify(). Specifically, if a demotion succeeds, then all of the 4KB page mappings within the superpage-sized region must be valid, so there is no point in testing the validity of the 4KB page mapping that is going to be write protected. Deindent the nearby code. Reviewed by: kib, markj Tested by: pho (amd64, i386) X-MFC after: r350004 (this change depends on arm64 dirty bit emulation) Differential Revision: https://reviews.freebsd.org/D21027
|
#
43184d8e |
|
15-Jul-2019 |
Alan Cox <alc@FreeBSD.org> |
Revert r349973. Upon further reflection, I realized that the comment deleted by r349973 is still valid on i386. Restore it. Discussed with: markj
|
#
f1384063 |
|
13-Jul-2019 |
Alan Cox <alc@FreeBSD.org> |
Remove a stale comment. Reported by: markj MFC after: 1 week
|
#
eeacb3b0 |
|
08-Jul-2019 |
Mark Johnston <markj@FreeBSD.org> |
Merge the vm_page hold and wire mechanisms.

The hold_count and wire_count fields of struct vm_page are separate reference counters with similar semantics. The remaining essential differences are that holds are not counted as a reference with respect to LRU, and holds have an implicit free-on-last-unhold semantic, whereas vm_page_unwire() callers must explicitly determine whether to free the page once the last reference to the page is released.

This change removes the KPIs which directly manipulate hold_count. Functions such as vm_fault_quick_hold_pages() now return wired pages instead. Since r328977 the overhead of maintaining LRU for wired pages is lower, and in many cases vm_fault_quick_hold_pages() callers would swap holds for wirings on the returned pages anyway, so with this change we remove a number of page lock acquisitions.

No functional change is intended. __FreeBSD_version is bumped.

Reviewed by: alc, kib
Discussed with: jeff
Discussed with: jhb, np (cxgbe)
Tested by: pho (previous version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19247
|
#
c134ef74 |
|
28-Jun-2019 |
Alan Cox <alc@FreeBSD.org> |
When we protect PTEs (as opposed to PDEs), we only call vm_page_dirty() when, in fact, we are write protecting the page and the PTE has PG_M set. However, pmap_protect_pde() was always calling vm_page_dirty() when the PDE has PG_M set. So, adding PG_NX to a writeable PDE could result in unnecessary (but harmless) calls to vm_page_dirty(). Simplify the loop calling vm_page_dirty() in pmap_protect_pde(). Reviewed by: kib, markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D20793
|
#
fd2dae0a |
|
08-Jun-2019 |
Alan Cox <alc@FreeBSD.org> |
Implement an alternative solution to the amd64 and i386 pmap problem that we previously addressed in r348246.

This pmap problem also exists on arm64 and riscv. However, the original solution developed for amd64 and i386 cannot be used on arm64 and riscv. In particular, arm64 and riscv do not define a PG_PROMOTED flag in their level 2 PTEs. (A PG_PROMOTED flag makes no sense on arm64, where unlike x86 or riscv we are required to break the old 4KB mappings before making the 2MB mapping; and on riscv there are no unused bits in the PTE to define a PG_PROMOTED flag.)

This commit implements an alternative solution that can be used on all four architectures. Moreover, this solution has two other advantages. First, on older AMD processors that required the Erratum 383 workaround, it is less costly. Specifically, it avoids unnecessary calls to pmap_fill_ptp() on a superpage demotion. Second, it enables the elimination of some calls to pagezero() in pmap_kernel_remove_{l2,pde}().

In addition, remove a related stale comment from pmap_enter_{l2,pde}().

Reviewed by: kib, markj (an earlier version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20538
|
#
88ea538a |
|
07-Jun-2019 |
Mark Johnston <markj@FreeBSD.org> |
Replace uses of vm_page_unwire(m, PQ_NONE) with vm_page_unwire_noq(m). These calls are not the same in general: the former will dequeue the page if it is enqueued, while the latter will just leave it alone. But, all existing uses of the former apply to unmanaged pages, which are never enqueued in the first place. No functional change intended. Reviewed by: kib MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20470
|
#
d7bb6218 |
|
25-May-2019 |
Mark Johnston <markj@FreeBSD.org> |
Remove pmap_pid_dump() from the i386 pmap. It has not been compilable in a long time and doesn't seem very useful. Suggested by: kib MFC after: 1 week
|
#
a9c7546a |
|
24-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix a corner case in demotion of kernel mappings.

It is possible for the kernel mapping to be created with a superpage by directly installing the pde using pmap_enter_2mpage() without filling the corresponding page table page. This can happen e.g. if the range is already backed by a reservation and the vm_fault_soft_fast() conditions are satisfied, which was observed on the pipe_map. In this case, demotion must fill the page obtained from the pmap radix, same as if the page is newly allocated.

Use the PG_PROMOTED bit as an indicator that the page is valid, instead of the wire count of the page table page. Since the PG_PROMOTED bit is set on the pde when we leave TLB entries for 4k pages around, which in particular means that the ptes were filled, it provides a more correct indicator. Note that pmap_protect_pde() clears PG_PROMOTED, which handles the case when protection was changed on the superpage without adjusting the ptes.

Reported by: pho
In collaboration with: alc
Tested by: alc, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D20380
|
#
64087fd7 |
|
21-Mar-2019 |
Mark Johnston <markj@FreeBSD.org> |
Disallow preemptive creation of wired superpage mappings. There are some unusual cases where a process may cause an mlock()ed range of memory to be unmapped. If the application subsequently faults on that region, the handler may attempt to create a superpage mapping backed by the resident, wired pages. However, the pmap code responsible for creating such a mapping (pmap_enter_pde() on i386 and amd64) does not ensure that a leaf page table page is available if the superpage is later demoted; the demotion operation must therefore perform a non-blocking page allocation and must unmap the entire superpage if the allocation fails. The pmap layer ensures that this can never happen for wired mappings, and so the case described above breaks that invariant. For now, simply ensure that the MI fault handler never attempts to create a wired superpage except via promotion. Reviewed by: kib Reported by: syzbot+292d3b0416c27c131505@syzkaller.appspotmail.com MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19670
|
#
bced332a |
|
28-Feb-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Invalidate cache for the PDPTE page when using PAE paging but PAT is not supported. According to SDM rev. 69 vol. 3, for PDPTE register loads: when PAT is not supported, access to the PDPTE page is performed as UC, see 4.9.1; when PAT is supported, the access is WB, see 4.9.2. So the CPU might potentially load stale memory as PDPTEs if both PAT and self-snoop are not implemented. To be safe, add a total local cache flush to pmap_cold() before the initial load of cr3, and flush the PDPTE page in pmap_pinit() if PAT is not implemented. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D19365
|
#
d5f2c1e4 |
|
26-Feb-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
i386 PAE: avoid an atomic for pte_store() where possible. Instead, carefully write the upper word, and only then the lower word with PG_V, for previously invalid ptes. This provides some measurable system time savings on buildworld. Reviewed by: markj Tested by: pho Measured by: bde (early version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D19226
|
#
eb785fab |
|
06-Feb-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Port the sysctl kern.elf32.read_exec from amd64 to i386. Make it more comprehensive on i386 by not setting the nx bit for any mapping, not just adding PF_X to all kernel-loaded ELF segments. This is needed for compatibility with older i386 programs that assume that read access implies exec, e.g. old X servers with hand-rolled module loaders. Reported and tested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
9a527560 |
|
29-Jan-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
i386: Merge PAE and non-PAE pmaps into the same kernel. Effectively all i386 kernels now have two pmaps compiled in: one managing PAE pagetables, and another non-PAE. The implementation is selected at cold time depending on the CPU features. The vm_paddr_t is always 64bit now. As a result, the nx bit can be used on all capable CPUs. The PAE option only affects the bus_addr_t: it is still 32bit for non-PAE configs, for driver compatibility. Kernel layout, esp. the max kernel address, low memory PDEs and the max user address (same as the trampoline start) are now the same for PAE and for non-PAE regardless of the type of page tables used. A non-PAE kernel (when using PAE pagetables) can handle physical memory up to 24G now; larger memory requires re-tuning the KVA consumers, so instead the code caps the maximum at 24G. Unfortunately, a lot of drivers do not use busdma(9) properly, so by default even the 4G barrier is not easy. There are two tunables added: hw.above4g_allow and hw.above24g_allow; the first one is kept enabled for now to evaluate the status on HEAD, the second is only for dev use. i386 now creates three freelists if there is any memory above 4G, to allow proper bounce page allocation. Also, VM_KMEM_SIZE_SCALE changed from 3 to 1. The PAE_TABLES kernel config option is retired. In collaboration with: pho Discussed with: emaste Reviewed by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D18894
|
#
d3f40307 |
|
03-Jan-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix typo in r342710. Noted by: lidl MFC after: 3 days
|
#
9bfc7fa4 |
|
02-Jan-2019 |
Mark Johnston <markj@FreeBSD.org> |
Avoid setting PG_U unconditionally in pmap_enter_quick_locked(). This KPI may in principle be used to create kernel mappings, in which case we certainly should not be setting PG_U. In any case, PG_U must be set on all layers in the page tables to grant user mode access, and we were only setting it on leaf entries. Thus, this change should have no functional impact. Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
#
e68d4438 |
|
31-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Update comments: paging is initialized in pmap_cold(). MFC after: 3 days Sponsored by: The FreeBSD Foundation
|
#
05d5652a |
|
29-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
i386: Fix the allocation of the KVA frame for pmap_quick_enter_page(). Due to a typo, it shared the frame with the CMAP1 transient mapping. In collaboration with: pho MFC after: 3 days Sponsored by: The FreeBSD Foundation (kib)
|
#
36e1b970 |
|
01-Dec-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Correct the tunable name in the message. Submitted by: Andre Albsmeier <mail@fbsd.e4m.org> PR: 231577 MFC after: 1 week
|
#
9978bd99 |
|
30-Oct-2018 |
Mark Johnston <markj@FreeBSD.org> |
Add malloc_domainset(9) and _domainset variants to other allocator KPIs. Remove malloc_domain(9) and most other _domain KPIs added in r327900. The new functions allow the caller to specify a general NUMA domain selection policy, rather than specifically requesting an allocation from a specific domain. The latter policy tends to interact poorly with M_WAITOK, resulting in situations where a caller is blocked indefinitely because the specified domain is depleted. Most existing consumers of the _domain KPIs are converted to instead use a DOMAINSET_PREF() policy, in which we fall back to other domains to satisfy the allocation request. This change also defines a set of DOMAINSET_FIXED() policies, which only permit allocations from the specified domain. Discussed with: gallatin, jeff Reported and tested by: pho (previous version) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17418
|
#
36209a40 |
|
20-Oct-2018 |
Mark Johnston <markj@FreeBSD.org> |
Add an assertion to pmap_enter(). When modifying an existing managed mapping, we should find a PV entry for the old mapping. Verify this. Before r335784 this would have been implicitly tested by the fact that we always freed the PV entry for the old mapping. Reviewed by: alc, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D17626
|
#
cb4961ab |
|
01-Oct-2018 |
Mark Johnston <markj@FreeBSD.org> |
Apply r339046 to i386. Belatedly add a comment to the amd64 pmap explaining why we initialize the kernel pmap's resident page count. Reviewed by: alc, kib Approved by: re (gjb) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17377
|
#
5f11ee20 |
|
29-Sep-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix UP build. Reported by: tijl Sponsored by: The FreeBSD Foundation Approved by: re (rgrimes)
|
#
d12c4465 |
|
19-Sep-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Convert x86 cache invalidation functions to ifuncs. This simplifies the runtime logic and reduces the number of runtime-constant branches. Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation Approved by: re (gjb) Differential revision: https://reviews.freebsd.org/D16736
|
#
f0165b1c |
|
28-Aug-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove {max/min}_offset() macros, use vm_map_{max/min}() inlines. Exposing max_offset and min_offset defines in public headers is causing clashes with variable names, for example when building QEMU. Based on the submission by: royger Reviewed by: alc, markj (previous version) Sponsored by: The FreeBSD Foundation (kib) MFC after: 1 week Approved by: re (marius) Differential revision: https://reviews.freebsd.org/D16881
|
#
60b74234 |
|
25-Aug-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Unify amd64 and i386 vmspace0 pmap activation. Add pmap_activate_boot() for i386, move the invocation on APs from MD init_secondary() to x86 init_secondary_tail(). Suggested by: alc Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation Approved by: re (marius) MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16893
|
#
83a90bff |
|
21-Aug-2018 |
Alan Cox <alc@FreeBSD.org> |
Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter became unused in FreeBSD 12.x as a side-effect of the NUMA-related changes.) Reviewed by: kib, markj Discussed with: jeff, re@ Differential Revision: https://reviews.freebsd.org/D16825
|
#
e45b89d2 |
|
01-Aug-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Add pmap_is_valid_memattr(9). Discussed with: alc Sponsored by: The FreeBSD Foundation, Mellanox Technologies MFC after: 1 week Differential revision: https://reviews.freebsd.org/D15583
|
#
6c85795a |
|
27-Jul-2018 |
Mark Johnston <markj@FreeBSD.org> |
Fix handling of KVA in kmem_bootstrap_free(). Do not use vm_map_remove() to release KVA back to the system. Because kernel map entries do not have an associated VM object, with r336030 the vm_map_remove() call will not update the kernel page tables. Avoid relying on the vm_map layer and instead update the pmap and release KVA to the kernel arena directly in kmem_bootstrap_free(). Because the pmap updates will generally result in superpage demotions, modify pmap_init() to insert PTPs shadowed by superpage mappings into the kernel pmap's radix tree. While here, port r329171 to i386. Reported by: alc Reviewed by: alc, kib X-MFC with: r336505 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16426
|
#
db016164 |
|
20-Jul-2018 |
Alan Cox <alc@FreeBSD.org> |
Annotate a parameter as unused. X-MFC with: r336288
|
#
697be9a3 |
|
15-Jul-2018 |
Mark Johnston <markj@FreeBSD.org> |
Restore the check for the page size extension after r332489. Without this, the support for transparent superpage promotion on i386 was left disabled. Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D16279
|
#
8c087371 |
|
14-Jul-2018 |
Alan Cox <alc@FreeBSD.org> |
Add support for pmap_enter(..., psind=1) to the i386 pmap. In other words, add support for explicitly requesting that pmap_enter() create a 2 or 4 MB page mapping. (Essentially, this feature allows the machine-independent layer to create superpage mappings preemptively, and not wait for automatic promotion to occur.) Export pmap_ps_enabled() to the machine-independent layer. Add a flag to pmap_pv_insert_pde() that specifies whether it should fail or reclaim a PV entry when one is not available. Refactor pmap_enter_pde() into two functions, one by the same name, that is a general-purpose function for creating PDE PG_PS mappings, and another, pmap_enter_4mpage(), that is used to prefault 2 or 4 MB read- and/or execute-only mappings for execve(2), mmap(2), and shmat(2). Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D16246
|
#
76999163 |
|
10-Jul-2018 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary differences between i386's pmap_enter() and amd64's. For example, fully construct the new PTE before entering the critical section. This change is a stepping stone to psind == 1 support on i386. Reviewed by: kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D16188
|
#
717d5c0b |
|
08-Jul-2018 |
Alan Cox <alc@FreeBSD.org> |
Invalidate the mapping before updating its physical address. Doing so ensures that all threads sharing the pmap have a consistent view of the mapping. This fixes the problem described in the commit log messages for r329254 without the overhead of an extra fault in the common case. Once other pmap_enter() implementations are similarly modified, the workaround added in r329254 can be removed, reducing the overhead of CoW faults. See also r335784 for amd64. The i386 implementation of pmap_enter() already reused the PV entry from the old mapping. Reviewed by: kib, markj Tested by: pho MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D16133
|
#
945a6b31 |
|
05-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Extend r335969 to superpages. It is possible that a fictitious unmanaged userspace mapping of a superpage is created on x86, e.g. by pmap_object_init_pt(), with the physical address outside the vm_page_array[] coverage. Noted and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16085
|
#
a0ef97f6 |
|
05-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert r335999 to re-commit with the correct error message.
|
#
c59dfa63 |
|
05-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
In x86 pmap_extract_and_hold(), there is no need to recalculate the physical address, which is readily available after a successful vm_page_pa_tryrelock(). Noted and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16085
|
#
81dac871 |
|
05-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
In x86 pmap_extract_and_hold(), there is no need to recalculate the physical address, which is readily available after a successful vm_page_pa_tryrelock(). Noted and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16085
|
#
84a15fe7 |
|
04-Jul-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
In x86 pmap_extract_and_hold()s, handle the case of PHYS_TO_VM_PAGE() returning NULL. vm_fault_quick_hold_pages() can be legitimately called on userspace mappings backed by fictitious pages created by the unmanaged device and sg pagers. Note that other architectures' pmap_extract_and_hold() might need a similar fix, but I postponed the examination. Reported by: bde Discussed with: alc Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D16085
|
#
d05d616c |
|
30-May-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Use pmap_pte_ufast() instead of pmap_pte() in pmap_extract(), pmap_is_prefaultable() and pmap_incore(), pushing the number of shootdown IPIs back down to that of the 3/1 kernel. Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
c5981f69 |
|
30-May-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Extract code for fast mapping of pte from pmap_extract_and_hold() into the helper function pmap_pte_ufast(). Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
7883d57a |
|
30-May-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Restore pmap_copy() for the 4/4 i386 pmap. Create yet another temporary pte mapping routine, pmap_pte_quick3(), which is a copy of pmap_pte_quick() and relies on the pvh_global_lock to protect the frame. It accounts into the same counters as pmap_pte_quick(). It is needed since pmap_copy() uses pmap_pte_quick() already, and since a user pmap is no longer the current pmap. pmap_copy() still provides an advantage for real-world workloads involving a lot of forks where processes do not exec immediately. Benchmarked by: bde Sponsored by: The FreeBSD Foundation
|
#
d94bd372 |
|
30-May-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Do use pmap_pte_quick() in pmap_enter_quick_locked(). Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
1095694a |
|
30-May-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Avoid unnecessary TLB shootdowns in pmap_unwire_ptp() for user pmaps, which no longer create recursive page table mappings. Benchmarked by: bde Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
ded29bd9 |
|
25-May-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Optimize i386 pmap_extract_and_hold(). In particular, stop using pmap_pte() to read a non-promoted pte while walking the page table. pmap_pte() needs to shoot down the kernel mapping globally, which causes an IPI broadcast. Since pmap_extract_and_hold() is used for the slow copyin(9), this is a very significant hit for 4/4 kernels. Instead, create a single-purpose per-processor page frame and use it to locally map the page table page inside a critical section, to avoid reuse of the frame by another thread if context-switched. Measurements demonstrated very significant improvements in any load that utilizes copyin/copyout. Found and benchmarked by: bde Sponsored by: The FreeBSD Foundation
|
#
507e50d5 |
|
12-May-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Initialize tramp_idleptd during cold pmap startup, before the exception code is copied to the trampoline. The correct value is then copied to the trampoline automatically, so tramp_idleptd_reloced can be eliminated. This will allow using the same exception entry code to handle traps from vm86 BIOS calls at the early boot stage, the same as after the trampoline is configured. Sponsored by: The FreeBSD Foundation
|
#
919015a4 |
|
18-Apr-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix pmap_trm_alloc(M_ZERO). Sponsored by: The FreeBSD Foundation
|
#
d86c1f0d |
|
13-Apr-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
i386 4/4G split. The change makes the user and kernel address spaces on i386 independent, giving each almost the full 4G of usable virtual addresses, except for one PDE at the top used for the trampoline and per-CPU trampoline stacks, and system structures that must always be mapped, namely the IDT, GDT, common TSS and LDT, and process-private TSS and LDT if allocated. By using a 1:1 mapping for the kernel text and data, it appeared possible to eliminate the assembler part of locore.S which bootstraps the initial page table and KPTmap. The code is rewritten in C and moved into pmap_cold(). The comment in vmparam.h explains the KVA layout. There is no PCID mechanism available in protected mode, so each kernel/user switch forth and back completely flushes the TLB, except for the trampoline PTD region. The TLB invalidations for userspace become trivial, because IPI handlers switch page tables. On the other hand, context switches no longer need to reload %cr3. copyout(9) was rewritten to use vm_fault_quick_hold(). An issue for the new copyout(9) is compatibility with wiring user buffers around sysctl handlers. This explains the two kinds of locks for copyout ptes and the accounting of the vslock() calls. The vm_fault_quick_hold() path, AKA the slow path, is only tried after the 'fast path' fails; the fast path temporarily changes the mapping to the userspace and copies the data to/from a small per-cpu buffer in the trampoline. If a page fault occurs during the copy, it is short-circuited by exception.s to not even reach C code. The change was motivated by the need to implement the Meltdown mitigation, but instead of KPTI the full split is done. The i386 architecture already shows the sizing problems; in particular, it is impossible to link clang and lld with debugging. I expect that the issues due to the virtual address space limits would only be exacerbated, and the split gives more liveness to the platform.
Tested by: pho Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D14633
|
#
8c8ee2ee |
|
04-Mar-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Unify bulk free operations in several pmaps. Submitted by: Yoshihiro Ota Reviewed by: markj MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D13485
|
#
2c0f13aa |
|
20-Feb-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
vm_wait() rework. Make vm_wait() take the vm_object argument which specifies the domain set to wait on for the min condition to pass. If there is no object associated with the wait, use curthread's policy domainset. The mechanics of the wait in vm_wait() and vm_wait_domain() are supplied by the new helper vm_wait_doms(), which directly takes the bitmask of the domains to wait on for the min condition to pass. Eliminate pagedaemon_wait(); vm_domain_clear() handles the same operations. Eliminate the VM_WAIT and VM_WAITPFAULT macros, the direct function calls are enough. Eliminate several control state variables from vm_domain, unneeded after the vm_wait() conversion. Sketched and reviewed by: jeff Tested by: pho Sponsored by: The FreeBSD Foundation, Mellanox Technologies Differential revision: https://reviews.freebsd.org/D14384
|
#
5bd01497 |
|
14-Feb-2018 |
Conrad Meyer <cem@FreeBSD.org> |
x86 pmap: Make memory mapped via pmap_qenter() non-executable. The idea is, the pmap_qenter() API is now defined to not produce executable mappings. If you need executable mappings, use another API. Add the pg_nx flag in pmap_qenter on x86 to make kernel pages non-executable. Other architectures that support execute-specific permissions on page table entries should subsequently be updated to match. Submitted by: Darrick Lew <darrick.freebsd AT gmail.com> Reviewed by: markj Discussed with: alc, jhb, kib Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D14062
|
#
e958ad4c |
|
12-Feb-2018 |
Jeff Roberson <jeff@FreeBSD.org> |
Make v_wire_count a per-cpu counter(9) counter. This eliminates a significant source of cache line contention from vm_page_alloc(). Use accessors and vm_page_unwire_noq() so that the mechanism can be easily changed in the future. Reviewed by: markj Discussed with: kib, glebius Tested by: pho (earlier version) Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D14273
|
#
ab7c09f1 |
|
08-Feb-2018 |
Mark Johnston <markj@FreeBSD.org> |
Use vm_page_unwire_noq() instead of directly modifying page wire counts. No functional change intended. Reviewed by: alc, kib (previous revision) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D14266
|
#
c8f9c1f3 |
|
27-Jan-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Use PCID to optimize PTI. Use PCID to avoid a complete TLB shootdown when switching between user and kernel mode with PTI enabled. I use a model close to what I read about KAISER: the user-mode PCID has a 1:1 correspondence to the kernel-mode PCID, by setting bit 11 in the PCID. A full kernel-mode TLB shootdown is performed on context switches, since KVA TLB invalidation only works in the current pmap. The user-mode part of the TLB is flushed on pmap activations as well. Similarly, IPI TLB shootdowns must handle both kernel and user address spaces for each address. Note that machines which implement PCID but do not have the INVPCID instruction cause the usual complications in the IPI handlers, due to the need to switch to the target PCID temporarily. This is racy, but because we disable interrupts in pmap_activate_sw() for the PCID/no-INVPCID case, an IPI handler cannot see an inconsistent state of the CPU PCID vs the PCPU pmap/kcr3/ucr3 pointers. On the other hand, on kernel/user switches, the CR3_PCID_SAVE bit is set and we do not clear the TLB. I can imagine an alternative use of PCID, where there is only one PCID allocated for the kernel pmap. Then, there is no need to shoot down kernel TLB entries on context switch. But copyout(3) would need to either use a method similar to proc_rwmem() to access the userspace data, or (in reverse) provide a temporary mapping for the kernel buffer into the user mode PCID and use a trampoline for the copy. Reviewed by: markj (previous version) Tested by: pho Discussed with: alc (some aspects) Sponsored by: The FreeBSD Foundation MFC after: 3 weeks Differential revision: https://reviews.freebsd.org/D13985
|
#
bd50262f |
|
17-Jan-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
PTI for amd64. The implementation of Kernel Page Table Isolation (KPTI) for amd64, first version. It provides a workaround for the 'meltdown' vulnerability. PTI is turned off by default for now; enable it with the loader tunable vm.pmap.pti=1. The pmap page table is split into a kernel-mode table and a user-mode table. The kernel-mode table is identical to the non-PTI table, while the user-mode table is obtained from the kernel table by leaving userspace mappings intact, but only leaving the following parts of the kernel mapped: kernel text (but not modules text), PCPU, GDT/IDT/user LDT/task structures, and IST stacks for the NMI and doublefault handlers. The kernel switches to the user page table before returning to usermode, and restores the full kernel page table on entry. The initial kernel-mode stack for the PTI trampoline is allocated in the PCPU; it is only 16 qwords. The kernel entry trampoline switches page tables, then the hardware trap frame is copied to the normal kstack, and execution continues. IST stacks are kept mapped and no trampoline is needed for NMI/doublefault, but of course a page table switch is performed. On return to usermode, the trampoline is used again: the iret frame is copied to the trampoline stack, page tables are switched, and iretq is executed. The case of iretq faulting due to an invalid usermode context is tricky, since the frame for the fault is appended to the trampoline frame. Besides copying the fault frame and the original (corrupted) frame to the kstack, the fault frame must be patched to make it look as if the fault occurred on the kstack; see the comment in the doret_iret detection code in trap(). Currently, kernel pages which are mapped during trampoline operation are identical for all pmaps. They are registered using pmap_pti_add_kva(). Besides the initial registrations done during boot, LDT and non-common TSS segments are registered if the user requested their use. In principle, they can be installed into the kernel page table per pmap with some work. 
Similarly, the PCPU can be hidden from the userspace mapping using a trampoline PCPU page, but again I do not see much benefit besides complexity. PDPE pages for the kernel half of the user page tables are pre-allocated during boot because we need to know the pml4 entries which are copied to the top-level paging structure page, in advance, on a new pmap creation. I enforce this to avoid iterating over all existing pmaps if a new PDPE page is needed for PTI kernel mappings. The iteration is a known problematic operation on i386. The need to flush hidden kernel translations on the switch to user mode makes global tables (PG_G) meaningless and even harmful, so PG_G use is disabled for the PTI case. Our existing use of PCID is incompatible with PTI and is automatically disabled if PTI is enabled. PCID can be forced on only for the developer's benefit. MCE is known to be broken: it requires an IST stack to operate completely correctly even for the non-PTI case, and absolutely needs a dedicated IST stack because MCE delivery while the trampoline has not yet switched from the PTI stack is fatal. The fix is pending. Reviewed by: markj (partially) Tested by: pho (previous version) Discussed with: jeff, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
ab3185d1 |
|
12-Jan-2018 |
Jeff Roberson <jeff@FreeBSD.org> |
Implement NUMA support in uma(9) and malloc(9). Allocations from specific domains can be done by the _domain() API variants. UMA also supports a first-touch policy via the NUMA zone flag. The slab layer is now segregated by VM domains and is precise. It handles iteration for round-robin directly. The per-cpu cache layer remains a mix of domains according to where memory is allocated and freed. Well behaved clients can achieve perfect locality with no performance penalty. The direct domain allocation functions have to visit the slab layer and so require per-zone locks which come at some expense. Reviewed by: Attilio (a slightly older version) Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon
|
#
d5d56007 |
|
18-Dec-2017 |
Bruce Evans <bde@FreeBSD.org> |
Also forgotten in the previous commit that removed the permanent double mapping of low physical memory: update the comment about leaving the permanent mapping in place. This also improves the wording of the comment. PTD 0 is still left alone because it is fairly important that it was unmapped earlier, and the comment now describes the unmapping of the other low PTDs that the code actually does. Reviewed by: kib
|
#
3f21dc29 |
|
18-Dec-2017 |
Bruce Evans <bde@FreeBSD.org> |
Fix the undersupported option KERNLOAD, part 1: fix crashes in locore when KERNLOAD is not a multiple of NBPDR (not the default) and PSE is enabled (the default if the CPU supports it). Addresses in PDEs must be a multiple of NBPDR in the PSE case, but were not so in the crashing case. KERNLOAD defaults to NBPDR. NBPDR is 4 MB for !PAE and 2 MB for PAE. The default can be changed by editing i386/include/vmparam.h or using makeoptions. It can be changed to less than NBPDR to save real and virtual memory at a small cost in time, or to more than NBPDR to waste real and virtual memory. It must be larger than 1 MB and a multiple of PAGE_SIZE. When it is less than NBPDR, it is necessarily not a multiple of NBPDR. This case has much larger bugs which will be fixed in part 2. The fix is to only use PSE for physical addresses above <KERNLOAD rounded _up_ to an NBPDR boundary>. When the rounding is non-null, this leaves part of the kernel not using large pages. Rounding down would avoid this pessimization, but would break setting of PAT bits on i/o pages if it goes below 1MB. Since rounding down always goes below 1MB when KERNLOAD < NBPDR and the KERNLOAD > NBPDR case is not useful, never round down. Fix related style bugs (e.g., wrong literal values for NBPDR in comments). Reviewed by: kib
|
#
fb3cc1c3 |
|
07-Dec-2017 |
Bruce Evans <bde@FreeBSD.org> |
Move instantiation of msgbufp from 9 MD files to subr_prf.c. This variable should be pure MI except possibly for reading it in MD dump routines. Its initialization was pure MD in 4.4BSD, but FreeBSD changed this in r36441 in 1998. There were many imperfections in r36441. This commit fixes only a small one, to simplify fixing the others 1 arch at a time. (r47678 added support for special/early/multiple message buffer initialization which I want in a more general form, but this was too fragile to use because hacking on the msgbufp global corrupted it, and was only used for 5 hours in -current...)
|
#
df57947f |
|
18-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
spdx: initial adoption of licensing ID tags. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over the FreeBSD tree was useful as a starting point. Initially, only tag files that use the BSD 4-Clause "Original" license. RelNotes: yes Differential Revision: https://reviews.freebsd.org/D13133
|
#
4e421792 |
|
16-Nov-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove i386 XBOX support. It is for a console introduced in 2001 and featuring a Pentium III processor. Even if any of them are still alive and running FreeBSD, we have no sign of life from their users. Meanwhile, removing another few dozen #ifdefs from the i386 sources reduces the aversion to looking at the code and improves the platform's vitality. Reviewed by: cem, pfg, rink (XBOX support author) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D13016
|
#
5fca1d90 |
|
23-Oct-2017 |
Mark Johnston <markj@FreeBSD.org> |
Fix the VM_NRESERVLEVEL == 0 build. Add VM_NRESERVLEVEL guards in the pmaps that implement transparent superpage promotion using reservations. Reviewed by: alc, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D12764
|
#
2375aaa8 |
|
31-Jul-2017 |
Mark Johnston <markj@FreeBSD.org> |
Batch updates to v_wire_count when freeing page table pages on x86. The removed release stores are not needed since stores are totally ordered on i386 and amd64. Reviewed by: alc, kib (previous revision) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11790
|
#
510cdf22 |
|
01-Jul-2017 |
Alan Cox <alc@FreeBSD.org> |
When "force" is specified to pmap_invalidate_cache_range(), the given start address is not required to be page aligned. However, the loop within pmap_invalidate_cache_range() that performs the actual cache line invalidations requires that the starting address be truncated to a multiple of the cache line size. This change corrects an error in that truncation. Submitted by: Brett Gutstein <bgutstein@rice.edu> Reviewed by: kib MFC after: 1 week
|
#
67d955aa |
|
08-Apr-2017 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Corrected misspelled versions of rendezvous. The MFC will include a compat definition of smp_no_rendevous_barrier() that calls smp_no_rendezvous_barrier(). Reviewed by: gnn, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D10313
|
#
03149668 |
|
26-Feb-2017 |
Alan Cox <alc@FreeBSD.org> |
Refine the fix from r312954. Specifically, add a new PDE-only flag, PG_PROMOTED, that indicates whether lingering 4KB page mappings might need to be flushed on a PDE change that restricts or destroys a 2MB page mapping. This flag allows the pmap to avoid range invalidations that are both unnecessary and costly. Reviewed by: kib, markj MFC after: 6 weeks Differential Revision: https://reviews.freebsd.org/D9665
|
#
dab48644 |
|
18-Feb-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
MFamd64 r313933: microoptimize pmap_protect_pde(). Noted by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
57f6622f |
|
02-Feb-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
For i386, remove config options CPU_DISABLE_CMPXCHG, CPU_DISABLE_SSE and device npx. This means that the FPU is always initialized and handled when available, and the SSE+ register file and exceptions are handled when available. This makes the kernel FPU code much easier to maintain at the cost of slight bloat for CPUs older than 25 years. CPU_DISABLE_CMPXCHG outlived its usefulness, see the removed comment explaining the original purpose. Suggested by and discussed with: bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
#
a0f64f38 |
|
29-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not leave stale 4K TLB entries on pde (superpage) removal or protection change. On superpage promotion, x86 pmaps do not invalidate existing 4K entries for the superpage range, because they are compatible with the promoted 2/4M entry. But the invalidation on superpage removal or protection change only did a single INVLPG with the base address of the superpage. This reliably flushed the superpage TLB entry and the 4K entry for the first page of the superpage, potentially leaving other 4K TLB entries lingering. Do the invalidation of the whole superpage range to correct the problem. Note that the precise invalidation is done by x86 code for kernel_pmap only; for user pmaps the whole (per-AS) TLB is flushed. This made the bug well hidden, because promotions of the kernel mappings require a specific workload. Reported and tested by: Jonathan Looney <jtl@netflix.com> (previous version) Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
d9864508 |
|
29-Jan-2017 |
Jason A. Harmening <jah@FreeBSD.org> |
Implement get_pcpu() for i386 and use it to replace pcpu_find(curcpu) in the i386 pmap. The curcpu macro loads the per-cpu data pointer as its first step, so the remaining steps of pcpu_find(curcpu) are circular. get_pcpu() is already implemented for arm, arm64, and risc-v. My plan is to implement it for the remaining architectures and use it to replace several instances of pcpu_find(curcpu) in MI code. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D9370
|
#
5611aaa1 |
|
20-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Use SFENCE for ordering CLFLUSHOPT. SDM states that CLFLUSHOPT instructions can be ordered with other writes by SFENCE, heavier MFENCE is not required. Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
86785b54 |
|
14-Jan-2017 |
Jason A. Harmening <jah@FreeBSD.org> |
Add comment explaining relative order of sched_unpin() and mtx_unlock(). Suggested by: alc MFC after: 1 week
|
#
28699efd |
|
14-Jan-2017 |
Jason A. Harmening <jah@FreeBSD.org> |
For i386 temporary mappings, unpin the thread before releasing the cmap lock. Releasing the lock first may result in the thread being immediately rescheduled and bound to the same CPU, only to unpin itself upon resuming execution. Noted by: skra (in review for armv6 equivalent) MFC after: 1 week
|
#
bd7abab0 |
|
10-Jan-2017 |
Mark Johnston <markj@FreeBSD.org> |
Coalesce TLB shootdowns of global PTEs in pmap_advise() on x86. We would previously invalidate such entries individually, resulting in more IPIs than necessary. Reviewed by: alc, kib MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D9094
|
#
43aabbef |
|
23-Dec-2016 |
Jason A. Harmening <jah@FreeBSD.org> |
Move the objects used to create temporary mappings for i386 pmap zero and copy operations to the MD PCPU region. Change sysmap initialization to only allocate KVA pages for CPUs that are actually present. As a minor optimization, this also prevents false sharing between adjacent sysmap objects since the pcpu struct is already cacheline-aligned. While here, move pc_qmap_addr initialization for the BSP into pmap_bootstrap(), which allows use of pmap_quick* functions during early boot. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D8833
|
#
e94965d8 |
|
07-Dec-2016 |
Alan Cox <alc@FreeBSD.org> |
Previously, vm_radix_remove() would panic if the radix trie didn't contain a vm_page_t at the specified index. However, with this change, vm_radix_remove() no longer panics. Instead, it returns NULL if there is no vm_page_t at the specified index. Otherwise, it returns the vm_page_t. The motivation for this change is that it simplifies the use of radix tries in the amd64, arm64, and i386 pmap implementations. Instead of performing a lookup before every remove, the pmap can simply perform the remove. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D8708
|
#
d3e4d71f |
|
28-Oct-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Handle pmap_enter() over an existing 4/2M page in KVA on i386. The userspace case was already handled by pmap_allocpte(). For kernel VA, page table page must exist, and demote cannot fail, so we need to just call pmap_demote_pde(). Also note that due to the machine AS layout, promotions in the KVA on i386 are highly unlikely, so this change is mostly for completeness. Reviewed by: alc, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D8323
|
#
8cb0c102 |
|
10-Sep-2016 |
Alan Cox <alc@FreeBSD.org> |
Various changes to pmap_ts_referenced() Move PMAP_TS_REFERENCED_MAX out of the various pmap implementations and into vm/pmap.h, and describe what its purpose is. Eliminate the archaic "XXX" comment about its value. I don't believe that its exact value, e.g., 5 versus 6, matters. Update the arm64 and riscv pmap implementations of pmap_ts_referenced() to opportunistically update the page's dirty field. On amd64, use the PDE value already cached in a local variable rather than dereferencing a pointer again and again. Reviewed by: kib, markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D7836
|
#
dbbaf04f |
|
03-Sep-2016 |
Mark Johnston <markj@FreeBSD.org> |
Remove support for idle page zeroing. Idle page zeroing has been disabled by default on all architectures since r170816 and has some bugs that make it seemingly unusable. Specifically, the idle-priority pagezero thread exacerbates contention for the free page lock, and yields the CPU without releasing it in non-preemptive kernels. The pagezero thread also does not behave correctly when superpage reservations are enabled: its target is a function of v_free_count, which includes reserved-but-free pages, but it is only able to zero pages belonging to the physical memory allocator. Reviewed by: alc, imp, kib Differential Revision: https://reviews.freebsd.org/D7714
|
#
53aadae6 |
|
01-Sep-2016 |
Alan Cox <alc@FreeBSD.org> |
As an optimization to the machine-independent layer, change the machine- dependent pmap_ts_referenced() so that it updates the page's dirty field if a modified bit is found while counting reference bits. This opportunistic update can be performed at low cost and can eliminate the need for some future calls to pmap_is_modified() by the machine- independent layer. Reviewed by: kib, markj MFC after: 3 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7722
|
#
ef209971 |
|
29-Aug-2016 |
Bruce Evans <bde@FreeBSD.org> |
Shorten banal comments about zeroing and copying pages. Don't give implementation details that last echoed the code 15-20 years ago. But add a detail about pagezero() on i386. Switch from Mach style to BSD style.
|
#
fdb6320d |
|
13-Jul-2016 |
Eric Badger <badger@FreeBSD.org> |
Add explicit detection of KVM hypervisor Set vm_guest to a new enum value (VM_GUEST_KVM) when kvm is detected and use vm_guest in conditionals testing for KVM. Also, fix a conditional checking if we're running in a VM which caught only the generic VM case, but not more specific VMs (KVM, VMWare, etc.). (Spotted by: vangyzen). Differential revision: https://reviews.freebsd.org/D7172 Sponsored by: Dell Inc. Approved by: kib (mentor), vangyzen (mentor) Reviewed by: alc MFC after: 4 weeks
|
#
a3269b08 |
|
14-Apr-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
x86: for pointers replace 0 with NULL. These are mostly cosmetical, no functional change. Found with devel/coccinelle.
|
#
10386b56 |
|
06-Dec-2015 |
Conrad Meyer <cem@FreeBSD.org> |
pmap_invalidate_range: For very large ranges, flush the whole TLB Typical TLBs have 40-512 entries available. At some point, iterating every single page in a requested invalidation range and issuing invlpg on it is more expensive than flushing the TLB and allowing it to reload on demand. Broadwell CPUs have 1536 L2 TLB entries, so I've picked the arbitrary number 4096 entries as a heuristic at which point we flush the TLB rather than invalidating every single potential page. Reviewed by: alc Feedback from: jhb, kib MFC notes: Depends on r291688 Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4280
|
#
1ac62762 |
|
05-Dec-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
In the pmap_set_pg() function, which enables the global bit on the ptes mapping the kernel on CPUs where global TLB entries are supported, revert to flushing only non-global entries, i.e. to the pre-r291688 state. There is no need to flush global TLB entries, since only global entries created during the previous iterations of the loop could exist at this moment. Submitted by: alc Differential revision: https://reviews.freebsd.org/D4368
|
#
27691a24 |
|
03-Dec-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
For amd64 non-PCID machines, and for i386 machines with support for the PG_G global pte flag, pmap_invalidate_all() fails to flush global TLB entries [*]. This is because the TLB shootdown handler for such configs reloads CR3, and on i386 pmap_invalidate_all() does the same for the initiating CPU. Note that current code does not issue total invalidation requests for the kernel_pmap. Rename the amd64 function invltlb_globpcid() to invltlb_glob(); it has not been PCID-specific for quite some time. Implement the same functionality for i386. Use the function instead of invltlb() in shootdown handlers and in i386 pmap_invalidate_all(), but only for the kernel pmap (which maps pages with the PG_G attribute set), which takes care of PG_G TLB entries on flush. To detect the affected pmap in the i386 TLB shootdown handler, the pmap should be passed to the smp_masked_invltlb() function, which makes the amd64 and i386 TLB shootdown code almost identical. Merge the code under x86/. Noted by: jhb [*] Reviewed by: cem, jhb, pho Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D4346
|
#
8e9ef12d |
|
27-Oct-2015 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Build fix for i386/XBOX and pc98/GENERIC. Reviewed by: kib
|
#
af95bbf5 |
|
24-Oct-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Intel SDM before revision 56 described the CLFLUSH instruction as only ordered with the MFENCE instruction. Similar weak guarantees are also specified by the AMD APM vol. 3 rev. 3.22. x86 pmap methods pmap_invalidate_cache_range() and pmap_invalidate_cache_pages() braced the CLFLUSH loop with MFENCE both before and after the loop. In revision 56 of the SDM, Intel stated that all existing implementations of CLFLUSH are strict; CLFLUSH instruction execution is ordered WRT other CLFLUSHes and writes. Also, the strict behaviour is made architectural. A new instruction CLFLUSHOPT (which was documented for some time in the Instruction Set Extensions Programming Reference) provides the weak behaviour which was previously attributed to CLFLUSH. Use CLFLUSHOPT when available. When CLFLUSH is used on Intel CPUs, do not execute MFENCE before and after the flushing loop. Reviewed by: alc Sponsored by: The FreeBSD Foundation
|
#
9f86aba6 |
|
26-Sep-2015 |
Alan Cox <alc@FreeBSD.org> |
Exploit r288122 to address a cosmetic issue. Since PV chunk pages don't belong to a vm object, they can't be paged out. Since they can't be paged out, they are never enqueued in a paging queue. Nonetheless, passing PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages are being enqueued in the inactive queue. As of r288122, we can avoid this false impression by passing PQ_NONE. Submitted by: kmacy (an earlier version) Differential Revision: https://reviews.freebsd.org/D1674
|
#
7ef5e8bc |
|
12-Aug-2015 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Better support memory mapped console devices, such as VGA and EFI frame buffers and memory mapped UARTs. 1. Delay calling cninit() until after pmap_bootstrap(). This makes sure we have PMAP initialized enough to add translations. Keep kdb_init() after cninit() so that we have console when we need to break into the debugger on boot. 2. Unfortunately, the ATPIC code had to be moved as well so as to avoid a spurious trap #30. The reason for which is not known at this time. 3. In pmap_mapdev_attr(), when we need to map a device prior to the VM system being initialized, use virtual_avail as the KVA to map the device at. In particular, avoid using the direct map on amd64 because we can't demote by virtue of not being able to allocate yet. Keep track of the translation. Re-use the translation after the VM has been initialized to not waste KVA and to satisfy the assumption in uart(4) that the handle returned for the low-level console is the same as later returned when the device is probed and attached. 4. In pmap_unmapdev() remove the mapping from the table when called pre-init. Otherwise keep the mapping. During bus probe and attach device resources are mapped and unmapped multiple times, which would have us destroy the mapping used by the low-level console. 5. In pmap_init(), set pmap_initialized to signal that we're not pre-init anymore. On amd64, bring the direct map in sync with the translations created at that time. 6. Implement bus_space_map() and bus_space_unmap() for real: when the tag corresponds to memory space, call the corresponding pmap_mapdev() and pmap_unmapdev() functions to construct and actual handle. 7. In efifb.c and vt_vga.c, remove the crutches and hacks and simply call pmap_mapdev_attr() or bus_space_map() as desired. Notes: 1. uart(4) already used bus_space_map() during low-level console setup but since serial ports have traditionally been I/O port based, the lack of a proper implementation for said function was not a problem. 
It has always supported memory mapped UARTs for low-level consoles by setting hw.uart.console accordingly. 2. The use of the direct map on amd64 without setting caching attributes has been a bigger problem than previously thought. This change has the fortunate (and unexpected) side-effect of fixing various EFI frame buffer problems (though not all). PR: 191564, 194952 Special thanks to: 1. XipLink, Inc -- generously donated an Intel Bay Trail E3800 based eval board (ADLE3800PC). 2. The FreeBSD Foundation, in particular emaste@ -- for UEFI support in general and testing. 3. Everyone who tested the proposed patch for PR 191564. 4. jhb@ and kib@ for being a soundboard and applying a clue bat if so needed.
|
#
c8fbdcc1 |
|
05-Aug-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix UP build after r286296, ensure that CPU_FOREACH() is defined. Sponsored by: The FreeBSD Foundation
|
#
713841af |
|
04-Aug-2015 |
Jason A. Harmening <jah@FreeBSD.org> |
Add two new pmap functions: vm_offset_t pmap_quick_enter_page(vm_page_t m) void pmap_quick_remove_page(vm_offset_t kva) These will create and destroy a temporary, CPU-local KVA mapping of a specified page. Guarantees: --Will not sleep and will not fail. --Safe to call under a non-sleepable lock or from an ithread Restrictions: --Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms --Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page. --MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm. NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing. Reviewed by: kib Approved by: kib (mentor) Differential Revision: http://reviews.freebsd.org/D3013
|
#
79855a57 |
|
29-Jul-2015 |
Sean Bruno <sbruno@FreeBSD.org> |
Remove dead functions pmap_pvdump and pads. Differential Revision: D3206 Submitted by: kevin.bowling@kev009.com Reviewed by: alc
|
#
65a9768f |
|
09-Jun-2015 |
Alan Cox <alc@FreeBSD.org> |
Account for superpage mappings that are created by pmap_copy().
|
#
1c8e7232 |
|
18-Apr-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove lazy pmap switch code from i386. Naive benchmark with md(4) shows no difference with the code removed. On both amd64 and i386, assert that a released pmap is not active. Proposed and reviewed by: alc Discussed with: Svatopluk Kraus <onwahe@gmail.com>, peter Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
34c15db9 |
|
13-Apr-2015 |
Konstantin Belousov <kib@FreeBSD.org> |
Add config option PAE_TABLES for the i386 kernel. It switches pmap to use PAE format for the page tables, but does not incur other consequences of the full PAE config. In particular, vm_paddr_t and bus_addr_t are left 32bit, and max supported memory is still limited by 4GB. The option allows nx permissions for memory mappings on the i386 kernel, while keeping the usual i386 KBI and avoiding the kernel data sizing problems typical for the PAE config. Intel documented that the PAE format for page tables is available starting with the Pentium Pro, but it is possible that the plain Pentium CPUs have the required support (Appendix H). The goal is to enable the option and non-exec mappings on i386 for the GENERIC kernel. Anybody wanting a useful system on a 486 has to reconfigure the modern i386 kernel anyway. Discussed with: alc, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
f2c2231e |
|
31-Mar-2015 |
Ryan Stone <rstone@FreeBSD.org> |
Fix integer truncation bug in malloc(9) A couple of internal functions used by malloc(9) and uma truncated a size_t down to an int. This could cause any number of issues (e.g. indefinite sleeps, memory corruption) if any kernel subsystem tried to allocate 2GB or more through malloc. zfs would attempt such an allocation when run on a system with 2TB or more of RAM. Note to self: When this is MFCed, sparc64 needs the same fix. Differential revision: https://reviews.freebsd.org/D2106 Reviewed by: kib Reported by: Michael Fuckner <michael@fuckner.net> Tested by: Michael Fuckner <michael@fuckner.net> MFC after: 2 weeks
|
#
271f0f12 |
|
15-Nov-2014 |
Alan Cox <alc@FreeBSD.org> |
Enable the use of VM_PHYSSEG_SPARSE on amd64 and i386, making it the default on i386 PAE. Previously, VM_PHYSSEG_SPARSE could not be used on amd64 and i386 because vm_page_startup() would not create vm_page structures for the kernel page table pages allocated during pmap_bootstrap() but those vm_page structures are needed when the kernel attempts to promote the corresponding kernel virtual addresses to superpage mappings. To address this problem, a new public function, vm_phys_add_seg(), is introduced and vm_phys_init() is updated to reflect the creation of vm_phys_seg structures by calls to vm_phys_add_seg(). Discussed with: Svatopluk Kraus MFC after: 3 weeks Sponsored by: EMC / Isilon Storage Division
|
#
d6e53ebe |
|
26-Oct-2014 |
Alan Cox <alc@FreeBSD.org> |
By the time that pmap_init() runs, vm_phys_segs[] has been initialized. Obtaining the end of memory address from vm_phys_segs[] is a little easier than obtaining it from phys_avail[]. Discussed with: Svatopluk Kraus
|
#
07a92f34 |
|
08-Oct-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Add an argument to the x86 pmap_invalidate_cache_range() to request forced invalidation of the cache range regardless of the presence of self-snoop feature. Some recent Intel GPUs in some modes are not coherent, and dirty lines in CPU cache must be flushed before the pages are transferred to GPU domain. Reviewed by: alc (previous version) Tested by: pho (amd64) Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
490356e5 |
|
17-Sep-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Presence of any VM_PROT bits in the permission argument on x86 implies that the entry is readable and valid. Reported by: markj Submitted by: alc Tested by: pho (previous version), markj MFC after: 3 days
|
#
5e1d15a8 |
|
16-Aug-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Complete r254667, do not destroy pmap lock if KVA allocation failed. Submitted by: Svatopluk Kraus <onwahe@gmail.com> MFC after: 1 week
|
#
827a661d |
|
11-Aug-2014 |
Alan Cox <alc@FreeBSD.org> |
Change {_,}pmap_allocpte() so that they look for the flag PMAP_ENTER_NOSLEEP instead of M_NOWAIT/M_WAITOK when deciding whether to sleep on page table page allocation. (The same functions in the i386/xen and mips pmap implementations already use PMAP_ENTER_NOSLEEP.) X-MFC with: r269728 Sponsored by: EMC / Isilon Storage Division
|
#
39ffa8c1 |
|
08-Aug-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Change pmap_enter(9) interface to take flags parameter and superpage mapping size (currently unused). The flags includes the fault access bits, wired flag as PMAP_ENTER_WIRED, and a new flag PMAP_ENTER_NOSLEEP to indicate that pmap should not sleep. For powerpc aim both 32 and 64 bit, fix implementation to ensure that the requested mapping is created when PMAP_ENTER_NOSLEEP is not specified, in particular, wait for the available memory required to proceed. In collaboration with: alc Tested by: nwhitehorn (ppc aim32 and booke) Sponsored by: The FreeBSD Foundation and EMC / Isilon Storage Division MFC after: 2 weeks
|
#
a695d9b2 |
|
03-Aug-2014 |
Alan Cox <alc@FreeBSD.org> |
Retire pmap_change_wiring(). We have never used it to wire virtual pages. We continue to use pmap_enter() for that. For unwiring virtual pages, we now use pmap_unwire(), which unwires a range of virtual addresses instead of a single virtual page. Sponsored by: EMC / Isilon Storage Division
|
#
1c42633e |
|
24-Jul-2014 |
Marius Strobl <marius@FreeBSD.org> |
- Copying and zeroing pages via temporary mappings involves updating the corresponding page tables followed by accesses to the pages in question. This sequence is subject to the situation exactly described in the "AMD64 Architecture Programmer's Manual Volume 2: System Programming" rev. 3.23, "7.3.1 Special Coherency Considerations" [1, p. 171 f.]. Therefore, issuing the INVLPG right after modifying the PTE bits is crucial. For pmap_copy_page(), this has been broken in r124956 and later on carried over to pmap_copy_pages() derived from the former, while all other places in the i386 PMAP code use the correct order of instructions in this regard. Fixing the latter breakage solves the problem of data corruption seen with unmapped I/O enabled when running at least bare metal on AMD R-268D APUs. However, this might also fix similar corruption reported for virtualized environments. - In pmap_copy_pages(), correctly set the cache bits on the source page being copied. This change is thought to be a NOP for the real world, though. [2] 1: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/24593_APM_v21.pdf Submitted by: kib [2] Reviewed by: alc, kib MFC after: 3 days Sponsored by: Bally Wulff Games & Entertainment GmbH
|
#
09132ba6 |
|
06-Jul-2014 |
Alan Cox <alc@FreeBSD.org> |
Introduce pmap_unwire(). It will replace pmap_change_wiring(). There are several reasons for this change: pmap_change_wiring() has never (in my memory) been used to set the wired attribute on a virtual page. We have always used pmap_enter() to do that. Moreover, it is not really safe to use pmap_change_wiring() to set the wired attribute on a virtual page. The description of pmap_change_wiring() says that it assumes the existence of a mapping in the pmap. However, non-wired mappings may be reclaimed by the pmap at any time. (See pmap_collect().) Many implementations of pmap_change_wiring() will crash if the mapping does not exist. pmap_unwire() accepts a range of virtual addresses, whereas pmap_change_wiring() acts upon a single virtual page. Since we are typically unwiring a range of virtual addresses, pmap_unwire() will be more efficient. Moreover, pmap_unwire() allows us to unwire superpage mappings. Previously, we were forced to demote the superpage mapping, because pmap_change_wiring() only allowed us to express the unwiring of a single base page mapping at a time. This added to the overhead of unwiring for large ranges of addresses, including the implicit unwiring that occurs at process termination. Implementations for arm and powerpc will follow. Discussed with: jeff, marcel Reviewed by: kib Sponsored by: EMC / Isilon Storage Division
|
#
af3b2549 |
|
27-Jun-2014 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
#
37a107a4 |
|
27-Jun-2014 |
Glen Barber <gjb@FreeBSD.org> |
Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory
|
#
3da1cf1e |
|
27-Jun-2014 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be factored out into a common macro, since some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
#
3ae10f74 |
|
16-Jun-2014 |
Attilio Rao <attilio@FreeBSD.org> |
- Modify vm_page_unwire() and vm_page_enqueue() to directly accept the queue where to enqueue pages that are going to be unwired. - Add stronger checks to the enqueue/dequeue for the pagequeues when adding and removing pages to them. Of course, for unmanaged pages the queue parameter of vm_page_unwire() will be ignored, just as the active parameter today. This makes adding new pagequeues quicker. This change effectively modifies the KPI. __FreeBSD_version will be, however, bumped just when the full cache of free pages will be evicted. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho
|
#
dd05fa19 |
|
07-Jun-2014 |
Alan Cox <alc@FreeBSD.org> |
Add a page size field to struct vm_page. Increase the page size field when a partially populated reservation becomes fully populated, and decrease this field when a fully populated reservation becomes partially populated. Use this field to simplify the implementation of pmap_enter_object() on amd64, arm, and i386. On all architectures where we support superpages, the cost of creating a superpage mapping is roughly the same as creating a base page mapping. For example, both kinds of mappings entail the creation of a single PTE and PV entry. With this in mind, use the page size field to make the implementation of vm_map_pmap_enter(..., MAP_PREFAULT_PARTIAL) a little smarter. Previously, if MAP_PREFAULT_PARTIAL was specified to vm_map_pmap_enter(), that function would only map base pages. Now, it will create up to 96 base page or superpage mappings. Reviewed by: kib Sponsored by: EMC / Isilon Storage Division
|
#
44f1c916 |
|
22-Mar-2014 |
Bryan Drewery <bdrewery@FreeBSD.org> |
Rename global cnt to vm_cnt to avoid shadowing. To reduce the diff struct pcu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI. Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the the global cnt variable. Exp-run revealed no ports using it directly. No objection from: arch@ Sponsored by: EMC / Isilon Storage Division
|
#
f4385473 |
|
22-Feb-2014 |
Alan Cox <alc@FreeBSD.org> |
When the kernel is running in a virtual machine, it cannot rely upon the processor family to determine if the workaround for AMD Family 10h Erratum 383 should be enabled. To enable virtual machine migration among a heterogeneous collection of physical machines, the hypervisor may have been configured to report an older processor family with a reduced feature set. Effectively, the reported processor family and its features are like a "least common denominator" for the collection of machines. Therefore, when the kernel is running in a virtual machine, instead of relying upon the processor family, we now test for features that prove that the underlying processor is not affected by the erratum. (The features that we test for are unlikely to ever be emulated in software on an affected physical processor.) PR: 186061 Tested by: Simon Matter Discussed with: jhb, neel MFC after: 2 weeks
|
#
4f67a8c5 |
|
11-Feb-2014 |
John Baldwin <jhb@FreeBSD.org> |
Don't waste a page of KVA for the boot-time memory test on x86. For amd64, reuse the first page of the crashdumpmap as CMAP1/CADDR1. For i386, remove CMAP1/CADDR1 entirely and reuse CMAP3/CADDR3 for the memory test. Reviewed by: alc, peter MFC after: 2 weeks
|
#
e07ef9b0 |
|
23-Jan-2014 |
John Baldwin <jhb@FreeBSD.org> |
Move <machine/apicvar.h> to <x86/apicvar.h>.
|
#
deb179bb |
|
19-Sep-2013 |
Alan Cox <alc@FreeBSD.org> |
The pmap function pmap_clear_reference() is no longer used. Remove it. pmap_clear_reference() has had exactly one caller in the kernel for several years, more precisely, since FreeBSD 8. Now, that call no longer exists. Approved by: re (kib) Sponsored by: EMC / Isilon Storage Division
|
#
06646d66 |
|
16-Sep-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Merge the change r255607 from amd64 to i386. Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (gjb)
|
#
87ee6303 |
|
11-Sep-2013 |
Alan Cox <alc@FreeBSD.org> |
Prior to r254304, we only began scanning the active page queue when the amount of free memory was close to the point at which we would begin reclaiming pages. Now, we continuously scan the active page queue, regardless of the amount of free memory. Consequently, we are continuously calling pmap_ts_referenced() on active pages. Prior to this change, pmap_ts_referenced() would always demote superpage mappings in order to obtain finer-grained reference information. This made sense because we were coming under memory pressure and would soon have to begin reclaiming pages. Now, however, with continuous scanning of the active page queue, these demotions are taking a toll on performance. To address this problem, I have replaced the demotion with a heuristic for periodically clearing the reference flag on superpage mappings. Approved by: re (kib) Sponsored by: EMC / Isilon Storage Division
|
#
51321f7c |
|
29-Aug-2013 |
Alan Cox <alc@FreeBSD.org> |
Significantly reduce the cost, i.e., run time, of calls to madvise(..., MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new pmap function, pmap_advise(), that operates on a range of virtual addresses within the specified pmap, allowing for a more efficient implementation of MADV_DONTNEED and MADV_FREE. Previously, the implementation of MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as pmap_clear_reference(). Intuitively, the problem with this implementation is that the pmap-level locks are acquired and released and the page table traversed repeatedly, once for each resident page in the range that was specified to madvise(2). A more subtle flaw with the previous implementation is that pmap_clear_reference() would clear the reference bit on all mappings to the specified page, not just the mapping in the range specified to madvise(2). Since our malloc(3) makes heavy use of madvise(2), this change can have a measurable impact. For example, the system time for completing a parallel "buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%. Note: This change only contains pmap_advise() implementations for a subset of our supported architectures. I will commit implementations for the remaining architectures after further testing. For now, a stub function is sufficient because of the advisory nature of pmap_advise(). Discussed with: jeff, jhb, kib Tested by: pho (i386), marcel (ia64) Sponsored by: EMC / Isilon Storage Division
|
#
e68c64f0 |
|
22-Aug-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Revert r254501. Instead, reuse the type stability of the struct pmap which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE zone. Initialize the pmap lock in the vmspace zone init function, and remove pmap lock initialization and destruction from pmap_pinit() and pmap_release(). Suggested and reviewed by: alc (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation
|
#
c325e866 |
|
10-Aug-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Different consumers of the struct vm_page abuse the pageq member to keep additional information when the page is guaranteed not to belong to a paging queue. Usually, this results in a lot of type casts which make reasoning about the code's correctness harder. Sometimes m->object is used instead of pageq, which could cause real and confusing bugs if a non-NULL m->object is leaked. See r141955 and r253140 for examples. Change the pageq member into a union containing explicitly-typed members. Use them instead of type-punning or abusing m->object in x86 pmaps, uma and vm_page_alloc_contig(). Requested and reviewed by: alc Sponsored by: The FreeBSD Foundation
|
#
e946b949 |
|
09-Aug-2013 |
Attilio Rao <attilio@FreeBSD.org> |
On all architectures, avoid preallocating the physical memory for nodes used in vm_radix. On architectures supporting direct mapping, also avoid pre-allocating the KVA for such nodes. In order to do so, make the operations derived from vm_radix_insert() able to fail, and handle all the resulting failures. vm_radix-wise, introduce a new function called vm_radix_replace(), which can replace an already-present leaf node with a new one, and take into account the possibility, during vm_radix_insert() allocation, that the operations on the radix trie can recurse. This means that if operations in vm_radix_insert() recursed, vm_radix_insert() will start from scratch again. Sponsored by: EMC / Isilon storage division Reviewed by: alc (older version) Reviewed by: jeff Tested by: pho, scottl
|
#
c7aebda8 |
|
09-Aug-2013 |
Attilio Rao <attilio@FreeBSD.org> |
The soft and hard busy mechanisms rely on the vm object lock to work. Unify the two concepts into a real, minimal sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard-path for this new lock, and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in both read and write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly share busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavily changed, which is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
|
#
5df87b21 |
|
07-Aug-2013 |
Jeff Roberson <jeff@FreeBSD.org> |
Replace kernel virtual address space allocation with vmem. This provides transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division
|
#
2c0b86b4 |
|
04-Aug-2013 |
Jeff Roberson <jeff@FreeBSD.org> |
- Introduce a specific function, pmap_remove_kernel_pde, for removing huge pages in the kernel's address space. This works around several asserts from pmap_demote_pde_locked that did not apply and gave false warnings. Discovered by: pho Reviewed by: alc Sponsored by: EMC / Isilon Storage Division
|
#
30dac21d |
|
10-Jul-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Explicitly panic instead of possibly doing undefined things when ptelist KVA is exhausted. Currently this cannot happen; the added panic serves as an assertion. Discussed with: alc Sponsored by: The FreeBSD Foundation
|
#
3fb25770 |
|
10-Jul-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
MFamd64 r253140: Clear m->object for the page taken from the delayed free list in pmap_pv_reclaim(). Noted by: alc
|
#
17a27377 |
|
13-Jun-2013 |
Jeff Roberson <jeff@FreeBSD.org> |
- Add a BIT_FFS() macro and use it to replace cpusetffs_obj() Discussed with: attilio Sponsored by: EMC / Isilon Storage Division
|
#
9af6d512 |
|
21-May-2013 |
Attilio Rao <attilio@FreeBSD.org> |
o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() work most of the time with only read locks. Sponsored by: EMC / Isilon storage division Reviewed by: alc
|
#
ee75e7de |
|
19-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminates the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads. The unmapped buffer should be explicitly requested by the GB_UNMAPPED flag by the consumer. For an unmapped buffer, no KVA reservation is performed at all. The consumer might request an unmapped buffer which does have a KVA reserve, to manually map it without recursing into the buffer cache and blocking, with the GB_KVAALLOC flag. When a mapped buffer is requested and an unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation. An unmapped buffer is translated into an unmapped bio in g_vfs_strategy(). An unmapped bio carries a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitly specify a readiness to accept unmapped bios, otherwise the g_down geom thread performs a transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap. The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it can be manually tuned by the kern.bio_transient_maxcnt tunable, in units of transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests. Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests ignore the GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.
In the rework, filesystem metadata is no longer subject to the maxbufspace limit. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, are accounted against maxbufspace, as before. Effectively, this means that the maxbufspace limit is enforced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not work, because buffer_map fragmentation does not allow the limit to be reached. At Jeff Roberson's request, the getnewbuf() function was split into smaller single-purpose functions. Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks
|
#
774d251d |
|
17-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Sync back vmcontention branch into HEAD: Replace the per-object resident and cached pages splay tree with a path-compressed multi-digit radix trie. Along with this, switch also the x86-specific handling of idle page tables to using the radix trie. This change is supposed to do the following: - Allow the acquisition of read locking for lookup operations on the resident/cached page collections, as the per-vm_page_t splay iterators are now removed. - Increase the scalability of the operations on the page collections. The radix trie does rely on the consumers' locking to ensure atomicity of its operations. In order to avoid deadlocks the bisection nodes are pre-allocated in the UMA zone. This can be done safely because the algorithm needs at most one new node per insert, which means the maximum number of the desired nodes is the number of available physical frames themselves. However, a new bisection node is not always really needed. The radix trie implements path-compression because UFS indirect blocks can lead to several objects with a very sparse trie, increasing the number of levels that must usually be scanned. It also helps node pre-fetching by introducing the single-node-per-insert property. This code is not generalized (yet) because of the possible loss of performance by having much of the sizes in play configurable. However, efforts to make this code more general and then reusable in further different consumers might well be made. The only KPI change is the removal of the function vm_page_splay(). The only KBI change, instead, is the removal of the left/right iterators from struct vm_page. Further technical notes, broken into smaller pieces, can be retrieved from the svn branch: http://svn.freebsd.org/base/user/attilio/vmcontention/ Sponsored by: EMC / Isilon storage division In collaboration with: alc, jeff Tested by: flo, pho, jhb, davide Tested by: ian (arm) Tested by: andreast (powerpc)
|
#
e8a4a618 |
|
14-Mar-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Add pmap function pmap_copy_pages(), which copies the content of the pages around, taking arrays of vm_page_t for both source and destination. Starting offsets and total transfer size are specified. The function implements an optimal algorithm for copying using the platform-specific optimizations. For instance, on the architectures where the direct map is available, no transient mappings are created; for i386, the per-cpu ephemeral page frame is used. The code was typically borrowed from pmap_copy_page() for the same architecture. Only the i386/amd64, powerpc aim and arm/arm-v6 implementations were tested at the time of commit. High-level code, not committed yet to the tree, ensures that the use of the function is only allowed after explicit enablement. For sparc64, the existing code has known issues and a stub is added instead, to allow the kernel linking. Sponsored by: The FreeBSD Foundation Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6) MFC after: 2 weeks
|
#
9f585991 |
|
10-Mar-2013 |
Alan Cox <alc@FreeBSD.org> |
The kernel pmap is statically allocated, so there is really no need to explicitly initialize its pm_root field to zero. Sponsored by: EMC / Isilon Storage Division
|
#
89f6b863 |
|
08-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Switch the vm_object mutex to be a rwlock. This will enable further future optimizations where the vm_object lock will be held in read mode most of the time the page-cache-resident pool of pages is accessed for reading purposes. The change is mostly mechanical but a few notes are worth reporting: * The KPI changes as follows: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (which formerly forced consumers to include sys/mutex.h directly to cater to its inline functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must also include sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between the FreeBSD and solaris versions must be avoided. For this purpose, zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (off-hand, I can think of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
b38d37f7 |
|
02-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Merge from vmc-playground branch: Rename the pv_entry_t iterator from pv_list to pv_next. Besides being technically more correct (the old name suggests a list while this is an iterator), it will also be needed by the vm_radix work to avoid a name clash on macro expansions. Sponsored by: EMC / Isilon storage division Reviewed by: alc, jeff Tested by: flo, pho, jhb, davide
|
#
dc1558d1 |
|
27-Feb-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Merge from vmobj-rwlock: VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho
|
#
00a54dfb |
|
15-Feb-2013 |
Jung-uk Kim <jkim@FreeBSD.org> |
Consistently use round_page(x) rather than roundup(x, PAGE_SIZE). There is no functional change.
|
#
b5821c6f |
|
18-Jan-2013 |
John Baldwin <jhb@FreeBSD.org> |
Fix build with SMP disabled. Reported by: bf
|
#
f876ffea |
|
17-Jan-2013 |
John Baldwin <jhb@FreeBSD.org> |
Don't attempt to use clflush on the local APIC register window. Various CPUs exhibit bad behavior if this is done (Intel Errata AAJ3, hangs on Pentium-M, and trashing of the local APIC registers on a VIA C7). The local APIC is implicitly mapped UC already via MTRRs, so the clflush isn't necessary anyway. MFC after: 2 weeks
|
#
cfedf924 |
|
03-Nov-2012 |
Attilio Rao <attilio@FreeBSD.org> |
Rework the known rwlocks so that each stays on its own cache line, using struct rwlock_padalign instead of manual frobbing. Reviewed by: alc, jimharris
|
#
9065aa64 |
|
24-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Add missed sched_pin(). Submitted by: Svatopluk Kraus <onwahe@gmail.com> Reviewed by: alc MFC after: 3 days
|
#
4e445832 |
|
08-Oct-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Add several asserts to i386 pmap, which mostly state that pv entry shall have corresponding pte. Reviewed by: alc Tested by: pho MFC after: 3 days
|
#
e1de0706 |
|
08-Oct-2012 |
Alan Cox <alc@FreeBSD.org> |
In a few places, like the implementation of ptrace(), a thread may call upon pmap_enter() to create a mapping within a different address space, i.e., not the thread's own address space. On i386, this entails the creation of a temporary mapping to the affected page table page (PTP). In general, pmap_enter() will read from this PTP, allocate a PV entry, and write to this PTP. The trouble comes when the system is short of memory. In order to allocate a new PV entry, an older PV entry has to be reclaimed. Reclaiming a PV entry involves destroying a mapping, which requires access to the affected PTP. Thus, the PTP mapped at the beginning of pmap_enter() is no longer mapped at the end of pmap_enter(), which leads to pmap_enter() modifying the wrong PTP. To address this problem, pmap_pv_reclaim() is changed to use an alternate method of mapping PTPs. Update a related comment. Reported by: pho Diagnosed by: kib MFC after: 5 days
|
#
e4b8a2fc |
|
27-Sep-2012 |
Alan Cox <alc@FreeBSD.org> |
Eliminate a stale comment. It describes another use case for the pmap in Mach that doesn't exist in FreeBSD.
|
#
7336315b |
|
10-Sep-2012 |
Alan Cox <alc@FreeBSD.org> |
Simplify pmap_unmapdev(). Since kmem_free() eventually calls pmap_remove(), pmap_unmapdev()'s own direct efforts to destroy the page table entries are redundant, so eliminate them. Don't set PTE_W on the page table entry in pmap_kenter{,_attr}() on MIPS. Setting PTE_W on MIPS is inconsistent with the implementation of this function on other architectures. Moreover, PTE_W should not be set, unless the pmap's wired mapping count is incremented, which pmap_kenter{,_attr}() doesn't do. MFC after: 10 days
|
#
d8f9ed32 |
|
05-Sep-2012 |
Alan Cox <alc@FreeBSD.org> |
Rename {_,}pmap_unwire_pte_hold() to {_,}pmap_unwire_ptp() and update the comment describing them. Both the function names and the comment had grown stale. Quite some time has passed since these pmap implementations last used the page's hold count to track the number of valid mapping within a page table page. Also, returning TRUE from pmap_unwire_ptp() rather than _pmap_unwire_ptp() eliminates a few instructions from callers like pmap_enter_quick_locked() where pmap_unwire_ptp()'s return value is used directly by a conditional statement.
|
#
e93d0cbe |
|
26-Jul-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
MFamd64 r238623: Introduce curpcb magic variable. Requested and reviewed by: bde MFC after: 3 weeks
|
#
e30df26e |
|
26-Jun-2012 |
Alan Cox <alc@FreeBSD.org> |
Add new pmap layer locks to the predefined lock order. Change the names of a few existing VM locks to follow a consistent naming scheme.
|
#
23c0d041 |
|
03-Jun-2012 |
Alan Cox <alc@FreeBSD.org> |
Various small changes to PV entry management: Constify pc_freemask[]. pmap_pv_reclaim(): Eliminate "freemask" because it was a pessimization. Add a comment about the resident count adjustment. free_pv_entry(): [i386 only] Merge an optimization from amd64 (r233954). get_pv_entry(): Eliminate the move to tail of the pv_chunk on the global pv_chunks list. (The right strategy needs more thought. Moreover, there were unintended differences between the amd64 and i386 implementations.) pmap_remove_pages(): Eliminate unnecessary ()'s.
|
#
0d6f49d8 |
|
02-Jun-2012 |
Alan Cox <alc@FreeBSD.org> |
Isolate the global pv list lock from data and other locks to prevent false sharing within the cache.
|
#
d85fbe8a |
|
31-May-2012 |
Alan Cox <alc@FreeBSD.org> |
Eliminate code duplication in free_pv_entry() and pmap_remove_pages() by introducing free_pv_chunk().
|
#
a2efa424 |
|
29-May-2012 |
Alan Cox <alc@FreeBSD.org> |
Eliminate some purely stylistic differences among the amd64, i386 native, and i386 xen PV entry allocators.
|
#
4edfd622 |
|
28-May-2012 |
Alan Cox <alc@FreeBSD.org> |
Update a comment in get_pv_entry() to reflect the changes to the synchronization of pv_vafree in r236158.
|
#
8b0f4e0a |
|
27-May-2012 |
Alan Cox <alc@FreeBSD.org> |
Replace all uses of the vm page queues lock by a r/w lock that is private to this pmap.c. This new r/w lock is used primarily to synchronize access to the PV lists. However, it will be used in a somewhat unconventional way. As finer-grained PV list locking is added to each of the pmap functions that acquire this r/w lock, its acquisition will be changed from write to read, enabling concurrent execution of the pmap functions with finer-grained locking. X-MFC after: r236045
|
#
33853281 |
|
26-May-2012 |
Alan Cox <alc@FreeBSD.org> |
Rename pmap_collect() to pmap_pv_reclaim() and rewrite it such that it no longer uses the active and inactive paging queues. Instead, the pmap now maintains an LRU-ordered list of pv entry pages, and pmap_pv_reclaim() uses this list to select pv entries for reclamation. Note: The old pmap_collect() tried to avoid reclaiming mappings for pages that have either a hold_count or a busy field that is non-zero. However, this isn't necessary for correctness, and the locking in pmap_collect() was insufficient to guarantee that such mappings weren't reclaimed. The new pmap_pv_reclaim() doesn't even try. MFC after: 5 weeks
|
#
4e656345 |
|
24-May-2012 |
Alan Cox <alc@FreeBSD.org> |
MF amd64 r233097, r233122 With the changes over the past year to how accesses to the page's dirty field are synchronized, there is no need for pmap_protect() to acquire the page queues lock unless it is going to access the pv lists or PMAP1/PADDR1. Style fix to pmap_protect().
|
#
5d4c773b |
|
24-Mar-2012 |
Alan Cox <alc@FreeBSD.org> |
Disable detailed PV entry accounting by default. Add a config option to enable it. MFC after: 1 week
|
#
fc6e32fb |
|
19-Mar-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
If we ever allow for managed fictitious pages, the pages shall be excluded from superpage promotions. At least one of the reasons is that pv_table is sized for non-fictitious pages only. Consistently check for the page to be non-fictitious before accessing the superpage pv list. Sponsored by: The FreeBSD Foundation Reviewed by: alc MFC after: 2 weeks
|
#
fe8b9971 |
|
28-Dec-2011 |
Alan Cox <alc@FreeBSD.org> |
Fix a bug in the Xen pmap's implementation of pmap_extract_and_hold(): If the page lock acquisition is retried, then the underlying thread is not unpinned. Wrap nearby lines that exceed 80 columns.
|
#
9800a50f |
|
27-Dec-2011 |
Alan Cox <alc@FreeBSD.org> |
Eliminate many of the unnecessary differences between the native and paravirtualized pmap implementations for i386. This includes some style fixes to the native pmap and several bug fixes that were not previously applied to the paravirtualized pmap. Tested by: sbruno MFC after: 3 weeks
|
#
894b2848 |
|
14-Dec-2011 |
Alan Cox <alc@FreeBSD.org> |
Create large page mappings in pmap_map(). MFC after: 6 weeks
|
#
6472ac3d |
|
07-Nov-2011 |
Ed Schouten <ed@FreeBSD.org> |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
#
703dec68 |
|
27-Oct-2011 |
Alan Cox <alc@FreeBSD.org> |
Eliminate vestiges of page coloring in VM_ALLOC_NOOBJ calls to vm_page_alloc(). While I'm here, for the sake of consistency, always specify the allocation class, such as VM_ALLOC_NORMAL, as the first of the flags.
|
#
3407fefe |
|
06-Sep-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into an atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags. Document the changes to the flags field to only require the page lock. Introduce the vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
|
#
d98d0ce2 |
|
09-Aug-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
- Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag to VPO_UNMANAGED (and also making the flag protected by the vm object lock, instead of the vm page queue lock). - Mark the fake pages with both PG_FICTITIOUS (as it is now) and VPO_UNMANAGED. As a consequence, pmap code now can use just VPO_UNMANAGED to decide whether the page is unmanaged. Reviewed by: alc Tested by: pho (x86, previous version), marius (sparc64), marcel (arm, ia64, powerpc), ray (mips) Sponsored by: The FreeBSD Foundation Approved by: re (bz)
|
#
80788b2a |
|
02-Jul-2011 |
Alan Cox <alc@FreeBSD.org> |
When iterating over a paging queue, explicitly check for PG_MARKER, instead of relying on zeroed memory being interpreted as an empty PV list. Reviewed by: kib
|
#
6bbee8e2 |
|
29-Jun-2011 |
Alan Cox <alc@FreeBSD.org> |
Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib
|
#
d16f8274 |
|
28-Jun-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Remove pc_cpumask usage from i386 and XEN. Tested by: pluknet
|
#
250a44f6 |
|
22-Jun-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Remove pc_other_cpus usage from i386 and XEN. Tested by: pluknet
|
#
d7eb69e1 |
|
24-May-2011 |
Attilio Rao <attilio@FreeBSD.org> |
- Fix a misusage of cpuset_t objects - Fix a typo Reported by: pluknet
|
#
d30e0db5 |
|
22-May-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Add a "safety belt" check for lsb setting. I don't think it is really necessary because the cpumask is known to be != 0, but it is just in case. Requested by: kib
|
#
b2b45cca |
|
20-May-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Reintroduce the lazypmap infrastructure and convert it to using cpuset_t. Requested by: alc
|
#
71a19bdc |
|
05-May-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for a number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t, on the other hand, is implemented as an array of longs, and is easily extensible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, need to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects differs between kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to the example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on 4 x 12-core Opteron systems. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-way machines on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno
|
#
97340772 |
|
30-Apr-2011 |
Attilio Rao <attilio@FreeBSD.org> |
Remove the support for lazy cr3 switching from i386. amd64 has already had this micro-optimization removed. Submitted by: kib
|
#
3136faa5 |
|
18-Apr-2011 |
Konstantin Belousov <kib@FreeBSD.org> |
Make pmap_invalidate_cache_range() available for consumption on amd64. Add a pmap_invalidate_cache_pages() method on x86. It flushes the CPU cache for the set of pages, which are not necessarily mapped. Since its supposed use is to prepare the move of the pages' ownership to a device that does not snoop all CPU accesses to the main memory (read: GPU in GMCH), do not rely on the CPU self-snoop feature. The amd64 implementation takes advantage of the direct map. On i386, extract the helper pmap_flush_page() from pmap_page_set_memattr(), and use it to make a temporary mapping of the flushed page. Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
#
4053b05b |
|
21-Jan-2011 |
Sergey Kandaurov <pluknet@FreeBSD.org> |
Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize. Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe
|
#
9d555e45 |
|
19-Dec-2010 |
Alan Cox <alc@FreeBSD.org> |
Redo some parts of r216333, specifically, the locking changes to pmap_extract_and_hold(), and undo the rest. In particular, I forgot that PG_PS and PG_PTE_PAT are the same bit.
|
#
a9b31c25 |
|
18-Dec-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
In pmap_extract(), unlock pmap lock earlier. The calculation does not need the lock when operating on local variables. Reviewed by: alc
|
#
d1cf854b |
|
09-Dec-2010 |
Alan Cox <alc@FreeBSD.org> |
When r207410 eliminated the acquisition and release of the page queues lock from pmap_extract_and_hold(), it didn't take into account that pmap_pte_quick() sometimes requires the page queues lock to be held. This change reimplements pmap_extract_and_hold() such that it no longer uses pmap_pte_quick(), and thus never requires the page queues lock. For consistency, adopt the same idiom as used by the new implementation of pmap_extract_and_hold() in pmap_extract() and pmap_mincore(). It also happens to make these functions shorter. Fix a style error in pmap_pte(). Reviewed by: kib@
|
#
d2d0fda8 |
|
23-Nov-2010 |
Jung-uk Kim <jkim@FreeBSD.org> |
Remove a stale tunable introduced in r215703.
|
#
7dd052c1 |
|
22-Nov-2010 |
Jung-uk Kim <jkim@FreeBSD.org> |
- Disable caches and flush caches/TLBs when we update PAT as we do for MTRR. Flushing TLBs is required to ensure cache coherency according to the AMD64 architecture manual. Flushing caches is only required when changing from a cacheable memory type (WB, WP, or WT) to an uncacheable type (WC, UC, or UC-). Since this function is only used once per processor during startup, there is no need to take any shortcuts. - Leave PAT indices 0-3 at the default of WB, WT, UC-, and UC. Program 5 as WP (from default WT) and 6 as WC (from default UC-). Leave 4 and 7 at the default of WB and UC. This is to avoid a transition from a cacheable memory type to an uncacheable type to minimize possible cache incoherency. Since we now flush caches and TLBs, this change may no longer be necessary, but we do not want to take any chances. - Remove Apple hardware specific quirks. With the above changes, it seems this hack is no longer needed. - Improve pmap_cache_bits() with an array to map PAT memory type to index. This array is initialized early from pmap_init_pat(), so that we do not need to handle special cases in the function any more. Now this function is identical on both amd64 and i386. Reviewed by: jhb Tested by: RM (reuf_m at hotmail dot com) Ryszard Czekaj (rychoo at freeshell dot net) army.of.root (army dot of dot root at googlemail dot com) MFC after: 3 days
|
#
228a2537 |
|
07-Nov-2010 |
Alan Cox <alc@FreeBSD.org> |
Eliminate a possible race between pmap_pinit() and pmap_kenter_pde() on superpage promotion or demotion. Micro-optimize pmap_kenter_pde(). Reviewed by: kib, jhb (an earlier version) MFC after: 1 week
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
fb4c8540 |
|
05-Oct-2010 |
Alan Cox <alc@FreeBSD.org> |
Initialize KPTmap in locore so that vm86.c can call vtophys() (or really pmap_kextract()) before pmap_bootstrap() is called. Document the set of pmap functions that may be called before pmap_bootstrap() is called. Tested by: bde@ Reviewed by: kib@ Discussed with: jhb@ MFC after: 6 weeks
|
#
e0e08e6a |
|
16-Aug-2010 |
Pietro Cerutti <gahr@FreeBSD.org> |
- The iMac9,1 needs the PAT workaround as well Approved by: cognet
|
#
4c967b61 |
|
10-Aug-2010 |
Attilio Rao <attilio@FreeBSD.org> |
Fix some places that may use cpumask_t while they still use 'int' types. While there, also fix some places assuming cpu type is 'int' while u_int is really meant. Note: this will also fix some possible races in per-cpu data accesses to be addressed in further commits. In collaboration with: Yahoo! Incorporated (via sbruno and peter) Tested by: gianni MFC after: 1 month

|
#
8155e5d5 |
|
10-Jul-2010 |
Alan Cox <alc@FreeBSD.org> |
Reduce the number of global TLB shootdowns generated by pmap_qenter(). Specifically, teach pmap_qenter() to recognize the case when it is being asked to replace a mapping with the very same mapping and not generate a shootdown. Unfortunately, the buffer cache commonly passes an entire buffer to pmap_qenter() when only a subset of the mappings are changing. For the extension of buffers in allocbuf() this was resulting in unnecessary shootdowns. The addition of new pages to the end of the buffer need not and did not trigger a shootdown, but overwriting the initial mappings with the very same mappings was seen as a change that necessitated a shootdown. With this change, that is no longer so. For a "buildworld" on amd64, this change eliminates 14-15% of the pmap_invalidate_range() shootdowns, and about 4% of the overall shootdowns. MFC after: 3 weeks
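The skip-identical-mapping optimization can be sketched in a few lines. This is a hedged sketch (the function name and the flat `pte[]`/`new_pa[]` arrays are illustrative stand-ins, not kernel code): when re-entering a range of mappings, a shootdown is only needed if at least one slot actually changed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch of the commit's idea: overwrite a span of
 * page-table slots with new values, and report whether anything
 * actually changed.  Re-entering the very same mapping returns 0,
 * so the caller can skip the TLB shootdown entirely.
 */
static int
qenter_needs_shootdown(uint64_t *pte, const uint64_t *new_pa, size_t n)
{
	int changed = 0;

	for (size_t i = 0; i < n; i++) {
		if (pte[i] != new_pa[i]) {
			pte[i] = new_pa[i];
			changed = 1;	/* a real change: invalidate later */
		}
	}
	return (changed);
}
```

This matches the buffer-cache case described above: extending a buffer passes mostly unchanged mappings, and only the genuinely new entries should trigger invalidation.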
|
#
2680dac9 |
|
09-Jul-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
For both i386 and amd64 pmap, - change the type of pm_active to cpumask_t, which it is; - in pmap_remove_pages(), compare with PCPU(curpmap), instead of dereferencing the long chain of pointers [1]. For amd64 pmap, remove the unneeded checks for validity of curpmap in pmap_activate(), since curpmap should be always valid after r209789. Submitted by: alc [1] Reviewed by: alc MFC after: 3 weeks
|
#
9124d0d6 |
|
11-Jun-2010 |
Alan Cox <alc@FreeBSD.org> |
Relax one of the new assertions in pmap_enter() a little. Specifically, allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)
|
#
ce186587 |
|
10-Jun-2010 |
Alan Cox <alc@FreeBSD.org> |
Reduce the scope of the page queues lock and the number of PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment. Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this. Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue. Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.) Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively. ARM: Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH(). Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases. PowerPC/AIM: moea*_page_exists_quick() and moea*_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated. Push down the page queues lock inside of moea*_clear_bit(), simplifying moea*_clear_modify() and moea*_clear_reference(). The last parameter to moea*_clear_bit() is never used. Eliminate it. PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control flow. Reviewed by: kib@
|
#
acb4c5ec |
|
07-Jun-2010 |
Alan Cox <alc@FreeBSD.org> |
MFC r208765 In the unlikely event that pmap_ts_referenced() demoted five superpage mappings to the same underlying physical page, the calling thread would be left forever pinned to the same processor. Approved by: re (kib)
|
#
966898be |
|
02-Jun-2010 |
Alan Cox <alc@FreeBSD.org> |
In the unlikely event that pmap_ts_referenced() demoted five superpage mappings to the same underlying physical page, the calling thread would be left forever pinned to the same processor. MFC after: 3 days
|
#
b2830a96 |
|
31-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Eliminate a stale comment.
|
#
72dc3eb6 |
|
30-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Simplify the inner loop of pmap_collect(): While iterating over the page's pv list, there is no point in checking whether or not the pv list is empty. Instead, wait until the loop completes.
|
#
8f0d5d3b |
|
29-May-2010 |
Alan Cox <alc@FreeBSD.org> |
When I pushed down the page queues lock into pmap_is_modified(), I created an ordering dependence: A pmap operation that clears PG_WRITEABLE and calls vm_page_dirty() must perform the call first. Otherwise, pmap_is_modified() could return FALSE without acquiring the page queues lock because the page is not (currently) writeable, and the caller to pmap_is_modified() might believe that the page's dirty field is clear because it has not seen the effect of the vm_page_dirty() call. When I pushed down the page queues lock into pmap_is_modified(), I overlooked one place where this ordering dependence is violated: pmap_enter(). In a rare situation pmap_enter() can be called to replace a dirty mapping to one page with a mapping to another page. (I say rare because replacements generally occur as a result of a copy-on-write fault, and so the old page is not dirty.) This change delays clearing PG_WRITEABLE until after vm_page_dirty() has been called. Fixing the ordering dependency also makes it easy to introduce a small optimization: When pmap_enter() used to replace a mapping to one page with a mapping to another page, it freed the pv entry for the first mapping and later called the pv entry allocator for the new mapping. Now, pmap_enter() attempts to recycle the old pv entry, saving two calls to the pv entry allocator. There is no point in setting PG_WRITEABLE on unmanaged pages, so don't. Update a comment to reflect this. Tidy up the variable declarations at the start of pmap_enter().
|
#
52d8ba37 |
|
28-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Defer freeing any page table pages in pmap_remove_all() until after the page queues lock is released. This may reduce the amount of time that the page queues lock is held by pmap_remove_all().
|
#
c46b90e9 |
|
26-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib
|
#
567e51e1 |
|
24-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Roughly half of a typical pmap_mincore() implementation is machine- independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore(). Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page. Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information. Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock. Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed(). Reviewed by: kib (an earlier version)
|
#
9ab6032f |
|
16-May-2010 |
Alan Cox <alc@FreeBSD.org> |
On entry to pmap_enter(), assert that the page is busy. While I'm here, make the style of assertion used by pmap_enter() consistent across all architectures. On entry to pmap_remove_write(), assert that the page is neither unmanaged nor fictitious, since we cannot remove write access to either kind of page. With the push down of the page queues lock, pmap_remove_write() cannot condition its behavior on the state of the PG_WRITEABLE flag if the page is busy. Assert that the object containing the page is locked. This allows us to know that the page will neither become busy nor will PG_WRITEABLE be set on it while pmap_remove_write() is running. Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly do copy-on-write-based zero-copy transmit on unmanaged or fictitious pages, so don't even try. Previously, the call to pmap_remove_write() would have failed silently.
|
#
3c4a2440 |
|
08-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Push down the page queues lock into vm_page_cache(), vm_page_try_to_cache(), and vm_page_try_to_free(). Consequently, push down the page queues lock into pmap_enter_quick(), pmap_page_wired_mappings(), pmap_remove_all(), and pmap_remove_write(). Push down the page queues lock into Xen's pmap_page_is_mapped(). (I overlooked the Xen pmap in r207702.) Switch to a per-processor counter for the total number of pages cached.
|
#
7024db1d |
|
06-May-2010 |
Alan Cox <alc@FreeBSD.org> |
Push down the page queues lock inside of vm_page_free_toq() and pmap_page_is_mapped() in preparation for removing page queues locking around calls to vm_page_free(). Setting aside the assertion that calls pmap_page_is_mapped(), vm_page_free_toq() now acquires and holds the page queues lock just long enough to actually add or remove the page from the paging queues. Update vm_page_unhold() to reflect the above change.
|
#
2965a453 |
|
29-Apr-2010 |
Kip Macy <kmacy@FreeBSD.org> |
On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib
|
#
0d2e1c3e |
|
25-Apr-2010 |
Alan Cox <alc@FreeBSD.org> |
Clearing a page table entry's accessed bit (PG_A) and setting the page's PG_REFERENCED flag in pmap_protect() can't really be justified. In contrast to pmap_remove() or pmap_remove_all(), the mapping is not being destroyed, so the notion that the page was accessed is not lost. Moreover, clearing the page table entry's accessed bit and setting the page's PG_REFERENCED flag can throw off the page daemon's activity count calculation. Finally, in my tests, I found that 15% of the atomic memory operations being performed by pmap_protect() were only to clear PG_A, and not change protection. This could, by itself, be fixed, but I don't see the point given the above argument. Remove a comment from pmap_protect_pde() that is no longer meaningful after the above change.
|
#
c5cc832f |
|
24-Apr-2010 |
Kip Macy <kmacy@FreeBSD.org> |
- fix style issues on i386 as well requested by: alc@
|
#
7b85f591 |
|
24-Apr-2010 |
Alan Cox <alc@FreeBSD.org> |
Resurrect pmap_is_referenced() and use it in mincore(). Essentially, pmap_ts_referenced() is not always appropriate for checking whether or not pages have been referenced because it clears any reference bits that it encounters. For example, in mincore(), clearing the reference bits has two negative consequences. First, it throws off the activity count calculations performed by the page daemon. Specifically, a page on which mincore() has called pmap_ts_referenced() looks less active to the page daemon than it should. Consequently, the page could be deactivated prematurely by the page daemon. Arguably, this problem could be fixed by having mincore() duplicate the activity count calculation on the page. However, there is a second problem for which that is not a solution. In order to clear a reference on a 4KB page, it may be necessary to demote a 2/4MB page mapping. Thus, a mincore() by one process can have the side effect of demoting a superpage mapping within another process!
|
#
02b5123e |
|
05-Apr-2010 |
Alan Cox <alc@FreeBSD.org> |
MFC r204907, r204913, r205402, r205573, r205573 Implement AMD's recommended workaround for Erratum 383 on Family 10h processors. Enable machine check exceptions by default.
|
#
c7014073 |
|
03-Apr-2010 |
Alan Cox <alc@FreeBSD.org> |
MFC r205652 A ptrace(2) by one process may trigger a page size promotion in the address space of another process. Modify pmap_promote_pde() to handle this.
|
#
dfeca187 |
|
30-Mar-2010 |
Marcel Moolenaar <marcel@FreeBSD.org> |
MFC rev 198341 and 198342: o Introduce vm_sync_icache() for making the I-cache coherent with the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache(), that translates the vm_map_t argument to pmap_t. o Introduce pmap_sync_icache() to all PMAP implementations. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys(). o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint. o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked).
|
#
3792de2e |
|
27-Mar-2010 |
Alan Cox <alc@FreeBSD.org> |
Correctly handle preemption of pmap_update_pde_invalidate(). X-MFC after: r205573
|
#
a57d0d8e |
|
27-Mar-2010 |
Alan Cox <alc@FreeBSD.org> |
Simplify pmap_growkernel(), making the i386 version more like the amd64 version. MFC after: 3 weeks
|
#
09fcdf11 |
|
25-Mar-2010 |
Alan Cox <alc@FreeBSD.org> |
A ptrace(2) by one process may trigger a promotion in the address space of another process. Modify pmap_promote_pde() to handle this. (This is not a problem on amd64 due to implementation differences.) Reported by: jh@ MFC after: 1 week
|
#
2dec7615 |
|
24-Mar-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r204957: Fall back to wbinvd when region for CLFLUSH is >= 2MB. MFC r205334 (by avg): Fix a typo in a comment.
|
#
e1990590 |
|
23-Mar-2010 |
Alan Cox <alc@FreeBSD.org> |
Adapt r204907 and r205402, the amd64 implementation of the workaround for AMD Family 10h Erratum 383, to i386. Enable machine check exceptions by default, just like r204913 for amd64. Enable superpage promotion only if the processor actually supports large pages, i.e., PG_PS. MFC after: 2 weeks
|
#
9344361b |
|
19-Mar-2010 |
Andriy Gapon <avg@FreeBSD.org> |
pmap amd64/i386: fix a typo in a comment MFC after: 3 days
|
#
55c4e016 |
|
11-Mar-2010 |
John Baldwin <jhb@FreeBSD.org> |
Fix the previous attempt to fix kernel builds of HEAD on 7.x. Use the __gnu_inline__ attribute for PMAP_INLINE when using the 7.x compiler to match what 7.x uses for PMAP_INLINE.
|
#
2a595a40 |
|
10-Mar-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Fall back to wbinvd when region for CLFLUSH is >= 2MB. Submitted by: Kevin Day <toasty dragondata com> Reviewed by: jhb MFC after: 2 weeks
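The threshold decision in this commit is simple enough to sketch. This is a hedged, illustrative sketch (`use_wbinvd` and `CLFLUSH_THRESHOLD` are made-up names, not kernel symbols): flushing a very large range line-by-line with clflush costs more than writing back the entire cache with wbinvd, so past a size cutoff the full flush wins.

```c
#include <stdint.h>

#define CLFLUSH_THRESHOLD (2u << 20)	/* 2MB cutoff, per the commit */

/*
 * Illustrative sketch: decide whether a cache-flush request over
 * [sva, eva) should fall back to a whole-cache wbinvd instead of a
 * per-cache-line clflush loop.
 */
static int
use_wbinvd(uintptr_t sva, uintptr_t eva)
{
	return (eva - sva >= CLFLUSH_THRESHOLD);
}
```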
|
#
c288186f |
|
02-Mar-2010 |
Alan Cox <alc@FreeBSD.org> |
MFC r204420 When running as a guest operating system, the FreeBSD kernel must assume that the virtual machine monitor has enabled machine check exceptions. Unfortunately, on AMD Family 10h processors the machine check hardware has a bug (Erratum 383) that can result in a false machine check exception when a superpage promotion occurs. Thus, I am disabling superpage promotion when the FreeBSD kernel is running as a guest operating system on an AMD Family 10h processor.
|
#
0b993ee5 |
|
27-Feb-2010 |
Alan Cox <alc@FreeBSD.org> |
When running as a guest operating system, the FreeBSD kernel must assume that the virtual machine monitor has enabled machine check exceptions. Unfortunately, on AMD Family 10h processors the machine check hardware has a bug (Erratum 383) that can result in a false machine check exception when a superpage promotion occurs. Thus, I am disabling superpage promotion when the FreeBSD kernel is running as a guest operating system on an AMD Family 10h processor. Reviewed by: jhb, kib MFC after: 3 days
|
#
454947a6 |
|
20-Feb-2010 |
Alan Cox <alc@FreeBSD.org> |
MFC r203085 Optimize pmap_demote_pde() by using the new KPTmap to access a kernel page table page instead of creating a temporary mapping to it. Set the PG_G bit on the page table entries that implement the KPTmap. Locore initializes the unused portions of the NKPT kernel page table pages that it allocates to zero. So, pmap_bootstrap() needn't zero the page table entries referenced by CMAP1 and CMAP3. Simplify pmap_set_pg().
|
#
ddc534916 |
|
18-Feb-2010 |
Ed Schouten <ed@FreeBSD.org> |
Allow the pmap code to be built with GCC from FreeBSD 7 again. This patch basically gives us the best of both worlds. Instead of forcing the compiler to emulate GNU-style inline semantics even though we're using ISO C99, it will only use GNU-style inlining when the compiler is configured that way (__GNUC_GNU_INLINE__). Tested by: jhb
|
#
bda39c37 |
|
01-Feb-2010 |
Alan Cox <alc@FreeBSD.org> |
Change the default value for the flag enabling superpage mapping and promotion to "on".
|
#
6d41eae0 |
|
29-Jan-2010 |
Alan Cox <alc@FreeBSD.org> |
MFC r202894 Handle a race between pmap_kextract() and pmap_promote_pde().
|
#
b3021b93 |
|
27-Jan-2010 |
Alan Cox <alc@FreeBSD.org> |
Optimize pmap_demote_pde() by using the new KPTmap to access a kernel page table page instead of creating a temporary mapping to it. Set the PG_G bit on the page table entries that implement the KPTmap. Locore initializes the unused portions of the NKPT kernel page table pages that it allocates to zero. So, pmap_bootstrap() needn't zero the page table entries referenced by CMAP1 and CMAP3. Simplify pmap_set_pg(). MFC after: 10 days
|
#
cf350851 |
|
23-Jan-2010 |
Alan Cox <alc@FreeBSD.org> |
Handle a race between pmap_kextract() and pmap_promote_pde(). This race is known to cause a kernel crash in ZFS on i386 when superpage promotion is enabled. Tested by: netchild MFC after: 1 week
|
#
91bfd816 |
|
19-Jan-2010 |
Ed Schouten <ed@FreeBSD.org> |
Recommit r193732: Remove __gnu89_inline. Now that we use C99 almost everywhere, just use C99-style in the pmap code. Since the pmap code is the only consumer of __gnu89_inline, remove it from cdefs.h as well. Because the flag was only introduced 17 months ago, I don't expect any problems. Reviewed by: alc It was backed out, because it prevented us from building kernels using a 7.x compiler. Now that most people use 8.x, there is nothing that holds us back. Even if people run 7.x, they should be able to build a kernel if they run `make kernel-toolchain' or `make buildworld' first.
|
#
294a68ca |
|
18-Jan-2010 |
Alan Cox <alc@FreeBSD.org> |
MFC r202085 Simplify pmap_init(). Additionally, correct a harmless misbehavior on i386.
|
#
ac24a8ea |
|
11-Jan-2010 |
Alan Cox <alc@FreeBSD.org> |
Simplify pmap_init(). Additionally, correct a harmless misbehavior on i386. Specifically, where locore had created large page mappings for the kernel, the wrong vm page array entries were being initialized. The vm page array entries for the pages containing the kernel were being initialized instead of the vm page array entries for page table pages. MFC after: 1 week
|
#
0f59b74f |
|
09-Jan-2010 |
Alan Cox <alc@FreeBSD.org> |
Long ago, in r120654, the rounding of KERNend and physfree in locore was changed from a small page boundary to a large page boundary. As a consequence pmap_kmem_choose() became a pointless waste of address space. Eliminate it.
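The alignment arithmetic behind this cleanup can be shown in one function. A hedged sketch (the name `round_4mpage` is illustrative; the kernel uses its own macros): once locore already rounds KERNend and physfree up to a large-page boundary, a later pmap_kmem_choose() step that re-rounds the same value is a no-op that only wastes address space.

```c
#include <stdint.h>

#define NBPDR (4u << 20)	/* 4MB large page on non-PAE i386 */

/* Round an address up to the next large-page (PDE) boundary. */
static uintptr_t
round_4mpage(uintptr_t va)
{
	return ((va + NBPDR - 1) & ~((uintptr_t)NBPDR - 1));
}
```

Rounding an already-aligned address is idempotent, which is why the second rounding step could be removed outright.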
|
#
28a5e2a5 |
|
07-Jan-2010 |
Alan Cox <alc@FreeBSD.org> |
Make pmap_set_pg() static.
|
#
3ed91d6a |
|
08-Dec-2009 |
Andriy Gapon <avg@FreeBSD.org> |
MFC r199184: reflect that pg_ps_enabled is a tunable
|
#
6cc16fcb |
|
11-Nov-2009 |
Andriy Gapon <avg@FreeBSD.org> |
reflect that pg_ps_enabled is a tunable, not just a read-only sysctl Nod from: jhb
|
#
dcf9f137 |
|
06-Nov-2009 |
Attilio Rao <attilio@FreeBSD.org> |
MFC r197070: Consolidate CPUID to CPU family/model macros for amd64 and i386 to reduce unnecessary #ifdef's for shared code between them. This MFC should unbreak the kernel build breakage introduced by r198977. Reported by: kib Pointy hat to: me
|
#
13529df1 |
|
31-Oct-2009 |
Alan Cox <alc@FreeBSD.org> |
MFC r197317 When superpages are enabled, add the 2 or 4MB page size to the array of supported page sizes.
|
#
1a4fcaeb |
|
21-Oct-2009 |
Marcel Moolenaar <marcel@FreeBSD.org> |
o Introduce vm_sync_icache() for making the I-cache coherent with the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache(), that translates the vm_map_t argument to pmap_t. o Introduce pmap_sync_icache() to all PMAP implementations. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys(). o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint. o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked). The key property of this change is that the I-cache is made coherent *after* writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent *before* any writes happen. The difference is key when the I-cache prefetches.
|
#
d6dbb0db |
|
18-Sep-2009 |
Alan Cox <alc@FreeBSD.org> |
When superpages are enabled, add the 2 or 4MB page size to the array of supported page sizes. Reviewed by: jhb MFC after: 3 weeks
|
#
3bcdfb9b |
|
10-Sep-2009 |
Jung-uk Kim <jkim@FreeBSD.org> |
Consolidate CPUID to CPU family/model macros for amd64 and i386 to reduce unnecessary #ifdef's for shared code between them.
|
#
bf202eb1 |
|
03-Sep-2009 |
John Baldwin <jhb@FreeBSD.org> |
MFC 196705 and 196707: - Improve pmap_change_attr() on i386 so that it is able to demote a large (2/4MB) page into 4KB pages as needed. This should be fairly rare in practice. - Simplify pmap_change_attr() a bit: - Always calculate the cache bits instead of doing it on-demand. - Always set changed to TRUE rather than only doing it if it is false. Approved by: re (kib)
|
#
c8e648e1 |
|
02-Sep-2009 |
Jung-uk Kim <jkim@FreeBSD.org> |
Fix confusing comments about default PAT entries.
|
#
c9e88179 |
|
02-Sep-2009 |
Jung-uk Kim <jkim@FreeBSD.org> |
- Work around ACPI mode transition problem for recent NVIDIA 9400M chipset based Intel Macs. Since r189055, these platforms started freezing when ACPI is being initialized for unknown reason. For these platforms, we just use the old PAT layout. Note this change is not enough to boot fully on these platforms because of other problems but it makes debugging possible. Note MacBook5,2 may be affected as well but it was not added here because of lack of hardware to test. - Initialize PAT MSR fully instead of reading and modifying it for safety. Reported by: rpaulo, hps, Eygene Ryabinkin (rea-fbsd at codelabs dot ru) Reviewed by: jhb
|
#
e2ba7437 |
|
01-Sep-2009 |
Robert Noland <rnoland@FreeBSD.org> |
MFC 196643 Swap the start/end virtual addresses in pmap_invalidate_cache_range(). This fixes the functionality on non SelfSnoop hardware. Found by: rnoland Submitted by: alc Reviewed by: kib Approved by: re (rwatson)
|
#
8101afb6 |
|
31-Aug-2009 |
John Baldwin <jhb@FreeBSD.org> |
Simplify pmap_change_attr() a bit: - Always calculate the cache bits instead of doing it on-demand. - Always set changed to TRUE rather than only doing it if it is false. Discussed with: alc MFC after: 3 days
|
#
75e66e42 |
|
31-Aug-2009 |
John Baldwin <jhb@FreeBSD.org> |
Improve pmap_change_attr() so that it is able to demote a large (2/4MB) page into 4KB pages as needed. This should be fairly rare in practice on i386. This includes merging the following changes from the amd64 pmap: 180430, 180485, 180845, 181043, 181077, and 196318. - Add basic support for changing attributes on PDEs to pmap_change_attr() similar to the support in the initial version of pmap_change_attr() on amd64 including inlines for pmap_pde_attr() and pmap_pte_attr(). - Extend pmap_demote_pde() to include the ability to instantiate a new page table page where none existed before. - Enhance pmap_change_attr(). Use pmap_demote_pde() to demote a 2/4MB page mapping to 4KB page mappings when the specified attribute change only applies to a portion of the 2/4MB page. Previously, in such cases, pmap_change_attr() gave up and returned an error. - Correct a critical accounting error in pmap_demote_pde(). Reviewed by: alc MFC after: 3 days
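The core arithmetic of a demotion can be sketched briefly. This is a hedged sketch, not pmap_demote_pde() itself (`demote_fill` and its flat array are illustrative): a 4MB PDE mapping becomes 1024 4KB PTEs covering the same physical range, each inheriting the large mapping's attribute bits. The real code additionally allocates a page table page and fixes up accounting, which is where the commit's critical accounting error was.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NPTEPG    1024u	/* PTEs per page table page, i386 non-PAE */

/*
 * Illustrative sketch of demotion arithmetic: fill a page table
 * page so its NPTEPG entries reproduce the 4MB superpage mapping
 * as individual 4KB mappings with the same attribute bits.
 */
static void
demote_fill(uint32_t *pte, uint32_t pa_base, uint32_t attr_bits)
{
	for (uint32_t i = 0; i < NPTEPG; i++)
		pte[i] = (pa_base + i * PAGE_SIZE) | attr_bits;
}
```

After such a fill, the attribute change described in the commit only needs to touch the 4KB PTEs inside the affected sub-range, leaving the rest of the former superpage untouched.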
|
#
cbc3c1f6 |
|
29-Aug-2009 |
Robert Noland <rnoland@FreeBSD.org> |
Swap the start/end virtual addresses in pmap_invalidate_cache_range(). This fixes the functionality on non SelfSnoop hardware. Found by: rnoland Submitted by: alc Reviewed by: kib MFC after: 3 days
|
#
8a5ac5d5 |
|
29-Jul-2009 |
Konstantin Belousov <kib@FreeBSD.org> |
As was done in r195820 for amd64, use clflush for flushing cache lines when memory page caching attributes changed, and CPU does not support self-snoop, but implemented clflush, for i386. Take care of possible mappings of the page by sf buffer by utilizing the mapping for clflush, otherwise map the page transiently. Amd64 used direct map. Proposed and reviewed by: alc Approved by: re (kensmith)
|
#
01381811 |
|
24-Jul-2009 |
John Baldwin <jhb@FreeBSD.org> |
Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver. Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
#
33131452 |
|
23-Jul-2009 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary cache and TLB flushes by pmap_change_attr(). (This optimization was implemented in the amd64 version roughly 1 year ago.) Approved by: re (kensmith)
|
#
9861cbc6 |
|
19-Jul-2009 |
Alan Cox <alc@FreeBSD.org> |
Change the handling of fictitious pages by pmap_page_set_memattr() on amd64 and i386. Essentially, fictitious pages provide a mechanism for creating aliases for either normal or device-backed pages. Therefore, pmap_page_set_memattr() on a fictitious page needn't update the direct map or flush the cache. Such actions are the responsibility of the "primary" instance of the page or the device driver that "owns" the physical address. For example, these actions are already performed by pmap_mapdev(). The device pager needn't restore the memory attributes on a fictitious page before releasing it. It's now pointless. Add pmap_page_set_memattr() to the Xen pmap. Approved by: re (kib)
|
#
13de7221 |
|
17-Jul-2009 |
Alan Cox <alc@FreeBSD.org> |
An addendum to r195649, "Add support to the virtual memory system for configuring machine-dependent memory attributes...": Don't set the memory attribute for a "real" page that is allocated to a device object in vm_page_alloc(). It is a pointless act, because the device pager replaces this "real" page with a "fake" page and sets the memory attribute on that "fake" page. Eliminate pointless code from pmap_cache_bits() on amd64. Employ the "Self Snoop" feature supported by some x86 processors to avoid cache flushes in the pmap. Approved by: re (kib)
|
#
3153e878 |
|
12-Jul-2009 |
Alan Cox <alc@FreeBSD.org> |
Add support to the virtual memory system for configuring machine- dependent memory attributes: Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior. Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages. Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager: kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386. vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes. Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping. Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7. In collaboration with: jhb Approved by: re (kib)
|
#
0e18ab26 |
|
05-Jul-2009 |
Alan Cox <alc@FreeBSD.org> |
PAE adds another level to the i386 page table. This level is a small 4-entry table that must be located within the first 4GB of RAM. This requirement is met by defining an UMA zone with a custom back-end allocator function. This revision makes two changes to this back-end allocator function: (1) It replaces the use of contigmalloc() with the use of kmem_alloc_contig(). This eliminates "double accounting", i.e., accounting by both the UMA zone and malloc tags. (I made the same change for the same reason to the zones supporting jumbo frames a week ago.) (2) It passes through the "wait" parameter, i.e., M_WAITOK, M_ZERO, etc. to kmem_alloc_contig() rather than ignoring it. pmap_init() calls uma_zalloc() with both M_WAITOK and M_ZERO. At the moment, this is harmless only because the default behavior of contigmalloc()/kmem_alloc_contig() is to wait and because pmap_init() doesn't really depend on the memory being zeroed. The back-end allocator function in the Xen pmap is dead code. I am changing it nonetheless because I don't want to leave any "bad examples" in the source tree for someone to copy at a later date. Approved by: re (kib)
|
#
387aabc5 |
|
14-Jun-2009 |
Alan Cox <alc@FreeBSD.org> |
Long, long ago in r27464 special case code for mapping device-backed memory with 4MB pages was added to pmap_object_init_pt(). This code assumes that the pages of an OBJT_DEVICE object are always physically contiguous. Unfortunately, this is not always the case. For example, jhb@ informs me that the recently introduced /dev/ksyms driver creates an OBJT_DEVICE object that violates this assumption. Thus, this revision modifies pmap_object_init_pt() to abort the mapping if the OBJT_DEVICE object's pages are not physically contiguous. This revision also changes some inconsistent if not buggy behavior. For example, the i386 version aborts if the first 4MB virtual page that would be mapped is already valid. However, it incorrectly replaces any subsequent 4MB virtual page mappings that it encounters, potentially leaking a page table page. The amd64 version has a bug of my own creation. It potentially busies the wrong page and always an insufficient number of pages if it blocks allocating a page table page. To my knowledge, there have been no reports of these bugs, hence, their persistence. I suspect that the existing restrictions that pmap_object_init_pt() placed on the OBJT_DEVICE objects that it would choose to map, for example, that the first page must be aligned on a 2 or 4MB physical boundary and that the size of the mapping must be a multiple of the large page size, were enough to avoid triggering the bug for drivers like ksyms. However, one side effect of testing the OBJT_DEVICE object's pages for physical contiguity is that a dubious difference between pmap_object_init_pt() and the standard path for mapping device pages, i.e., vm_fault(), has been eliminated. Previously, pmap_object_init_pt() would only instantiate the first PG_FICTITIOUS page being mapped because it never examined the rest. Now, however, pmap_object_init_pt() uses the new function vm_object_populate() to instantiate them all (in order to support testing their physical contiguity).
These pages need to be instantiated for the mechanism that I have prototyped for automatically maintaining the consistency of the PAT settings across multiple mappings, particularly, amd64's direct mapping, to work. (Translation: This change is also being made to support jhb@'s work on the Nvidia feature requests.) Discussed with: jhb@
|
#
5942207f |
|
08-Jun-2009 |
Ed Schouten <ed@FreeBSD.org> |
Revert my change; reintroduce __gnu89_inline. It turns out our compiler in stable/7 can't build this code anymore. Even though my opinion is that those people should just run `make kernel-toolchain' before building a kernel, I am willing to wait and commit this after we've branched stable/8. Requested by: rwatson
|
#
032e3d1d |
|
08-Jun-2009 |
Ed Schouten <ed@FreeBSD.org> |
Remove __gnu89_inline. Now that we use C99 almost everywhere, just use C99-style in the pmap code. Since the pmap code is the only consumer of __gnu89_inline, remove it from cdefs.h as well. Because the flag was only introduced 17 months ago, I don't expect any problems. Reviewed by: alc
|
#
120b18d8 |
|
14-May-2009 |
Attilio Rao <attilio@FreeBSD.org> |
FreeBSD currently supports at most 32 CPUs on all architectures. With the arrival of 128+ cores it is necessary to handle more than that. One of the first things to change is the support for cpumask_t, which needs to handle more than 32 bits of masking (which happens now). Some places, however, still assume that cpumask_t is a 32-bit mask. Fix that situation by always using cpumask_t correctly when needed. While here, remove the part under STOP_NMI for the Xen support as it is broken in any case. Additionally, make ipi_nmi_pending static. Reviewed by: jhb, kmacy Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
#
07a7b85e |
|
13-May-2009 |
Alan Cox <alc@FreeBSD.org> |
Correct a rare use-after-free error in pmap_copy(). This error was introduced in amd64 revision 1.540 and i386 revision 1.547. However, it had no harmful effects until after a recent change, r189698, on amd64. (In other words, the error is harmless in RELENG_7.) The error is triggered by the failure to allocate a pv entry for the one and only mapping in a page table page. I am addressing the error by changing pmap_copy() to abort if either pv entry allocation or page table page allocation fails. This is appropriate because the creation of mappings by pmap_copy() is optional. They are a (possible) optimization, and not a requirement. Correct a nearby whitespace error in the i386 pmap_copy(). Crash reported by: jeff@ MFC after: 6 weeks
|
#
e34a906f |
|
14-Mar-2009 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 r189785 Update the pmap's resident page count when a page table page is freed in pmap_remove_pde() and pmap_remove_pages(). MFC after: 6 weeks
|
#
a4079bfb |
|
25-Feb-2009 |
Jung-uk Kim <jkim@FreeBSD.org> |
Enable support for PAT_WRITE_PROTECTED and PAT_UNCACHED cache modes unconditionally on amd64. On i386, we assume PAT is usable if the CPU vendor is not Intel or CPU model is newer than Pentium IV. Reviewed by: alc, jhb
|
#
6d65f2fa |
|
22-Feb-2009 |
Alan Cox <alc@FreeBSD.org> |
Optimize free_pv_entry(); specifically, avoid repeated TAILQ_REMOVE()s. MFC after: 1 week
|
#
6be00eca |
|
14-Feb-2009 |
Alan Cox <alc@FreeBSD.org> |
Remove unnecessary page queues locking around vm_page_busy() and vm_page_wakeup(). (This change is applicable to RELENG_7 but not RELENG_6.) MFC after: 1 week
|
#
d7f03759 |
|
19-Oct-2008 |
Ulf Lilleengen <lulf@FreeBSD.org> |
- Import the HEAD csup code which is the basis for the cvsmode work.
|
#
1628900b |
|
20-Sep-2008 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 SVN rev 179749 CVS rev 1.620 Reverse the direction of pmap_promote_pde()'s traversal over the specified page table page. The direction of the traversal can matter if pmap_promote_pde() has to remove write access (PG_RW) from a PTE that hasn't been modified (PG_M). In general, if there are two or more such PTEs to choose among, it is better to write protect the one nearer the high end of the page table page rather than the low end. This is because most programs access memory in an ascending direction. The net result of this change is a sometimes significant reduction in the number of failed promotion attempts and the number of pages that are write protected by pmap_promote_pde(). MFamd64 SVN rev 179777 CVS rev 1.621 Tweak the promotion test in pmap_promote_pde(). Specifically, test PG_A before PG_M. This sometimes prevents unnecessary removal of write access from a PTE. Overall, the net result is fewer demotions and promotion failures.
|
#
2af89e55 |
|
18-Sep-2008 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 SVN rev 179471 CVS rev 1.619 Correct an error in pmap_promote_pde() that may result in an errant promotion within the kernel's address space.
|
#
494c177e |
|
04-Aug-2008 |
Alan Cox <alc@FreeBSD.org> |
Make pmap_kenter_attr() static.
|
#
e79980e1 |
|
27-Jul-2008 |
Alan Cox <alc@FreeBSD.org> |
Correct an off-by-one error in the previous change to pmap_change_attr(). Change the nearby comment to mention the recursive map.
|
#
cc1ec88f |
|
27-Jul-2008 |
Alan Cox <alc@FreeBSD.org> |
Don't allow pmap_change_attr() to be applied to the recursive mapping.
|
#
35db2ce0 |
|
27-Jul-2008 |
Alan Cox <alc@FreeBSD.org> |
Style fixes to several function definitions.
|
#
59a23cac |
|
18-Jul-2008 |
Alan Cox <alc@FreeBSD.org> |
Correct an error in pmap_change_attr()'s initial loop that verifies that the given range of addresses is mapped. Previously, the loop was testing the same address every time. Submitted by: Magesh Dhasayyan
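The class of bug fixed here — a verification loop whose body tests the loop's starting address instead of the iteration variable — can be sketched in user space. The page size constant and the `page_is_mapped()` predicate below are illustrative stand-ins, not the kernel's actual interfaces:

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical validity check: pretend every page below 1 MiB is mapped. */
static int page_is_mapped(uint32_t va) { return va < 0x100000u; }

/* Buggy shape: tests 'base' on every iteration instead of 'tmpva'. */
static int range_mapped_buggy(uint32_t base, uint32_t size)
{
    for (uint32_t tmpva = base; tmpva < base + size; tmpva += PAGE_SIZE)
        if (!page_is_mapped(base))      /* should test tmpva */
            return 0;
    return 1;
}

/* Corrected shape: advance and test the same variable. */
static int range_mapped_fixed(uint32_t base, uint32_t size)
{
    for (uint32_t tmpva = base; tmpva < base + size; tmpva += PAGE_SIZE)
        if (!page_is_mapped(tmpva))
            return 0;
    return 1;
}
```

The buggy version accepts any range whose first page is mapped, which is exactly how an unmapped hole later in the range slips through.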
|
#
53d13c60 |
|
18-Jul-2008 |
Alan Cox <alc@FreeBSD.org> |
Simplify pmap_extract()'s control flow, making it more like the related functions pmap_extract_and_hold() and pmap_kextract().
|
#
cc82a18b |
|
07-Jul-2008 |
Alan Cox <alc@FreeBSD.org> |
In FreeBSD 7.0 and beyond, pmap_growkernel() should pass VM_ALLOC_INTERRUPT to vm_page_alloc() instead of VM_ALLOC_SYSTEM. VM_ALLOC_SYSTEM was the logical choice before FreeBSD 7.0 because VM_ALLOC_INTERRUPT could not reclaim a cached page. Simply put, there was no ordering between VM_ALLOC_INTERRUPT and VM_ALLOC_SYSTEM as to which "dug deeper" into the cache and free queues. Now, there is; VM_ALLOC_INTERRUPT dominates VM_ALLOC_SYSTEM. While I'm here, teach pmap_growkernel() to request a prezeroed page. MFC after: 1 week
|
#
1ec1304b |
|
17-May-2008 |
Alan Cox <alc@FreeBSD.org> |
Retire pmap_addr_hint(). It is no longer used.
|
#
ef4d480c |
|
11-May-2008 |
Alan Cox <alc@FreeBSD.org> |
Correct an error in pmap_align_superpage(). Specifically, correctly handle the case where the mapping is greater than a superpage in size but the alignment of the physical pages spans a superpage boundary.
|
#
d3249b14 |
|
09-May-2008 |
Alan Cox <alc@FreeBSD.org> |
Introduce pmap_align_superpage(). It increases the starting virtual address of the given mapping if a different alignment might result in more superpage mappings.
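A minimal user-space sketch of the alignment idea follows. The 4 MB superpage size and the exact clamping rules are assumptions for illustration; the real pmap_align_superpage() also weighs the mapping's size against the adjustment more carefully:

```c
#include <stdint.h>

#define SUPERPAGE_SIZE 0x400000u              /* assumed 4 MB superpage */
#define SUPERPAGE_MASK (SUPERPAGE_SIZE - 1)

/*
 * Nudge the starting virtual address upward so that its offset within a
 * superpage matches the physical offset 'offset'.  Only worthwhile when
 * the mapping is at least a superpage long.
 */
static uint32_t align_superpage(uint32_t va, uint32_t offset, uint32_t size)
{
    uint32_t sp_off = offset & SUPERPAGE_MASK;

    if (size < SUPERPAGE_SIZE)
        return va;                            /* can never span a superpage */
    if ((va & SUPERPAGE_MASK) == sp_off)
        return va;                            /* already congruent */
    if ((va & SUPERPAGE_MASK) < sp_off)
        return (va & ~SUPERPAGE_MASK) + sp_off;
    return ((va & ~SUPERPAGE_MASK) + SUPERPAGE_SIZE) + sp_off;
}
```

The goal is congruence: virtual and physical addresses must share the same offset within a superpage before promotion can ever succeed.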
|
#
26b77ff3 |
|
25-Apr-2008 |
Alan Cox <alc@FreeBSD.org> |
Always use PG_PS_FRAME to extract the physical address of a 2/4MB page from a PDE.
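The distinction matters because a 2/4MB PDE reuses low "address" bits for flags — notably the large-page PAT bit at bit 12 — so masking with PG_FRAME can fold flag bits into the physical address. A small check, with bit values assumed to match the i386 non-PAE layout:

```c
#include <stdint.h>

/* Assumed i386 (non-PAE) masks and flags, for illustration only. */
#define PG_FRAME     0xFFFFF000u   /* 4 KB frame: bits 31..12 */
#define PG_PS_FRAME  0xFFC00000u   /* 2/4 MB frame: bits 31..22 */
#define PG_PS        0x00000080u   /* superpage (PS) bit */
#define PG_PDE_PAT   0x00001000u   /* PAT bit sits at bit 12 in a PS PDE */

/* Extract the physical address of a 4 MB page from its PDE. */
static uint32_t pde_to_pa(uint32_t pde)
{
    /* PG_FRAME here would wrongly keep the PAT bit as an address bit. */
    return pde & PG_PS_FRAME;
}
```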
|
#
f4d2c7f1 |
|
10-Apr-2008 |
Alan Cox <alc@FreeBSD.org> |
Correct pmap_copy()'s method for extracting the physical address of a 2/4MB page from a PDE. Specifically, change it to use PG_PS_FRAME, not PG_FRAME, to extract the physical address of a 2/4MB page from a PDE. Change the last argument passed to pmap_pv_insert_pde() from a vm_page_t representing the first 4KB page of a 2/4MB page to the vm_paddr_t of the 2/4MB page. This avoids an otherwise unnecessary conversion from a vm_paddr_t to a vm_page_t in pmap_copy().
|
#
109d4932 |
|
07-Apr-2008 |
Alan Cox <alc@FreeBSD.org> |
Update pmap_page_wired_mappings() so that it counts 2/4MB page mappings.
|
#
7630c265 |
|
04-Apr-2008 |
Alan Cox <alc@FreeBSD.org> |
Reintroduce UMA_SLAB_KMAP; however, change its spelling to UMA_SLAB_KERNEL for consistency with its sibling UMA_SLAB_KMEM. (UMA_SLAB_KMAP met its original demise in revision 1.30 of vm/uma_core.c.) UMA_SLAB_KERNEL is now required by the jumbo frame allocators. Without it, UMA cannot correctly return pages from the jumbo frame zones to the VM system because it resets the pages' object field to NULL instead of the kernel object. In more detail, the jumbo frame zones are created with the option UMA_ZONE_REFCNT. This causes UMA to overwrite the pages' object field with the address of the slab. However, when UMA wants to release these pages, it doesn't know how to restore the object field, so it sets it to NULL. This change teaches UMA how to reset the object field to the kernel object. Crashes reported by: kris Fix tested by: kris Fix discussed with: jeff MFC after: 6 weeks
|
#
4ae6e474 |
|
28-Mar-2008 |
Alan Cox <alc@FreeBSD.org> |
Eliminate an #if 0/#endif that was unintentionally introduced by the previous revision.
|
#
96a6e6e6 |
|
28-Mar-2008 |
Brooks Davis <brooks@FreeBSD.org> |
Use ; instead of : to end a line. Submitted by: Niclas Zeising <niclas dot zeising at gmail dot com>
|
#
6e7534b8 |
|
27-Mar-2008 |
Paul Saab <ps@FreeBSD.org> |
Add support to mincore for detecting whether a page is part of a "super" page or not. Reviewed by: alc, ups
|
#
97dbe5e4 |
|
26-Mar-2008 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 with few changes: 1. Add support for automatic promotion of 4KB page mappings to 2MB page mappings. Automatic promotion can be enabled by setting the tunable "vm.pmap.pg_ps_enabled" to a non-zero value. By default, automatic promotion is disabled. Tested by: kris 2. To date, we have assumed that the TLB will only set the PG_M bit in a PTE if that PTE has the PG_RW bit set. However, this assumption does not hold on recent processors from Intel. For example, consider a PTE that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE is cached in the TLB and later the PG_RW bit is cleared in the PTE, but the corresponding TLB entry is not (yet) invalidated. Historically, upon a write access using this (stale) TLB entry, the TLB would observe that the PG_RW bit had been cleared and initiate a page fault, aborting the setting of the PG_M bit in the PTE. Now, however, P4- and Core2-family processors will set the PG_M bit before observing that the PG_RW bit is clear and initiating a page fault. In other words, the write does not occur but the PG_M bit is still set. The real impact of this difference is not that great. Specifically, we should no longer assert that any PTE with the PG_M bit set must also have the PG_RW bit set, and we should ignore the state of the PG_M bit unless the PG_RW bit is set.
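The practical upshot of point 2 — treat PG_M as meaningful only when PG_RW is also set — can be expressed as a predicate. Bit values are assumed per the i386 PTE layout:

```c
#include <stdint.h>

#define PG_RW 0x002u   /* writeable */
#define PG_M  0x040u   /* modified (dirty) */

/*
 * On P4- and Core2-family CPUs a stale TLB entry can leave PG_M set in a
 * PTE whose PG_RW bit has since been cleared, so a page is considered
 * dirty only when both bits are set.
 */
static int pte_is_dirty(uint32_t pte)
{
    return (pte & (PG_RW | PG_M)) == (PG_RW | PG_M);
}
```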
|
#
3f7905d2 |
|
23-Mar-2008 |
Konstantin Belousov <kib@FreeBSD.org> |
Prevent the overflow in the calculation of the next page directory. The overflow causes the wraparound with consequent corruption of the (almost) whole address space mapping. As Alan noted, pmap_copy() does not require the wrap-around checks because it cannot be applied to the kernel's pmap. The checks there are included for consistency. Reported and tested by: kris (i386/pmap.c:pmap_remove() part) Reviewed by: alc MFC after: 1 week
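The shape of the fix can be sketched as follows: stepping to the next page-directory boundary near the top of a 32-bit address space wraps to 0, and the guard clamps the result to the end of the range. Constants are the assumed i386 values:

```c
#include <stdint.h>

#define NBPDR   0x400000u          /* assumed 4 MB page-directory span */
#define PDRMASK (NBPDR - 1)

/*
 * Compute the start of the next page-directory range.  Without the
 * wraparound check, sva near the top of the address space yields a
 * va_next of 0, and the caller's loop restarts from the bottom,
 * corrupting (almost) the whole address space mapping.
 */
static uint32_t next_pde_va(uint32_t sva, uint32_t eva)
{
    uint32_t va_next = (sva + NBPDR) & ~PDRMASK;
    if (va_next < sva)             /* wrapped past 2^32: clamp to eva */
        va_next = eva;
    return va_next;
}
```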
|
#
6634dbbd |
|
17-Jan-2008 |
Alan Cox <alc@FreeBSD.org> |
Retire PMAP_DIAGNOSTIC. Any useful diagnostics that were conditionally compiled under PMAP_DIAGNOSTIC are now KASSERT()s. (Note: The kernel option DIAGNOSTIC still disables inlining of certain pmap functions.) Eliminate dead code from pmap_enter(). This code implemented an assertion. On i386, an equivalent check is already implemented. However, on amd64, a small change is required to implement an equivalent check. Eliminate \n from a nearby panic string. Use KASSERT() to reimplement pmap_copy()'s two assertions.
|
#
a658a1e0 |
|
14-Jan-2008 |
Peter Wemm <peter@FreeBSD.org> |
Add a CTASSERT that KERNBASE is valid. This is usually messed up by an invalid KVA_PAGES, so add a pointer to there.
|
#
fa093ee2 |
|
08-Jan-2008 |
Alan Cox <alc@FreeBSD.org> |
Convert a PMAP_DIAGNOSTIC to a KASSERT.
|
#
5cccf586 |
|
06-Jan-2008 |
Alan Cox <alc@FreeBSD.org> |
Shrink the size of struct vm_page on amd64 and i386 by eliminating pv_list_count from struct md_page. Ever since Peter rewrote the pv entry allocator for amd64 and i386 pv_list_count has been correctly maintained but otherwise unused.
|
#
eb2a0517 |
|
03-Jan-2008 |
Alan Cox <alc@FreeBSD.org> |
Add an access type parameter to pmap_enter(). It will be used to implement superpage promotion. Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is a Boolean.
|
#
86f14493 |
|
02-Jan-2008 |
Alan Cox <alc@FreeBSD.org> |
Provide a legitimate pindex to vm_page_alloc() in pmap_growkernel() instead of writing apologetic comments. As it turns out, I need every kernel page table page to have a legitimate pindex to support superpage promotion on kernel memory. Correct a nearby style error: Pointers should be compared to NULL.
|
#
dbfb54ff |
|
09-Dec-2007 |
Alan Cox <alc@FreeBSD.org> |
Eliminate compilation warnings due to the use of non-static inlines through the introduction and use of the __gnu89_inline attribute. Submitted by: bde (i386) MFC after: 3 days
|
#
d1ce3dfa |
|
04-Dec-2007 |
Alan Cox <alc@FreeBSD.org> |
Correct an error under COUNT_IPIS within pmap_lazyfix_action(): Increment the counter that the pointer refers to, not the pointer. MFC after: 3 days
|
#
58041e4b |
|
30-Nov-2007 |
Alan Cox <alc@FreeBSD.org> |
Improve get_pv_entry()'s handling of low-memory conditions. After page allocation fails and pv entries are reclaimed, there may be an unused pv entry in a pv chunk that survived the reclamation. However, previously, after reclamation, get_pv_entry() did not look for an unused pv entry in a surviving pv chunk; it simply retried the page allocation. Now, it does look for an unused pv entry before retrying the page allocation. Note: This only applies to RELENG_7. Earlier branches use a different pv entry allocator. MFC after: 6 weeks
|
#
59677d3c |
|
17-Nov-2007 |
Alan Cox <alc@FreeBSD.org> |
Prevent the leakage of wired pages in the following circumstances: First, a file is mmap(2)ed and then mlock(2)ed. Later, it is truncated. Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the pages beyond the EOF are unmapped and freed. However, when the file is mlock(2)ed, the pages beyond the EOF are unmapped but not freed because they have a non-zero wire count. This can be a mistake. Specifically, it is a mistake if the sole reason why the pages are wired is because of wired, managed mappings. Previously, unmapping the pages destroys these wired, managed mappings, but does not reduce the pages' wire count. Consequently, when the file is unmapped, the pages are not unwired because the wired mapping has been destroyed. Moreover, when the vm object is finally destroyed, the pages are leaked because they are still wired. The fix is to reduce the pages' wired count by the number of wired, managed mappings destroyed. To do this, I introduce a new pmap function pmap_page_wired_mappings() that returns the number of managed mappings to the given physical page that are wired, and I use this function in vm_object_page_remove(). Reviewed by: tegge MFC after: 6 weeks
|
#
6dd3a6c0 |
|
13-Nov-2007 |
Peter Wemm <peter@FreeBSD.org> |
Drastically simplify the i386 pcpu backend by merging parts of the amd64 mechanism over. Instead of page table hackery that isn't actually needed, just use 'struct pcpu __pcpu[MAXCPU]' for backing like all the other platforms do. Get rid of 'struct privatespace' and a whole mess of #ifdef SMP garbage that set it up. As a bonus, this returns the 4MB of KVA that we stole to implement it the old way. This also allows you to read the pcpu data for each cpu when reading a minidump. Background information: Originally, pcpu stuff was implemented as having per-cpu page tables and magic to make different data structures appear at the same actual address. In order to share page tables, we switched to using the GDT and %fs/%gs to access it. But we still did the evil magic to set it up for the old way. The "idle stacks" are not used for the idle process anymore and are just used for a few functions during bootup, then ignored. (exercise for reader: free these afterwards).
|
#
605385f8 |
|
05-Nov-2007 |
Alan Cox <alc@FreeBSD.org> |
Add comments explaining why all stores updating a non-kernel page table must be globally performed before calling any of the TLB invalidation functions. With one exception, on amd64, this requirement was already met. Fix this one case. Also, as a clarification, change an existing atomic op into a release. (Suggested by: jhb) Reported and reviewed by: ups MFC after: 3 days
|
#
89b57fcf |
|
05-Nov-2007 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicking or dereferencing NULL. As a consequence, vmspace_exec() and vmspace_unshare() return the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done. The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper). In collaboration with: Peter Holm Reviewed by: jhb
|
#
6afd4b92 |
|
02-Nov-2007 |
Alan Cox <alc@FreeBSD.org> |
Eliminate spurious "Approaching the limit on PV entries, ..." warnings. Specifically, whenever vm_page_alloc(9) returned NULL to get_pv_entry(), we issued a warning regardless of the number of pv entries in use. (Note: The older pv entry allocator in RELENG_6 does not have this problem.) Reported by: Jeremy Chadwick Eliminate the direct call to pagedaemon_wakeup() by get_pv_entry(). This was a holdover from earlier times when the page daemon was responsible for the reclamation of pv entries. MFC after: 5 days
|
#
8beae253 |
|
20-Aug-2007 |
Alan Cox <alc@FreeBSD.org> |
In general, when we map a page into the kernel's address space, we no longer create a pv entry for that mapping. (The two exceptions are mappings into the kernel's exec and pipe submaps.) Consequently, there is no reason for get_pv_entry() to dig deep into the free page queues, i.e., use VM_ALLOC_SYSTEM, by default. This revision changes get_pv_entry() to use VM_ALLOC_NORMAL by default, i.e., before calling pmap_collect() to reclaim pv entries. Approved by: re (kensmith)
|
#
ba4b85e4 |
|
01-Jul-2007 |
Alan Cox <alc@FreeBSD.org> |
Pages that do belong to an object and page queue can now be freed without holding the page queues lock. Thus, the page table pages released by pmap_remove() and pmap_remove_pages() can be freed after the page queues lock is released. Approved by: re (kensmith)
|
#
2feb50bf |
|
31-May-2007 |
Attilio Rao <attilio@FreeBSD.org> |
Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)
|
#
80b200da |
|
20-May-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- rename VMCNT_DEC to VMCNT_SUB to reflect the count argument. Suggested by: julian@ Contributed by: attilio@
|
#
222d0195 |
|
18-May-2007 |
Jeff Roberson <jeff@FreeBSD.org> |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
#
31b4f4a9 |
|
21-Apr-2007 |
Stephan Uphoff <ups@FreeBSD.org> |
Modify TLB invalidation handling. Reviewed by: alc@, peter@ MFC after: 1 week
|
#
0b765048 |
|
13-Apr-2007 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the misuse of PG_FRAME to truncate a virtual address to a virtual page boundary. Reviewed by: ru@
|
#
2e137367 |
|
06-Apr-2007 |
Ruslan Ermilov <ru@FreeBSD.org> |
Add the PG_NX support for i386/PAE. Reviewed by: alc
|
#
8a5e898d |
|
24-Mar-2007 |
Alan Cox <alc@FreeBSD.org> |
In order to satisfy ACPI's need for an identity mapping, modify the temporary mapping created by locore so that the lowest two to four megabytes can become a permanent identity mapping. This implementation avoids any use of a large page mapping.
|
#
8cfba726 |
|
17-Mar-2007 |
Alan Cox <alc@FreeBSD.org> |
Eliminate an unused parameter.
|
#
9b0df55b |
|
14-Mar-2007 |
Nate Lawson <njl@FreeBSD.org> |
Create an identity mapping (V=P) super page for the low memory region on boot. Then, just switch to the kernel pmap when suspending instead of allocating/freeing our own mapping every time. This should solve a panic of pmap_remove() being called with interrupts disabled. Thanks to Alan Cox for developing this patch. Note: this means that ACPI requires super page (PG_PS) support in the CPU. This has been present since the Pentium and first documented in the Pentium Pro. However, it may need to be revisited later. Submitted by: alc MFC after: 1 month
|
#
8da3fc95 |
|
05-Mar-2007 |
Alan Cox <alc@FreeBSD.org> |
Acquiring smp_ipi_mtx on every call to pmap_invalidate_*() is wasteful. For example, during a buildworld more than half of the calls do not generate an IPI because the only TLB entry invalidated is on the calling processor. This revision pushes down the acquisition and release of smp_ipi_mtx into smp_tlb_shootdown() and smp_targeted_tlb_shootdown() and instead uses sched_pin() and sched_unpin() in pmap_invalidate_*() so that thread migration doesn't lead to a missed TLB invalidation. Reviewed by: jhb MFC after: 3 weeks
|
#
ae0663a3 |
|
17-Feb-2007 |
Alan Cox <alc@FreeBSD.org> |
Eliminate some acquisitions and releases of the page queues lock that are no longer necessary.
|
#
f67af5c9 |
|
17-Jan-2007 |
Xin LI <delphij@FreeBSD.org> |
Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.
|
#
da449604 |
|
19-Nov-2006 |
Alan Cox <alc@FreeBSD.org> |
The global variable avail_end is redundant and only used once. Eliminate it. Make avail_start static to the pmap on amd64. (It no longer exists on other architectures.)
|
#
79ba24ca |
|
16-Nov-2006 |
Maxim Konovalov <maxim@FreeBSD.org> |
o Make pv_maxchunks no less than maxproc. This helps to survive a forkbomb explosion. Reviewed by: alc Security: local DoS X-MFC after: RELENG_6 is not affected due to a different pv_entry allocation code.
|
#
44b8bd66 |
|
12-Nov-2006 |
Alan Cox <alc@FreeBSD.org> |
Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)
|
#
43200cd3ed |
|
21-Oct-2006 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary PG_BUSY tests.
|
#
d25fdf53 |
|
14-Aug-2006 |
John Baldwin <jhb@FreeBSD.org> |
Don't try to preserve PAT bits in pmap_enter(). We currently use PAT bits only on pages that aren't mapped via pmap_enter() (KVA). We will eventually support PAT bits on user pages, but those will require some sort of MI caching mode stored in the vm_page. Reviewed by: alc
|
#
7e9f73f3 |
|
11-Aug-2006 |
John Baldwin <jhb@FreeBSD.org> |
First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include: - A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping. - pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example. - A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before. - A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA. Reviewed by: alc
|
#
14aaab53 |
|
06-Aug-2006 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the acquisition and release of the page queues lock around a call to vm_page_sleep_if_busy().
|
#
78985e42 |
|
01-Aug-2006 |
Alan Cox <alc@FreeBSD.org> |
Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change. The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access. Reviewed by: marcel@ (ia64)
|
#
3cad40e5 |
|
20-Jul-2006 |
Alan Cox <alc@FreeBSD.org> |
Add pmap_clear_write() to the interface between the virtual memory system's machine-dependent and machine-independent layers. Once pmap_clear_write() is implemented on all of our supported architectures, I intend to replace all calls to pmap_page_protect() by calls to pmap_clear_write(). Why? Both the use and implementation of pmap_page_protect() in our virtual memory system has subtle errors, specifically, the management of execute permission is broken on some architectures. The "prot" argument to pmap_page_protect() should behave differently from the "prot" argument to other pmap functions. Instead of meaning, "give the specified access rights to all of the physical page's mappings," it means "don't take away the specified access rights from all of the physical page's mappings, but do take away the ones that aren't specified." However, owing to our i386 legacy, i.e., no support for no-execute rights, all but one invocation of pmap_page_protect() specifies VM_PROT_READ only, when the intent is, in fact, to remove only write permission. Consequently, a faithful implementation of pmap_page_protect(), e.g., ia64, would remove execute permission as well as write permission. On the other hand, some architectures that support execute permission have basically ignored whether or not VM_PROT_EXECUTE is passed to pmap_page_protect(), e.g., amd64 and sparc64. This change represents the first step in replacing pmap_page_protect() by the less subtle pmap_clear_write() that is already implemented on amd64, i386, and sparc64. Discussed with: grehan@ and marcel@
|
#
7c9cc27f |
|
17-Jul-2006 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 pmap_clear_ptes() is already convoluted. This will worsen with the implementation of superpages. Eliminate it and add pmap_clear_write(). There are no functional changes. Checked by: md5
|
#
e4cec283 |
|
16-Jul-2006 |
Alan Cox <alc@FreeBSD.org> |
Now that free_pv_entry() accesses the pmap, call free_pv_entry() in pmap_remove_all() before rather than after the pmap is unlocked. At present, the page queues lock provides sufficient synchronization. In the future, the page queues lock may not always be held when free_pv_entry() is called.
|
#
d259662b |
|
16-Jul-2006 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 Make three simplifications to pmap_ts_referenced(): Eliminate an initialized but otherwise unused variable. Eliminate an unnecessary test. Exit the loop in a shorter way.
|
#
ddd6244a |
|
16-Jul-2006 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the remaining uses of "register". Convert the remaining K&R-style function declarations to ANSI-style. Eliminate excessive white space from pmap_ts_referenced().
|
#
d3dd65ab |
|
15-Jul-2006 |
Alan Cox <alc@FreeBSD.org> |
Make pc_freemask an array of uint32_t, rather than uint64_t. (I believe that the use of the latter is simply an oversight in porting the new pv entry code from amd64.)
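The word size matters because the free mask is scanned word-by-word with bit operations, and 64-bit words force needlessly expensive operations on a 32-bit machine. A sketch of how such a chunk free mask is consumed (the field and entry counts below are illustrative, not the exact pmap values):

```c
#include <stdint.h>
#include <strings.h>   /* ffs() */

#define NPCM 11        /* assumed number of 32-bit mask words per chunk */

typedef struct pv_chunk {
    uint32_t pc_map[NPCM];   /* one bit per pv entry; 1 = entry is free */
} pv_chunk_t;

/* Find and claim the first free pv entry; returns its index or -1. */
static int pv_alloc_idx(pv_chunk_t *pc)
{
    for (int field = 0; field < NPCM; field++) {
        int bit = ffs((int)pc->pc_map[field]);
        if (bit != 0) {
            pc->pc_map[field] &= ~(1u << (bit - 1));
            return field * 32 + (bit - 1);
        }
    }
    return -1;                /* chunk is full */
}
```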
|
#
da536e63 |
|
02-Jul-2006 |
Alan Cox <alc@FreeBSD.org> |
Correct an error in the new pmap_collect(), thus only affecting HEAD. Specifically, the pv entry was always being freed to the caller's pmap instead of the pmap to which the pv entry belongs.
|
#
8e0e1e22 |
|
26-Jun-2006 |
Alan Cox <alc@FreeBSD.org> |
Correct a very old and very obscure bug: vmspace_fork() calls pmap_copy() if the mapping is VM_INHERIT_SHARE. Suppose the mapping is also wired. vmspace_fork() clears the wiring attributes in the vm map entry but pmap_copy() copies the PG_W attribute in the PTE. I don't think this is catastrophic. It blocks pmap_remove_pages() from destroying the mapping and corrupts the pmap's wiring count. This revision fixes the problem by changing pmap_copy() to clear the PG_W attribute. Reviewed by: tegge@
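The essence of the fix can be sketched as a PTE-copying helper that strips the wired bit (along with per-mapping accessed/dirty state) when duplicating a mapping into another pmap. The bit values are assumed i386-style constants for illustration:

```c
#include <stdint.h>

/* Assumed i386-style PTE bits (illustrative values). */
#define PG_V 0x001u   /* valid */
#define PG_A 0x020u   /* accessed */
#define PG_M 0x040u   /* modified */
#define PG_W 0x200u   /* software "wired" bit */

/*
 * When duplicating a PTE for a VM_INHERIT_SHARE region, the wired
 * attribute must not be copied: the vm map entry's wiring was cleared,
 * so a copied PG_W would corrupt the new pmap's wiring count and block
 * pmap_remove_pages() from destroying the mapping.
 */
static uint32_t copy_pte(uint32_t src)
{
    return src & ~(PG_W | PG_A | PG_M);
}
```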
|
#
031bf5ea |
|
25-Jun-2006 |
Alan Cox <alc@FreeBSD.org> |
Eliminate a comment that became stale after revision 1.547.
|
#
f0544664 |
|
20-Jun-2006 |
Alan Cox <alc@FreeBSD.org> |
Change get_pv_entry() such that the call to vm_page_alloc() specifies VM_ALLOC_NORMAL instead of VM_ALLOC_SYSTEM when try is TRUE. In other words, when get_pv_entry() is permitted to fail, it no longer tries as hard to allocate a page. Change pmap_enter_quick_locked() to fail rather than wait if it is unable to allocate a page table page. This prevents a race between pmap_enter_object() and the page daemon. Specifically, an inactive page that is a successor to the page that was given to pmap_enter_quick_locked() might become a cache page while pmap_enter_quick_locked() waits, and later pmap_enter_object() maps the cache page, violating the invariant that cache pages are never mapped. Similarly, change pmap_enter_quick_locked() to call pmap_try_insert_pv_entry() rather than pmap_insert_entry(). Generally speaking, pmap_enter_quick_locked() is used to create speculative mappings. So, it should not try hard to allocate memory if free memory is scarce. Add an assertion that the object containing m_start is locked in pmap_enter_object(). Remove a similar assertion from pmap_enter_quick_locked() because that function no longer accesses the containing object. Remove a stale comment. Reviewed by: ups@
|
#
2053c127 |
|
14-Jun-2006 |
Stephan Uphoff <ups@FreeBSD.org> |
Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386. Reviewed by: alc,jhb,peter,ps
|
#
b74a62d6 |
|
12-Jun-2006 |
Alan Cox <alc@FreeBSD.org> |
Don't invalidate the TLB in pmap_qenter() unless the old mapping was valid. Most often, it isn't. Reviewed by: tegge@
|
#
ce142d9e |
|
05-Jun-2006 |
Alan Cox <alc@FreeBSD.org> |
Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects. Reviewed by: tegge@
|
#
62b5e735 |
|
05-Jun-2006 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 Eliminate unnecessary, recursive acquisitions and releases of the page queues lock by free_pv_entry() and pmap_remove_pages(). Reduce the scope of the page queues lock in pmap_remove_pages().
|
#
2b8a339c |
|
01-May-2006 |
John Baldwin <jhb@FreeBSD.org> |
Add various constants for the PAT MSR and the PAT PTE and PDE flags. Initialize the PAT MSR during boot to map PAT type 2 to Write-Combining (WC) instead of Uncached (UC-). MFC after: 1 month
|
#
4ac60df5 |
|
01-May-2006 |
John Baldwin <jhb@FreeBSD.org> |
Add a new 'pmap_invalidate_cache()' to flush the CPU caches via the wbinvd() instruction. This includes a new IPI so that all CPU caches on all CPUs are flushed for the SMP case. MFC after: 1 month
|
#
ada5d7d5 |
|
01-May-2006 |
Peter Wemm <peter@FreeBSD.org> |
Using an idea from Stephan Uphoff, use the empty pte's that correspond to the unused kva in the pv memory block to thread a freelist through. This allows us to free pages that used to be used for pv entry chunks since we can now track holes in the kva memory block. Idea from: ups
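The trick can be modeled in miniature: the PTE slot belonging to each freed page in the kva block is itself unused, so it can store a freelist link, letting the allocator track holes at no extra memory cost. An illustrative stand-alone sketch (a plain array stands in for the real PTE array; names and sizes are hypothetical):

```c
#include <stdint.h>

#define NSLOTS 8                 /* stand-in for the kva block's pages */
static uint32_t pteva[NSLOTS];   /* stand-in for the PTE array */
static int free_head = -1;       /* index of first free slot, -1 if none */

/* Free a slot: its now-unused "PTE" stores the previous list head. */
static void slot_free(int i)
{
    pteva[i] = (uint32_t)free_head;
    free_head = i;
}

/* Allocate a slot: pop the head and follow the link threaded
 * through the "PTE" array. */
static int slot_alloc(void)
{
    if (free_head < 0)
        return -1;               /* no holes to reuse */
    int i = free_head;
    free_head = (int)(int32_t)pteva[i];
    return i;
}
```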
|
#
4c8eff70 |
|
01-May-2006 |
Peter Wemm <peter@FreeBSD.org> |
Fix missing changes required for the amd64->i386 conversion. Add the missing VM_ALLOC_WIRED flags to vm_page_alloc() calls I added. Submitted by: alc
|
#
7eeda227 |
|
28-Apr-2006 |
Peter Wemm <peter@FreeBSD.org> |
Interim fix for pmap problems I introduced with my last commit. Remove the code to dynamically change the pv_entry limits. Go back to a single fixed kva reservation for pv entries, like was done before when using the uma zone. Go back to never freeing pages back to the free pool after they are no longer used, just like before. This stops the lock order reversal due to acquiring the kernel map lock while pmap was locked. This fixes the recursive panic if invariants are enabled. The problem was that allocating/freeing kva causes vm_map_entry nodes to be allocated/freed. That can recurse back into pmap as new pages are hooked up to kvm and hence all the problem. Allocating/freeing kva indirectly allocates/frees memory. So, by going back to a single fixed size kva block and an index, we avoid the recursion panics and the LOR. The problem is that now with a linear block of kva, we have no mechanism to track holes once pages are freed. UMA has the same problem when using a custom object for a zone and a fixed reservation of kva. Simple solutions like having a bitmap would work, but would be very inefficient when there are hundreds of thousands of bits in the map. A first-free pointer is similarly flawed because pages can be freed at random and the first-free pointer would be rewinding huge amounts. If we could allocate memory for tree structures or an external freelist, that would work. Except we cannot allocate/free memory here because we cannot allocate/free address space to use it in. Anyway, my change here reverts back to the UMA behavior of not freeing pages for now, thereby avoiding holes in the map. ups@ had a truly evil idea that I'll investigate. It should allow freeing unused pages again by giving us a no-cost way to track the holes in the kva block. But in the meantime, this should get people booting with witness and/or invariants again. Footnote: amd64 doesn't have this problem because of the direct map access method.
I'd done all my witness/invariants testing there. I'd never considered that the harmless-looking kmem_alloc/kmem_free calls would cause such a problem and it didn't show up on the boot test.
|
#
7dece6c7 |
|
27-Apr-2006 |
Alan Cox <alc@FreeBSD.org> |
In general, bits in the page directory entry (PDE) and the page table entry (PTE) have the same meaning. The exception to this rule is the eighth bit (0x080). It is the PS bit in a PDE and the PAT bit in a PTE. This change avoids the possibility that pmap_enter() confuses a PAT bit with a PS bit, avoiding a panic(). Eliminate a diagnostic printf() from the i386 pmap_enter() that serves no current purpose, i.e., I've seen no bug reports in the last two years that are helped by this printf(). Reviewed by: jhb
|
#
027ed650 |
|
26-Apr-2006 |
Xin LI <delphij@FreeBSD.org> |
Fix build on i386
|
#
041a991f |
|
26-Apr-2006 |
Peter Wemm <peter@FreeBSD.org> |
MFamd64: shrink pv entries from 24 bytes to about 12 bytes. (336 pv entries per page = effectively 12.19 bytes per pv entry after overheads). Instead of using a shared UMA zone for 24 byte pv entries (two 8-byte tailq nodes, a 4 byte pointer, and a 4 byte address), we allocate a page at a time per process. This provides 336 pv entries per process (actually, per pmap address space) and eliminates one of the 8-byte tailq entries since we now can track per-process pv entries implicitly. The pointer to the pmap can be eliminated by doing address arithmetic to find the metadata on the page headers to find a single pointer shared by all 336 entries. There is an 11-int bitmap for the freelist of those 336 entries. This is mostly a mechanical conversion from amd64, except: * i386 has to allocate kvm and map the pages, amd64 has them outside of kvm * native word size is smaller, so bitmaps etc become 32 bit instead of 64 * no dump_add_page() etc stuff because they are in kvm always. * various pmap internals tweaks because pmap uses direct map on amd64 but on i386 it has to use sched_pin and temporary mappings. Also, sysctl vm.pmap.pv_entry_max and vm.pmap.shpgperproc are now dynamic sysctls. Like on amd64, i386 can now tune the pv entry limits without a recompile or reboot. This is important because of the following scenario. If you have a 1GB file (262144 pages) mmap()ed into 50 processes, that requires 13 million pv entries. At 24 bytes per pv entry, that is 314MB of ram and kvm, while at 12 bytes it is 157MB. A 157MB saving is significant. Test-run by: scottl (Thanks!)
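The chunk layout and its freemask can be modeled in user space. This sketch follows the 32-bit sizes described above (336 entries per 4KB page, an 11-int bitmap); the struct and function names are illustrative, not the exact kernel definitions:

```c
#include <stdint.h>

#define _NPCM  11    /* freemask ints: ceil(336 / 32) */
#define _NPCPV 336   /* pv entries per chunk (one 4KB page) */

/* Each pv entry keeps only a va and a per-page tailq node; the pmap
 * pointer lives once in the chunk header, found by address arithmetic
 * from any entry (hence ~12 bytes per entry instead of 24 on i386). */
struct pv_entry_sketch {
    uint32_t pv_va;
    struct { void *next, *prev; } pv_list;
};

struct pv_chunk_sketch {
    void    *pc_pmap;            /* shared by all 336 entries */
    uint32_t pc_map[_NPCM];      /* 1 bit per free entry */
    /* struct pv_entry_sketch pc_pventry[_NPCPV] follows in the page */
};

/* Allocate a slot from the freemask: find the first set bit,
 * clear it, and return the entry index. Returns -1 if the chunk
 * is full. */
static int pc_alloc_slot(uint32_t map[_NPCM])
{
    for (int i = 0; i < _NPCM; i++) {
        if (map[i] != 0) {
            int bit = __builtin_ffs((int)map[i]) - 1;
            map[i] &= ~(1u << bit);
            return i * 32 + bit;
        }
    }
    return -1;
}
```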
|
#
826c2072 |
|
11-Apr-2006 |
Alan Cox <alc@FreeBSD.org> |
Retire pmap_track_modified(). We no longer need it because we do not create managed mappings within the clean submap. To prevent regressions, add assertions blocking the creation of managed mappings within the clean submap. Reviewed by: tegge
|
#
b9eee07e |
|
03-Apr-2006 |
Peter Wemm <peter@FreeBSD.org> |
Remove the unused sva and eva arguments from pmap_remove_pages().
|
#
9c6a71e4 |
|
01-Apr-2006 |
Alan Cox <alc@FreeBSD.org> |
Introduce pmap_try_insert_pv_entry(), a function that conditionally creates a pv entry if the number of entries is below the high water mark for pv entries. Use pmap_try_insert_pv_entry() in pmap_copy() instead of pmap_insert_entry(). This avoids possible recursion on a pmap lock in get_pv_entry(). Eliminate the explicit low-memory checks in pmap_copy(). The check that the number of pv entries was below the high water mark was largely ineffective because it was located in the outer loop rather than the inner loop where pv entries were allocated. Instead of checking, we attempt the allocation and handle the failure. Reviewed by: tegge Reported by: kris MFC after: 5 days
|
#
fa8053e9 |
|
21-Mar-2006 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary invalidations of the entire TLB by pmap_remove(). Specifically, on mappings with PG_G set pmap_remove() not only performs the necessary per-page invlpg invalidations but also performs an unnecessary invalidation of the entire set of non-PG_G entries. Reviewed by: tegge
|
#
39d3e619 |
|
20-Mar-2006 |
David Xu <davidxu@FreeBSD.org> |
Remove stale KSE code. Reviewed by: alc
|
#
6bd7e81d |
|
16-Feb-2006 |
Tor Egge <tegge@FreeBSD.org> |
Rounding addr upwards to next 4M or 2M boundary in pmap_growkernel() could cause addr to become 0, resulting in an early return without populating the last PDE. Reviewed by: alc
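The overflow is easy to reproduce in isolation: near the top of a 32-bit address space, adding the 4MB stride wraps past zero and the masked result compares below the loop bound. A hedged sketch of a guarded rounding (the helper name and clamping policy are illustrative, not the exact committed fix):

```c
#include <stdint.h>

#define NBPDR 0x400000u /* 4MB span of one PDE (non-PAE i386) */

/* Hypothetical helper: round addr up to the next PDE boundary,
 * clamping to the last boundary instead of wrapping to 0 when the
 * addition overflows 32 bits. */
static uint32_t round_pde(uint32_t addr)
{
    uint32_t rounded = (addr + NBPDR) & ~(NBPDR - 1);
    if (rounded < addr)             /* wrapped past 0xFFFFFFFF */
        rounded = ~(NBPDR - 1);     /* clamp: last 4MB boundary */
    return rounded;
}
```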
|
#
f0b98139 |
|
05-Dec-2005 |
John Baldwin <jhb@FreeBSD.org> |
- Move the code to deal with handling an IPI_STOP IPI out of ipi_nmi_handler() and into a new cpustop_handler() function. Change the Xcpustop IPI_STOP handler to call this function instead of duplicating all the same logic in assembly. - EOI the local APIC for the lapic timer interrupt in C rather than assembly. - Bump the lazypmap IPI counter if COUNT_IPIS is defined in C rather than assembly.
|
#
97a0c226 |
|
19-Nov-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate pmap_init2(). It's no longer used.
|
#
421552a5 |
|
13-Nov-2005 |
Warner Losh <imp@FreeBSD.org> |
Provide a dummy NO_XBOX option that lives in opt_xbox.h for pc98. This allows us to eliminate a three ifdef PC98 instances.
|
#
65336314 |
|
12-Nov-2005 |
Alan Cox <alc@FreeBSD.org> |
In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock cannot possibly occur.
|
#
1ba0023e |
|
08-Nov-2005 |
Yoshihiro Takahashi <nyan@FreeBSD.org> |
Fix pc98 build.
|
#
7a35a21e |
|
09-Nov-2005 |
Alan Cox <alc@FreeBSD.org> |
Reimplement the reclamation of PV entries. Specifically, perform reclamation synchronously from get_pv_entry() instead of asynchronously as part of the page daemon. Additionally, limit the reclamation to inactive pages unless allocation from the PV entry zone or reclamation from the inactive queue fails. Previously, reclamation destroyed mappings to both inactive and active pages. get_pv_entry() still, however, wakes up the page daemon when reclamation occurs. The reason being that the page daemon may move some pages from the active queue to the inactive queue, making some new pages available to future reclamations. Print the "reclaiming PV entries" message at most once per minute, but don't stop printing it after the fifth time. This way, we do not give the impression that the problem has gone away. Reviewed by: tegge
|
#
51ef421d |
|
08-Nov-2005 |
Warner Losh <imp@FreeBSD.org> |
Add support for XBOX to the FreeBSD port. The xbox architecture is nearly identical to wintel/ia32, with a couple of tweaks. Since it is so similar to ia32, it is optionally added to a i386 kernel. This port is preliminary, but seems to work well. Further improvements will improve the interaction with syscons(4), port the Linux nforce driver, and support future versions of the xbox. This supports the 64MB and 128MB boxes. You'll need the most recent CVS version of Cromwell (the Linux BIOS for the XBOX) to boot. Rink will be maintaining this port, and is interested in feedback. He's set up a website http://xbox-bsd.nl to report the latest developments. Any silly mistakes are my fault. Submitted by: Rink P.W. Springer rink at stack dot nl and Ed Schouten ed at fxq dot nl
|
#
e9cb1037 |
|
04-Nov-2005 |
Alan Cox <alc@FreeBSD.org> |
Begin and end the initialization of pvzone in pmap_init(). Previously, pvzone's initialization was split between pmap_init() and pmap_init2(). This split initialization was the underlying cause of some UMA panics during initialization. Specifically, if the UMA boot pages was exhausted before the pvzone was fully initialized, then UMA, through no fault of its own, would use an inappropriate back-end allocator leading to a panic. (Previously, as a workaround, we have increased the UMA boot pages.) Fortunately, there is no longer any reason that pvzone's initialization cannot be completed in pmap_init(). Eliminate a check for whether pv_entry_high_water has been initialized or not from get_pv_entry(). Since pvzone's initialization is completed in pmap_init(), this check is no longer needed. Use cnt.v_page_count, the actual count of available physical pages, instead of vm_page_array_size to compute the maximum number of pv entries. Introduce the vm.pmap.pv_entries tunable on alpha and ia64. Eliminate some unnecessary white space. Discussed with: tegge (item #1) Tested by: marcel (ia64)
|
#
f7118bdf |
|
31-Oct-2005 |
Alan Cox <alc@FreeBSD.org> |
Instead of panic()ing in pmap_insert_entry() if get_pv_entry() fails, reclaim a pv entry by destroying a mapping to an inactive page. Change the format strings in many of the assertions that were recently converted from PMAP_DIAGNOSTIC printf()s so that they are compatible with PAE. Avoid unnecessary differences between the amd64 and i386 format strings.
|
#
6fb8d0e3 |
|
30-Oct-2005 |
Alan Cox <alc@FreeBSD.org> |
Replace diagnostic printf()s by assertions. Use consistent style for similar assertions.
|
#
8d228514 |
|
21-Oct-2005 |
Ade Lovett <ade@FreeBSD.org> |
Specifically panic() in the case where pmap_insert_entry() fails to get a new pv under high system load where the available pv entries have been exhausted before the pagedaemon has a chance to wake up to reclaim some. Prior to this, the NULL pointer dereference ended up causing secondary panics with rather less than useful resulting tracebacks. Reviewed by: alc, jhb MFC after: 1 week
|
#
3be99ffc |
|
04-Sep-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary TLB invalidations by pmap_enter(). Specifically, eliminate TLB invalidations when permissions are relaxed, such as when a read-only mapping is changed to a read/write mapping. Additionally, eliminate TLB invalidations when bits that are ignored by the hardware, such as PG_W ("wired mapping"), are changed. Reviewed by: tegge
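The rule can be expressed as a predicate over the old and new PTE bits: a flush is only needed when a previously valid translation loses a hardware-visible right. A simplified sketch (PG_W is the software wired bit; the real pmap_enter() handles PG_M/PG_A transitions with more care, so this predicate is only an approximation):

```c
#include <stdint.h>

#define PG_V  0x001u /* valid */
#define PG_RW 0x002u /* writable */
#define PG_A  0x020u /* accessed (hardware-set) */
#define PG_M  0x040u /* modified (hardware-set) */
#define PG_W  0x200u /* software "wired" bit, ignored by the MMU */

/* Hypothetical predicate: invalidate only if the old PTE was valid
 * and some hardware-interpreted bit is being removed. Relaxing
 * permissions (RO -> RW) or toggling software bits needs no flush,
 * because no stale translation can grant more than the new PTE does. */
static int pte_needs_invalidate(uint32_t oldpte, uint32_t newpte)
{
    if ((oldpte & PG_V) == 0)
        return 0;                        /* nothing could be cached */
    uint32_t hw_old = oldpte & ~(PG_W | PG_M | PG_A);
    uint32_t hw_new = newpte & ~(PG_W | PG_M | PG_A);
    return (hw_old & ~hw_new) != 0;      /* a right was taken away */
}
```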
|
#
ba8bca61 |
|
03-Sep-2005 |
Alan Cox <alc@FreeBSD.org> |
Pass a value of type vm_prot_t to pmap_enter_quick() so that it can determine whether the mapping should permit execute access.
|
#
f564b2d2 |
|
27-Aug-2005 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 revision 1.526 When pmap_allocpte() destroys a 2/4MB "superpage" mapping it does not reduce the pmap's resident count accordingly. It should.
|
#
96e51094 |
|
14-Aug-2005 |
Alan Cox <alc@FreeBSD.org> |
Simplify the page table page reference counting by pmap_enter()'s change of mapping case. Eliminate a stale comment from pmap_enter(). Reviewed by: tegge
|
#
50b33450 |
|
11-Aug-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unneeded diagnostic code. Eliminate an unused #include. (Kernel stack allocation and deallocation long ago migrated to the machine-independent code.)
|
#
b69dd0fd |
|
11-Aug-2005 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unneeded diagnostic code. Reviewed by: tegge
|
#
8e7a85fa |
|
10-Aug-2005 |
Alan Cox <alc@FreeBSD.org> |
Decouple the unrefing of a page table page from the removal of a pv entry. In other words, change pmap_remove_entry() such that it no longer unrefs the page table page. Now, it only removes the pv entry. Reviewed by: tegge
|
#
5f2c46d5 |
|
07-Aug-2005 |
Alan Cox <alc@FreeBSD.org> |
When support for 2MB/4MB pages was added in revision 1.148 an error was made in pmap_protect(): The pmap's resident count should not be reduced unless mappings are removed. The errant change to the pmap's resident count could result in a later pmap_remove() failing to remove any mappings if the errant change has set the pmap's resident count to zero.
|
#
bca79029 |
|
29-Jul-2005 |
John Baldwin <jhb@FreeBSD.org> |
Fix a bug in pmap_protect() in the PAE case where it would try to look up the vm_page_t associated with a pte using only the lower 32-bits of the pte instead of the full 64-bits. Submitted by: Greg Taleck greg at isilon dot com Reviewed by: jeffr, alc MFC after: 3 days
|
#
60baed37 |
|
02-Jul-2005 |
Xin LI <delphij@FreeBSD.org> |
Remove the CPU_ENABLE_SSE option from the i386 and pc98 architectures, as it has already been the default for I686_CPU for almost 3 years, and CPU_DISABLE_SSE always disables it. On the other hand, CPU_ENABLE_SSE does not work for I486_CPU and I586_CPU. This commit: - Removes the option from conf/options.* - Removes the option and comments from MD NOTES files - Simplifies the CPU_ENABLE_SSE ifdef's so they don't deal with CPU_ENABLE_SSE from kernel configuration. (*) For most users, this commit should be largely a no-op. If you used to place CPU_ENABLE_SSE into your kernel configuration for some reason, it is time to remove it. (*) The ifdef's of CPU_ENABLE_SSE are not removed at this point, since we need to change it to !defined(CPU_DISABLE_SSE) && defined(I686_CPU), not just !defined(CPU_DISABLE_SSE), if we really want to do so. Discussed on: -arch Approved by: re (scottl)
|
#
1c245ae7 |
|
09-Jun-2005 |
Alan Cox <alc@FreeBSD.org> |
Introduce a procedure, pmap_page_init(), that initializes the vm_page's machine-dependent fields. Use this function in vm_pageq_add_new_page() so that the vm_page's machine-dependent and machine-independent fields are initialized at the same time. Remove code from pmap_init() for initializing the vm_page's machine-dependent fields. Remove stale comments from pmap_init(). Eliminate the Boolean variable pmap_initialized from the alpha, amd64, i386, and ia64 pmap implementations. Its use is no longer required because of the above changes and earlier changes that result in physical memory that is being mapped at initialization time being mapped without pv entries. Tested by: cognet, kensmith, marcel
|
#
fb1b26da |
|
05-Feb-2005 |
Alan Cox <alc@FreeBSD.org> |
Implement proper handling of PG_G mappings in pmap_protect(). (I don't believe that this omission mattered before the introduction of MemGuard.) Reviewed by: tegge@ MFC after: 1 week
|
#
1f70d622 |
|
23-Dec-2004 |
Alan Cox <alc@FreeBSD.org> |
Modify pmap_enter_quick() so that it expects the page queues to be locked on entry and it assumes the responsibility for releasing the page queues lock if it must sleep. Remove a bogus comment from pmap_enter_quick(). Using the first change, modify vm_map_pmap_enter() so that the page queues lock is acquired and released once, rather than each time that a page is mapped.
|
#
85f5b245 |
|
15-Dec-2004 |
Alan Cox <alc@FreeBSD.org> |
In the common case, pmap_enter_quick() completes without sleeping. In such cases, the busying of the page and the unlocking of the containing object by vm_map_pmap_enter() and vm_fault_prefault() is unnecessary overhead. To eliminate this overhead, this change modifies pmap_enter_quick() so that it expects the object to be locked on entry and it assumes the responsibility for busying the page and unlocking the object if it must sleep. Note: alpha, amd64, i386 and ia64 are the only implementations optimized by this change; arm, powerpc, and sparc64 still conservatively busy the page and unlock the object within every pmap_enter_quick() call. Additionally, this change is the first case where we synchronize access to the page's PG_BUSY flag and busy field using the containing object's lock rather than the global page queues lock. (Modifications to the page's PG_BUSY flag and busy field have asserted both locks for several weeks, enabling an incremental transition.)
|
#
8b902508 |
|
06-Dec-2004 |
Stephan Uphoff <ups@FreeBSD.org> |
Move reading the current CPU mask in pmap_lazyfix() to where the thread is protected from migrating to another CPU. Approved by: sam (mentor) MFC after: 4 weeks
|
#
4878c3cd |
|
01-Dec-2004 |
Alan Cox <alc@FreeBSD.org> |
For efficiency move the call to pmap_pte_quick() out of pmap_protect()'s and pmap_remove()'s inner loop. Reviewed by: peter@, tegge@
|
#
6004362e |
|
26-Nov-2004 |
David Schultz <das@FreeBSD.org> |
Don't include sys/user.h merely for its side-effect of recursively including other headers.
|
#
2d68e3fb |
|
16-Nov-2004 |
John Baldwin <jhb@FreeBSD.org> |
Initiate deorbit burn sequence for 80386 support in FreeBSD: Remove 80386 (I386_CPU) support from the kernel.
|
#
7c76d642 |
|
29-Oct-2004 |
Alan Cox <alc@FreeBSD.org> |
Implement per-CPU SYSMAPs, i.e., CADDR* and CMAP*, to reduce lock contention within pmap_zero_page() and pmap_copy_page().
|
#
aced26ce |
|
08-Oct-2004 |
Alan Cox <alc@FreeBSD.org> |
Make pte_load_store() an atomic operation in all cases, not just i386 PAE. Restructure pmap_enter() to prevent the loss of a page modified (PG_M) bit in a race between processors. (This restructuring assumes the newly atomic pte_load_store() for correct operation.) Reviewed by: tegge@ PR: i386/61852
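In C11 terms, an atomic pte_load_store() is a single exchange rather than a separate load and store, so another processor's PG_M update can never land between the two. A minimal user-space analogue (the kernel uses inline assembly, e.g. cmpxchg8b for the 64-bit PAE PTEs; this sketch only illustrates the semantics):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sketch: atomically install newpte and return the old PTE value,
 * including any PG_M/PG_A bits the hardware set concurrently. */
static uint32_t pte_load_store(_Atomic uint32_t *pte, uint32_t newpte)
{
    return atomic_exchange(pte, newpte);
}
```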
|
#
caa665aa |
|
03-Oct-2004 |
Alan Cox <alc@FreeBSD.org> |
Undo revision 1.251. This change was a performance pessimizing work-around that is no longer required. (In fact, it is not clear that it was ever required in HEAD or RELENG_4; only RELENG_3 required a work-around.) Now, as before revision 1.251, if the preexisting PTE is invalid, pmap_enter() does not call pmap_invalidate_page() to update the TLB(s). Note: Even with this change, the handling of a copy-on-write fault is inefficient; in such cases pmap_enter() calls pmap_invalidate_page() twice. Discussed with: bde@ PR: kern/16568
|
#
8ceb3dcb |
|
02-Oct-2004 |
Alan Cox <alc@FreeBSD.org> |
The physical address stored in the vm_page is page aligned. There is no need to mask off the page offset bits. (This operation made some sense prior to i386/i386/pmap.c revision 1.254 when we passed a physical address rather than a vm_page pointer to pmap_enter().)
|
#
07b33039 |
|
02-Oct-2004 |
Alan Cox <alc@FreeBSD.org> |
Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These uses predate the change in the pmap_enter() interface that replaced the page's physical address by the address of its vm_page structure. The PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page structure that was being passed in.
|
#
0a752e98 |
|
29-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Prevent the unexpected deallocation of a page table page while performing pmap_copy(). This entails additional locking in pmap_copy() and the addition of a "flags" parameter to the page table page allocator for specifying whether it may sleep when memory is unavailable. (Already, pmap_copy() checks the availability of memory, aborting if it is scarce. In theory, another CPU could, however, allocate memory between pmap_copy()'s check and the call to the page table page allocator, causing the current thread to release its locks and sleep. This change makes this scenario impossible.) Reviewed by: tegge@
|
#
a9711396 |
|
21-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Correct a long-standing error in _pmap_unwire_pte_hold() affecting multiprocessors. Specifically, the error is conditioning the call to pmap_invalidate_page() on whether the pmap is active on the current CPU. This call must be unconditional. Regardless of whether the pmap is active on the CPU performing _pmap_unwire_pte_hold(), it could be active on another CPU. For example, a call to pmap_remove_all() by the page daemon could result in a call to _pmap_unwire_pte_hold() with the pmap inactive on the current CPU and active on another CPU. In such circumstances, failing to call pmap_invalidate_page() results in a stale TLB entry on the other CPU that still maps the now deallocated page table page. What happens next is typically a mysterious panic in pmap_enter() by the other CPU, either "pmap_enter: attempted pmap_enter on 4MB page" or "pmap_enter: pte vanished, va: 0x%lx". Both occur because the former page table page has been recycled and allocated to a new purpose. Consequently, it no longer contains zeroes. See also Peter's i386/i386/pmap.c revision 1.448 and the related e-mail thread last year. Many thanks to the engineers at Sandvine for providing clear and concise information until all of the pieces of the puzzle fell into place and for testing an earlier patch. MT5 Candidate
|
#
de6c3db0 |
|
19-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Simplify the reference counting of page table pages. Specifically, use the page table page's wired count rather than its hold count to contain the reference count. My rationale for this change is based on several factors: 1. The machine-independent and pmap layers used the same hold count field in subtly different ways. The machine-independent layer uses the hold count to implement a form of ephemeral wiring that is used by pipes, physio, etc. In other words, subsystems where we wish to temporarily block a page from being swapped out while it is mapped into the kernel's address space. Such pages are never removed from the page queues. Instead, the page daemon recognizes a non-zero hold count to mean "hands off this page." In contrast, page table pages are never in the page queues; they are wired from birth to death. The hold count was being used as a kind of reference count, specifically, the number of valid page table entries within the page. Not surprisingly, these two different uses imply different synchronization rules: in the machine- independent layer access to the hold count requires the page queues lock; whereas in the pmap layer the pmap lock is required. Thus, continued use by the pmap layer of vm_page_unhold(), which asserts that the page queues lock is held, made no sense. 2. _pmap_unwire_pte_hold() was too forgiving in its handling of the wired count. An unexpected wired count on a page table page was ignored and the underlying page leaked. 3. In a word, microoptimization. Using the wired count exclusively, rather than a combination of the wired and hold counts, makes the code slightly smaller and faster. Reviewed by: tegge@
|
#
8478ea24 |
|
18-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove an outdated assertion from _pmap_allocpte(). (When vm_page_alloc() succeeds, the page's queue field is unconditionally set to PQ_NONE by vm_pageq_remove_nowakeup().)
|
#
7580b56b |
|
18-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Release the page queues lock earlier in pmap_protect() and pmap_remove() in order to reduce contention.
|
#
031102cc |
|
12-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Use an atomic op to update the pte in pmap_protect(). This is to prevent the loss of a page modified (PG_M) bit in a race between processors. Quoting Tor: One scenario where the old code could cause a lost PG_M bit is a multithreaded linux program (or FreeBSD program using the linuxthreads port) where one thread was starting a subprocess. The thread doing fork() would call vmspace_fork(), which would then call vm_map_copy_entry() which would call pmap_protect() on an area possibly accessed by other threads. Additionally, make the clearing of PG_M by pmap_protect() unconditional if write permission is removed. Previously, PG_M could persist on a read-only unmanaged page. That seems inconsistent and confusing. In collaboration with: tegge@ MT5 candidate PR: 61852
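The lost-PG_M race is closed with a compare-and-swap loop: if another CPU sets PG_M between the read and the write, the CAS fails and the loop retries against the fresh value. A user-space sketch of the pattern (names and constants illustrative; the real code operates on live page table entries):

```c
#include <stdatomic.h>
#include <stdint.h>

#define PG_RW 0x002u /* writable */
#define PG_M  0x040u /* modified (hardware-set) */

/* Sketch: remove write permission without losing a concurrently-set
 * PG_M. PG_M is cleared unconditionally here because the old value is
 * returned, so the caller can still dirty the vm_page if it was set. */
static uint32_t pte_remove_write(_Atomic uint32_t *pte)
{
    uint32_t obits, nbits;
    do {
        obits = atomic_load(pte);            /* re-read each attempt */
        nbits = obits & ~(PG_RW | PG_M);
    } while (!atomic_compare_exchange_weak(pte, &obits, nbits));
    return obits;  /* caller checks obits & PG_M */
}
```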
|
#
1e7fad6b |
|
11-Sep-2004 |
Scott Long <scottl@FreeBSD.org> |
Revert the previous round of changes to td_pinned. The scheduler isn't fully initialized when the pmap layer tries to call sched_pin() early in the boot and results in a quick panic. Use ke_pinned instead as was originally done with Tor's patch. Approved by: julian
|
#
5c854acc |
|
10-Sep-2004 |
Julian Elischer <julian@FreeBSD.org> |
Make up my mind if cpu pinning is stored in the thread structure or the scheduler specific extension to it. Put it in the extension as the implementation details of how the pinning is done needn't be visible outside the scheduler. Submitted by: tegge (of course!) (with changes) MFC after: 3 days
|
#
e232eb82 |
|
08-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Use atomic ops in pmap_clear_ptes() to prevent SMP races that could result in the loss of an accessed or modified bit from the pte. In collaboration with: tegge@ MT5 candidate
|
#
3c3e8d11 |
|
01-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Correction to the previous revision: I forgot to apply the one's complement to a constant. This didn't show in testing because the broken expression produced the same result in my tests as the correct expression.
|
#
e33353b5 |
|
01-Sep-2004 |
Alan Cox <alc@FreeBSD.org> |
Modify pmap_pte() to support its use on non-current, non-kernel pmaps without holding Giant.
|
#
bfa15df9 |
|
29-Aug-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove unnecessary check for curthread == NULL.
|
#
dd68efd0 |
|
27-Aug-2004 |
David E. O'Brien <obrien@FreeBSD.org> |
s/smp_rv_mtx/smp_ipi_mtx/g Requested by: jhb
|
#
8991a235 |
|
27-Aug-2004 |
Alan Cox <alc@FreeBSD.org> |
The machine-independent parts of the virtual memory system always pass a valid pmap to the pmap functions that require one. Remove the checks for NULL. (These checks have their origins in the Mach pmap.c that was integrated into BSD. None of the new code written specifically for FreeBSD included them.)
|
#
f1009e1e |
|
23-Aug-2004 |
Peter Wemm <peter@FreeBSD.org> |
Commit Doug White and Alan Cox's fix for the cross-ipi smp deadlock. We were obtaining different spin mutexes (which disable interrupts after acquisition) and spin waiting for delivery. For example, KSE processes do LDT operations which use smp_rendezvous, while other parts of the system are doing things like tlb shootdowns with a different mutex. This patch uses the common smp_rendezvous mutex for all MD home-grown IPIs that spinwait for delivery. Having the single mutex means that the spinloop to acquire it will enable interrupts periodically, thus avoiding the cross-ipi deadlock. Obtained from: dwhite, alc Reviewed by: jhb
|
#
a9cb79ba |
|
07-Aug-2004 |
Alan Cox <alc@FreeBSD.org> |
With the advent of pmap locking it makes sense for pmap_copy() to be less forgiving about inconsistencies in the source pmap. Also, remove a new- line character terminating a nearby panic string.
|
#
0e5a07e5 |
|
04-Aug-2004 |
John Baldwin <jhb@FreeBSD.org> |
Remove a potential deadlock on i386 SMP by changing the lazypmap ipi and spin-wait code to use the same spin mutex (smp_tlb_mtx) as the TLB ipi and spin-wait code snippets so that you can't get into the situation of one CPU doing a TLB shootdown to another CPU that is doing a lazy pmap shootdown each of which are waiting on each other. With this change, only one of the CPUs would do an IPI and spin-wait at a time.
|
#
1b3b9cfe |
|
04-Aug-2004 |
Alan Cox <alc@FreeBSD.org> |
Post-locking clean up/simplification, particularly, the elimination of vm_page_sleep_if_busy() and the page table page's busy flag as a synchronization mechanism on page table pages. Also, relocate the inline pmap_unwire_pte_hold() so that it can be used to shorten _pmap_unwire_pte_hold() on alpha and amd64. This places pmap_unwire_pte_hold() next to a comment that more accurately describes it than _pmap_unwire_pte_hold().
|
#
c6bf9f04 |
|
31-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Add pmap locking to pmap_object_init_pt().
|
#
a0879143 |
|
29-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Advance the state of pmap locking on alpha, amd64, and i386. - Enable recursion on the page queues lock. This allows calls to vm_page_alloc(VM_ALLOC_NORMAL) and UMA's obj_alloc() with the page queues lock held. Such calls are made to allocate page table pages and pv entries. - The previous change enables a partial reversion of vm/vm_page.c revision 1.216, i.e., the call to vm_page_alloc() by vm_page_cowfault() now specifies VM_ALLOC_NORMAL rather than VM_ALLOC_INTERRUPT. - Add partial locking to pmap_copy(). (As a side-effect, pmap_copy() should now be faster on i386 SMP because it no longer generates IPIs for TLB shootdown on the other processors.) - Complete the locking of pmap_enter() and pmap_enter_quick(). (As of now, all changes to a user-level pmap on alpha, amd64, and i386 are performed with appropriate locking.)
|
#
e1021dde |
|
21-Jul-2004 |
Olivier Houchard <cognet@FreeBSD.org> |
Using NULL as a malloc type when calling contigmalloc() is wrong, so introduce a new malloc type, and use it.
|
#
aec86de4 |
|
18-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Utilize pmap_pte_quick() rather than pmap_pte() in pmap_protect(). The reason is that pmap_pte_quick() requires the page queues lock, which is already held, rather than Giant.
|
#
b73cfbb3 |
|
17-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Remedy my omission of one change in the previous revision: pmap_remove() must pin the current thread in order to call pmap_pte_quick().
|
#
c9829537 |
|
17-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
- Utilize pmap_pte_quick() rather than pmap_pte() in pmap_remove() and pmap_remove_page(). The reason is that pmap_pte_quick() requires the page queues lock, which is already held, rather than Giant. - Assert that the page queues lock is held in pmap_remove_page() and pmap_remove_pte().
|
#
3d2e54c3 |
|
15-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Push down the acquisition and release of the page queues lock into pmap_protect() and pmap_remove(). In general, they require the lock in order to modify a page's pv list or flags. In some cases, however, pmap_protect() can avoid acquiring the lock.
|
#
ce8da309 |
|
12-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Push down the acquisition and release of the page queues lock into pmap_remove_pages(). (The implementation of pmap_remove_pages() is optional. If pmap_remove_pages() is unimplemented, the acquisition and release of the page queues lock is unnecessary.) Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
|
#
26a96556 |
|
07-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Simplify the control flow in pmap_extract(), enabling the elimination of a PMAP_UNLOCK() call.
|
#
03c0ca74 |
|
06-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
White space and style changes only.
|
#
c9a217d2 |
|
05-Jul-2004 |
Alan Cox <alc@FreeBSD.org> |
Style changes to pmap_extract().
|
#
1b74731b |
|
29-Jun-2004 |
Peter Wemm <peter@FreeBSD.org> |
Fix leftover argument to pmap_unuse_pt(). I committed the wrong diff. Submitted by: Jon Noack <noackjr@alumni.rice.edu>
|
#
654bd0e8 |
|
29-Jun-2004 |
Peter Wemm <peter@FreeBSD.org> |
Reduce the size of pv entries by 15%. This saves 1MB of KVA for mapping pv entries per 1GB of user virtual memory. (eg: if a 1GB file was mmapped into 30 processes, that would theoretically reduce the KVA demand by 30MB for pv entries. In reality though, we limit pv entries so we don't have that many at once.) We used to store the vm_page_t for the page table page. But we recently had the pa of the ptp, or can calculate it fairly quickly. If we wanted to avoid the shift/mask operation in pmap_pde(), we could recover the pa but that means we have to store it for a while. This does not measurably change performance. Suggested by: alc Tested by: alc
|
#
df68e345 |
|
26-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
In case pmap_extract_and_hold() is ever performed on a different pmap than the current one, we need to pin the current thread to its CPU. Submitted by: tegge@
|
#
9eb31321 |
|
22-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
Implement the protection check required by the pmap_extract_and_hold() specification. This enables the elimination of Giant from that function. Reviewed by: tegge@
|
#
dc8beb53 |
|
20-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
- Simplify pmap_remove_pages(), eliminating unnecessary indirection. - Simplify the locking of pmap_is_modified() by converting control flow to data flow.
|
#
1ec4b759 |
|
20-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
Add pmap locking to pmap_is_prefaultable().
|
#
785f2cdf |
|
19-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove unused pt_entry_ts. Remove an unneeded semicolon.
|
#
d45f21f3 |
|
17-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
Do not preset PG_BUSY on VM_ALLOC_NOOBJ pages. Such pages are not accessible through an object. Thus, PG_BUSY serves no purpose.
|
#
4d831945 |
|
16-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 Introduce pmap locking to many of the pmap functions.
|
#
1e82a3d1 |
|
15-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
MFamd64 Remove dead or unneeded code, e.g., spl calls.
|
#
7b9d4744 |
|
15-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove a stale comment.
|
#
7881f950 |
|
13-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
Prevent the loss of a PG_M bit through an SMP race in pmap_ts_referenced().
|
#
2d0dc0fc |
|
12-Jun-2004 |
Alan Cox <alc@FreeBSD.org> |
In a multiprocessor, the PG_W bit in the pte must be changed atomically. Otherwise, the setting of the PG_M bit by one processor could be lost if another processor is simultaneously changing the PG_W bit. Reviewed by: tegge@
|
#
662d471d |
|
28-May-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove a broken micro-optimization from pmap_enter(). The ill effect of this micro-optimization occurs when we call pmap_enter() to wire an already mapped page. Because of the micro-optimization, we fail to mark the PTE as wired. Later, on teardown of the address space, pmap_remove_pages() destroys the PTE before vm_fault_unwire() has unwired the page. (pmap_remove_pages() is not supposed to destroy wired PTEs. They are destroyed by a later call to pmap_remove().) Thus, the page becomes lost. Note: The page is not lost if the application called munlock(2), only if it relies on teardown of the address space to unwire its pages. For the historically inclined, this bug was introduced by a megacommit, revision 1.182, roughly six years ago. Leak observed by: green@ and dillon independently Patch submitted by: dillon at backplane dot com Reviewed by: tegge@ MFC after: 1 week
|
#
0af0eeac |
|
10-Apr-2004 |
Alan Cox <alc@FreeBSD.org> |
- pmap_kenter_temporary()'s first parameter, which is a physical address, should be declared as vm_paddr_t not vm_offset_t.
|
#
c8607538 |
|
04-Apr-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove avail_start on those platforms that no longer use it. (Only amd64 does anything with it beyond simple initialization.)
|
#
bdb93eb2 |
|
04-Apr-2004 |
Alan Cox <alc@FreeBSD.org> |
Remove unused arguments from pmap_init().
|
#
fcffa790 |
|
07-Mar-2004 |
Alan Cox <alc@FreeBSD.org> |
Retire pmap_pinit2(). Alpha was the last platform that used it. However, ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps, there has been no reason for having this function on Alpha. Briefly, when pmap_growkernel() relied upon the list of all processes to find and update the various pmaps to reflect a growth in the kernel's valid address space, pmap_pinit2() served to avoid a race between pmap initialization and pmap_growkernel(). Specifically, pmap_pinit2() was responsible for initializing the kernel portions of the pmap and pmap_pinit2() was called after the process structure contained a pointer to the new pmap for use by pmap_growkernel(). Thus, an update to the kernel's address space might be applied to the new pmap unnecessarily, but an update would never be lost.
|
#
8600a412 |
|
01-Feb-2004 |
Alan Cox <alc@FreeBSD.org> |
Eliminate all TLB shootdowns by pmap_pte_quick(): By temporarily pinning the thread that calls pmap_pte_quick() and by virtue of the page queues lock being held, we can manage PADDR1/PMAP1 as a CPU private mapping. The most common effect of this change is to reduce the overhead of the page daemon on multiprocessors. In collaboration with: tegge
|
#
f67fdc73 |
|
25-Jan-2004 |
Jeff Roberson <jeff@FreeBSD.org> |
- Now that both schedulers support temporary cpu pinning use this rather than the switchin functions to guarantee that we're operating with the correct tlb entry. - Remove the post copy/zero tlb invalidations. It is faster to invalidate an entry that is known to exist and so it is faster to invalidate after use. However, some architectures implement speculative page table prefetching so we can not be guaranteed that the invalidated entry is still invalid when we re-enter any of these functions. As a result of this we must always invalidate before use to be safe.
|
#
7fac6d07 |
|
10-Jan-2004 |
Alan Cox <alc@FreeBSD.org> |
Include "opt_cpu.h" and related #ifdef's for SSE so that pagezero() actually includes the call to sse2_pagezero().
|
#
a41c6c2a |
|
28-Dec-2003 |
Alan Cox <alc@FreeBSD.org> |
Don't bother clearing PG_ZERO on the page table page in _pmap_allocpte(); it serves no purpose.
|
#
8eaddab1 |
|
27-Dec-2003 |
Alan Cox <alc@FreeBSD.org> |
Don't bother clearing and setting PG_BUSY on page table directory pages.
|
#
925692ca |
|
21-Dec-2003 |
Alan Cox <alc@FreeBSD.org> |
- Significantly reduce the number of preallocated pv entries in pmap_init(). Such a large preallocation is unnecessary and wastes nearly eight megabytes of kernel virtual address space per gigabyte of managed physical memory. - Increase UMA_BOOT_PAGES by two. This enables the removal of pmap_pv_allocf(). (Note: this function was only used during initialization, specifically, after pmap_init() but before pmap_init2(). During pmap_init2(), a new allocator is installed.)
|
#
1c0e8644 |
|
18-Dec-2003 |
John Baldwin <jhb@FreeBSD.org> |
MFamd64: Remove i386_protection_init() and the protection_codes[] array and replace them with a simple if test to turn on PG_RW. i386 != vax.
|
#
5e4a2fc9 |
|
07-Nov-2003 |
Alan Cox <alc@FreeBSD.org> |
- Similar to post-PAE RELENG_4 split pmap_pte_quick() into two cases, pmap_pte() and pmap_pte_quick(). The distinction is based upon the locks that are held by the caller. When the given pmap is not the current pmap, pmap_pte() should be used when Giant is held and pmap_pte_quick() should be used when the vm page queues lock is held. - When assigning to PMAP1 or PMAP2, include PG_A and PG_M. - Reenable the inlining of pmap_is_current(). In collaboration with: tegge
|
#
147ad8d5 |
|
03-Nov-2003 |
John Baldwin <jhb@FreeBSD.org> |
New i386 SMP code: - The MP code no longer knows anything specific about an MP Table. Instead, the local APIC code adds CPUs via the cpu_add() function when a local APIC is enumerated by an APIC enumerator. - Don't divide the argument to mp_bootaddress() by 1024 just so that we can turn around and multiply it by 1024 again. - We no longer panic if SMP is enabled but we are booted on a UP machine. - init_secondary(), the asm code between init_secondary() and ap_init() in mpboot.s and ap_init() have all been merged together in C into init_secondary(). - We now use the cpuid feature bits to determine if we should enable PSE, PGE, or VME on each AP. - Due to the change in the implementation of critical sections, acquire the SMP TLB mutex around a slightly larger chunk of code for TLB shootdowns. - Remove some of the debug code from the original SMP implementation that is no longer used or no longer applies to the new APIC code. - Use a temporary hack to disable the ACPI module until the SMP code has been further reorganized to allow ACPI to work as a module again. - Add a DDB command to dump the interesting contents of the IDT.
|
#
cd7ccabe |
|
31-Oct-2003 |
John Baldwin <jhb@FreeBSD.org> |
For physical address regions between 0 and KERNLOAD, allow pmap_mapdev() to use the direct mapped KVA at KERNBASE to service the request. This also allows pmap_mapdev() to be used for such addresses very early during the boot process and might provide some small savings on KVA. Reviewed by: peter
|
#
d49aa135 |
|
30-Oct-2003 |
Peter Wemm <peter@FreeBSD.org> |
Change the pmap_invalidate_xxx() functions so they test against pmap == kernel_pmap rather than pmap->pm_active == -1. gcc's inliner can remove more code that way. Only kernel_pmap has a pm_active of -1.
|
#
2e81e660 |
|
27-Oct-2003 |
John Baldwin <jhb@FreeBSD.org> |
Fix pmap_unmapdev() to call pmap_kremove() instead of implementing it directly so that it more closely mirrors pmap_mapdev() which calls pmap_kenter().
|
#
f3075be8 |
|
25-Oct-2003 |
Peter Wemm <peter@FreeBSD.org> |
For the SMP case, flush the TLB at the beginning of the page zero/copy routines. Otherwise we run into trouble with speculative tlb preloads on SMP systems. This effectively defeats Jeff's revision 1.438 optimization (for his pentium4-M laptop) in the SMP case. It breaks other systems, particularly athlon-MP's.
|
#
b09a77f5 |
|
24-Oct-2003 |
Peter Wemm <peter@FreeBSD.org> |
GC workaround code for detecting pentium4's and disabling PSE and PG_G. It's been ifdef'ed out for ages.
|
#
6f366f87 |
|
14-Oct-2003 |
Peter Wemm <peter@FreeBSD.org> |
Get some more data if we hit the pmap_enter() thing.
|
#
50ac3f99 |
|
12-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
- Modify pmap_is_current() to return FALSE when a pmap's page table is in use because a kernel thread is borrowing it. The borrowed page table can change spontaneously, making any dependence on its continued use subject to a race condition. - _pmap_unwire_pte_hold() cannot use pmap_is_current(): If a change is made to a page table page mapping for a borrowed page table, the TLB must be updated. In collaboration with: tegge
|
#
9b993f82 |
|
12-Oct-2003 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Initialize CMAP3 to 0
|
#
ab87e2fb |
|
04-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
Don't bother setting a page table page's valid field. It is unused and not setting it is consistent with other uses of VM_ALLOC_NOOBJ pages.
|
#
47804290 |
|
04-Oct-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- The proper test is CPU_ENABLE_SSE and not CPU_ENABLED_SSE. This effectively disabled the sse2_pagezero() code. Spotted by: bde
|
#
566526a9 |
|
03-Oct-2003 |
Alan Cox <alc@FreeBSD.org> |
Migrate pmap_prefault() into the machine-independent virtual memory layer. A small helper function pmap_is_prefaultable() is added. This function encapsulates the few lines of pmap_prefault() that actually vary from machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have much in common. Going forward, it's worth considering their merger.
|
#
6ccf265b |
|
01-Oct-2003 |
Peter Wemm <peter@FreeBSD.org> |
Commit Bosko's patch to clean up the PSE/PG_G initialization to and avoid problems with some Pentium 4 cpus and some older PPro/Pentium2 cpus. There are several problems, some documented in Intel errata. This patch: 1) moves the kernel to the second page in the PSE case. There is an errata that says that you Must Not point a 4MB page at physical address zero on older cpus. We avoided bugs here due to sheer luck. 2) sets up PSE page tables right from the start in locore, rather than trying to switch from 4K to 4M (or 2M) pages part way through the boot sequence at the same time that we're messing with PG_G. For some reason, the pmap work over the last 18 months seems to tickle the problems, and the PAE infrastructure changes disturb the cpu bugs even more. A couple of people have reported a problem with APM bios calls during boot. I'll work with people to get this resolved. Obtained from: bmilekic
|
#
1419773d |
|
30-Sep-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Hide more #ifdef logic in a new invlcaddr inline. This function flushes the full tlb if you're on an i386 or does an invlpg otherwise. Glanced at by: peter
|
#
043407f8 |
|
30-Sep-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Define an inline pagezero() to select the appropriate full-page zeroing function from one of bzero, i686_pagezero, or sse2_pagezero. - Use pagezero() in the three pmap functions that need to zero full pages.
|
#
fb9bde2d |
|
30-Sep-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- Correct a problem with the last commit. The CMAP ptes need to be zeroed prior to invalidating the TLB to be certain that the processor doesn't keep a cached copy. Discussed with: pete Paniced: tegge Pointy Hat: The usual spot
|
#
fa3f9daa |
|
30-Sep-2003 |
Jeff Roberson <jeff@FreeBSD.org> |
- On my Pentium4-M laptop, invlpg takes ~1100 cycles if the page is found in the TLB and ~1600 if it is not. Therefore, it is more efficient to invalidate the TLB after operations that use CMAP rather than before. - So that the tlb is invalidated prior to switching off of a processor, we must change the switchin functions to switchout functions. - Remove td_switchout from the thread and move it to the x86 pcb. - Move the code that calls switchout into swtch.s. These changes make this optimization truly x86 specific.
|
#
4487ff65 |
|
26-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
Addendum to the previous revision: If vm_page_alloc() for the page table page fails, perform a VM_WAIT; update some comments in _pmap_allocpte().
|
#
f3fd831c |
|
24-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
- Eliminate the pte object. - Use kmem_alloc_nofault() rather than kmem_alloc_pageable() to allocate KVA space for the page directory page(s). Submitted by: tegge
|
#
be19fdd1 |
|
21-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
Allocate the page table directory page(s) as "no object" pages. (This leaves one explicit use of the pte object.)
|
#
f8363bde |
|
20-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
Reimplement pmap_release() such that it uses the page table rather than the pte object to locate the page table directory pages. (This is another step toward the elimination of the pte object.)
|
#
6d66d714 |
|
13-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
Simplify (and micro-optimize) pmap_unuse_pt(): Only one caller, pmap_remove_pte(), passed NULL instead of the required page table page to pmap_unuse_pt(). Compute the necessary page table page in pmap_remove_pte(). Also, remove some unreachable code from pmap_remove_pte().
|
#
b9850eb2 |
|
12-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
Add a new parameter to pmap_extract_and_hold() that is needed to eliminate Giant from vmapbuf(). Idea from: tegge
|
#
ba2157f2 |
|
07-Sep-2003 |
Alan Cox <alc@FreeBSD.org> |
Introduce a new pmap function, pmap_extract_and_hold(). This function atomically extracts and holds the physical page that is associated with the given pmap and virtual address. Such a function is needed to make the memory mapping optimizations used by, for example, pipes and raw disk I/O MP-safe. Reviewed by: tegge
|
#
a7b60ab2 |
|
25-Aug-2003 |
David E. O'Brien <obrien@FreeBSD.org> |
Fix copyright comment & FBSDID style nits. Requested by: bde
|
#
3c5a69f7 |
|
21-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Eliminate the last (direct) use of vm_page_lookup() on the pte object.
|
#
1e584e46 |
|
19-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Eliminate a possible race condition for multithreaded applications in _pmap_allocpte(): Guarantee that the page table page is zero filled before adding it to the directory. Otherwise, a 2nd, 3rd, etc. thread could access a nearby virtual address and use garbage for the address translation. Discussed with: peter, tegge
|
#
90ca070d |
|
17-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Acquire the pte object's mutex when performing vm_page_grab(). Note: It is my long-term objective to eliminate the pte object. In the near term, this does, however, enable the addition of some vm object locking assertions.
|
#
365b27ea |
|
16-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
In pmap_copy(), since we have the page table page's physical address in hand, use PHYS_TO_VM_PAGE() rather than vm_page_lookup().
|
#
d8df7ab7 |
|
13-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Eliminate pmap_page_lookup() and its uses. Instead, use PHYS_TO_VM_PAGE() to convert the pte's physical address into a vm page. Reviewed by: peter
|
#
ba97fd8a |
|
10-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Rename pmap_changebit() to pmap_clear_ptes() and remove the last parameter. The new name better reflects what the function does and how it is used. The last parameter was always FALSE. Note: In theory, gcc would perform constant propagation and dead code elimination to achieve the same effect as removing the last parameter, which is always FALSE. In practice, recent versions do not. So, there is little point in letting unused code pessimize execution.
|
#
2c2464cb |
|
06-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Correct a mistake in the previous revision: Reduce the scope of the page queues lock such that it isn't held around the call to get_pv_entry(), which calls uma_zalloc(). At the point of the call to get_pv_entry(), the lock isn't necessary and holding it could lead to recursive acquisition, which isn't allowed.
|
#
b0b2803a |
|
06-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
Acquire the page queues lock in pmap_insert_entry(). (I used to believe that the page's busy flag could be relied upon to synchronize access to the pv list. I don't any longer. See, for example, the call to pmap_insert_entry() from pmap_copy().)
|
#
195d68e5 |
|
02-Aug-2003 |
Alan Cox <alc@FreeBSD.org> |
- Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in pmap_mapdev(). See revision 1.140 of kern/sys_pipe.c for a detailed rationale. Submitted by: tegge - Remove GIANT_REQUIRED from pmap_mapdev().
|
#
b053bc84 |
|
30-Jul-2003 |
Bosko Milekic <bmilekic@FreeBSD.org> |
Make sure that when the PV ENTRY zone is created in pmap, that it's created not only with UMA_ZONE_VM but also with UMA_ZONE_NOFREE. In the i386 case in particular, the pmap code would hook a special page allocation routine that allocated from kernel_map and not kmem_map, and so when/if the pageout daemon drained the zones, it could actually push out slabs from the PV ENTRY zone but call UMA's default page_free, which resulted in pages allocated from kernel_map being freed to kmem_map; bad. kmem_free() ignores the return value of the vm_map_delete and just returns. I'm not sure what the exact repercussions could be, but it doesn't look good. In the PAE case on i386, we also set up a zone in pmap, so be conservative for now and make that zone also ZONE_NOFREE and ZONE_VM. Do this for the pmap zones for the other archs too, although in some cases it may not be entirely necessary. We'd rather be safe than sorry at this point. Perhaps all UMA_ZONE_VM zones should by default be also UMA_ZONE_NOFREE? May fix some of silby's crashes on the PV ENTRY zone.
|
#
34621500 |
|
23-Jul-2003 |
Alan Cox <alc@FreeBSD.org> |
Annotate pmap_changebit() as __always_inline. This function was written as a template that when inlined is specialized for the caller through constant value propagation and dead code elimination. Thus, the specialized code that is generated for pmap_clear_reference() et al. avoids several conditional branches inside of a loop.
|
#
e95babf3 |
|
09-Jul-2003 |
Peter Wemm <peter@FreeBSD.org> |
unifdef -DLAZY_SWITCH and start to tidy up the associated glue.
|
#
90a7c7b6 |
|
08-Jul-2003 |
Alan Cox <alc@FreeBSD.org> |
In pmap_object_init_pt(), the pmap_invalidate_all() should be performed on the caller-provided pmap, not the kernel_pmap. Using the kernel_pmap results in an unnecessary IPI for TLB shootdown on SMPs. Reviewed by: jake, peter
|
#
4041408f |
|
04-Jul-2003 |
Alan Cox <alc@FreeBSD.org> |
Add vm object locking to pmap_prefault().
|
#
1f78f902 |
|
03-Jul-2003 |
Alan Cox <alc@FreeBSD.org> |
Background: pmap_object_init_pt() premaps the pages of a object in order to avoid the overhead of later page faults. In general, it implements two cases: one for vnode-backed objects and one for device-backed objects. Only the device-backed case is really machine-dependent, belonging in the pmap. This commit moves the vnode-backed case into the (relatively) new function vm_map_pmap_enter(). On amd64 and i386, this commit only amounts to code rearrangement. On alpha and ia64, the new machine independent (MI) implementation of the vnode case is smaller and more efficient than their pmap-based implementations. (The MI implementation takes advantage of the fact that objects in -CURRENT are ordered collections of pages.) On sparc64, pmap_object_init_pt() hadn't (yet) been implemented.
|
#
dca96f1a |
|
29-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
- Export pmap_enter_quick() to the MI VM. This will permit the implementation of a largely MI pmap_object_init_pt() for vnode-backed objects. pmap_enter_quick() is implemented via pmap_enter() on sparc64 and powerpc. - Correct a mismatch between pmap_object_init_pt()'s prototype and its various implementations. (I plan to keep pmap_object_init_pt() as the MD hook for device-backed objects on i386 and amd64.) - Correct an error in ia64's pmap_enter_quick() and adjust its interface to match the other versions. Discussed with: marcel
|
#
eabd1972 |
|
27-Jun-2003 |
Peter Wemm <peter@FreeBSD.org> |
Tidy up leftover lazy_switch instrumentation that is no longer needed. This cleans up some #ifdef hell.
|
#
b50953cc |
|
27-Jun-2003 |
Peter Wemm <peter@FreeBSD.org> |
Fix the false IPIs on smp when using LAZY_SWITCH caused by pmap_activate() not releasing the pm_active bit in the old pmap.
|
#
0e2a4d3a |
|
14-Jun-2003 |
David Xu <davidxu@FreeBSD.org> |
Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
|
#
49a2507b |
|
14-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
Migrate the thread stack management functions from the machine-dependent to the machine-independent parts of the VM. At the same time, this introduces vm object locking for the non-i386 platforms. Two details: 1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The different machine-dependent implementations used various combinations of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set KSTACK_GUARD_PAGES to 0. 2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In 5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed to vm_page_alloc() or vm_page_grab().
|
#
89f4fca2 |
|
14-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
Move the *_new_altkstack() and *_dispose_altkstack() functions out of the various pmap implementations into the machine-independent vm. They were all identical.
|
#
b26ce6a4 |
|
13-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
Add vm object locking to pmap_object_init_pt().
|
#
8630c117 |
|
12-Jun-2003 |
Alan Cox <alc@FreeBSD.org> |
Add vm object locking to various pagers' "get pages" methods, i386 stack management functions, and a u area management function.
|
#
9676a785 |
|
02-Jun-2003 |
David E. O'Brien <obrien@FreeBSD.org> |
Use __FBSDID().
|
#
14ce5bd4 |
|
28-Apr-2003 |
Jake Burkholder <jake@FreeBSD.org> |
Use inlines for loading and storing page table entries. Use cmpxchg8b for the PAE case to ensure idempotent 64 bit loads and stores. Sponsored by: DARPA, Network Associates Laboratories
|
#
ffad008f |
|
25-Apr-2003 |
Jake Burkholder <jake@FreeBSD.org> |
Remove harmless invalid cast. Sponsored by: DARPA, Network Associates Laboratories
|
#
163529c2 |
|
03-Apr-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Removed APTD and associated macros, it is no longer used. BANG BANG BANG etc. Sponsored by: DARPA, Network Associates Laboratories
|
#
cc66ebe2 |
|
02-Apr-2003 |
Peter Wemm <peter@FreeBSD.org> |
Commit a partial lazy thread switch mechanism for i386. it isn't as lazy as it could be and can do with some more cleanup. Currently it's under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process. ie: we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be, the interrupt overhead is still high and too much blocks on Giant still. There are some debug sysctls, for stats and for an on/off switch. The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap. It's not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default. This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more. The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it. One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually separate from the lazy switch stuff. Glanced at by: jake, jhb
|
#
7ab9b220 |
|
29-Mar-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Add support for PAE and more than 4 gigs of ram on x86, dependent on the kernel option 'options PAE'. This will only work with device drivers which either use busdma, or are able to handle 64 bit physical addresses. Thanks to Lanny Baron from FreeBSD Systems for the loan of a test machine with 6 gigs of ram. Sponsored by: DARPA, Network Associates Laboratories, FreeBSD Systems
|
#
de54353f |
|
29-Mar-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Remove invalid casts. Sponsored by: DARPA, Network Associates Laboratories
|
#
aea57872 |
|
29-Mar-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Convert all uses of pmap_pte and get_ptbase to pmap_pte_quick. When accessing an alternate address space this causes 1 page table page at a time to be mapped in, rather than using the recursive mapping technique to map in an entire alternate address space. The recursive mapping technique changes large portions of the address space and requires global tlb flushes, which seem to cause problems when PAE is enabled. This will also allow IPIs to be avoided when mapping in new page table pages using the same technique as is used for pmap_copy_page and pmap_zero_page. Sponsored by: DARPA, Network Associates Laboratories
|
#
227f9a1c |
|
24-Mar-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses are larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long. Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms. Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)
|
#
4d3f408c |
|
12-Mar-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Added support for multiple page directory pages to pmap_pinit and pmap_release. - Merged pmap_release and pmap_release_free_page. When pmap_release is called only the page directory page(s) can be left in the pmap pte object, since all page table pages will have been freed by pmap_remove_pages and pmap_remove. In addition, there can only be one reference to the pmap and the page directory is wired, so the page(s) can never be busy. So all there is to do is clear the magic mappings from the page directory and free the page(s). Sponsored by: DARPA, Network Associates Laboratories
|
#
ac2e4153 |
|
26-Feb-2003 |
Julian Elischer <julian@FreeBSD.org> |
Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.
|
#
0f1a7e05 |
|
25-Feb-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Added inlines pmap_is_current, pmap_is_alternate and pmap_set_alternate for testing and setting the current and alternate address spaces. - Changed PTDpde and APTDpde to arrays to support multiple page directory pages. Sponsored by: DARPA, Network Associates Laboratories
|
#
07159f9c |
|
24-Feb-2003 |
Maxime Henrion <mux@FreeBSD.org> |
Cleanup of the d_mmap_t interface. - Get rid of the useless atop() / pmap_phys_address() detour. The device mmap handlers must now give back the physical address without atop()'ing it. - Don't borrow the physical address of the mapping in the returned int. Now we properly pass a vm_offset_t * and expect it to be filled by the mmap handler when the mapping was successful. The mmap handler must now return 0 when successful, any other value is considered as an error. Previously, returning -1 was the only way to fail. This change thus accidentally fixes some devices which were bogusly returning errno constants which would have been considered as addresses by the device pager. - Garbage collect the poorly named pmap_phys_address() now that it's no longer used. - Convert all the d_mmap_t consumers to the new API. I'm still not sure whether we need a __FreeBSD_version bump for this, since we didn't guarantee API/ABI stability until 5.1-RELEASE. Discussed with: alc, phk, jake Reviewed by: peter Compile-tested on: LINT (i386), GENERIC (alpha and sparc64) Runtime-tested on: i386
|
#
28c9e1aa |
|
23-Feb-2003 |
Jake Burkholder <jake@FreeBSD.org> |
Use the direct mapping of IdlePTD setup in locore for proc0's page directory, instead of allocating another page of kva and mapping it in again. This was likely an oversight in revision 1.174 (cut and paste from pmap_pinit). Discussed with: peter, tegge Sponsored by: DARPA, Network Associates Laboratories
|
#
910548de |
|
23-Feb-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Added macros NPGPTD, NBPTD, and NPDEPTD, for dealing with the size of the page directory. - Use these instead of the magic constants 1 or PAGE_SIZE where appropriate. There are still numerous assumptions that the page directory is exactly 1 page. Sponsored by: DARPA, Network Associates Laboratories
|
#
e29632c9 |
|
23-Feb-2003 |
Jake Burkholder <jake@FreeBSD.org> |
- Added macros PDESHIFT and PTESHIFT, use these instead of magic constants in locore. - Removed the macros PTESIZE and PDESIZE, use sizeof instead in C. Sponsored by: DARPA, Network Associates Laboratories
|
#
01a06ce2 |
|
22-Feb-2003 |
Alan Cox <alc@FreeBSD.org> |
The root of the splay tree maintained within the pm_pteobj always refers to the last accessed pte page. Thus, the pm_ptphint is redundant and can be removed.
|
#
8e42580d |
|
15-Feb-2003 |
Alan Cox <alc@FreeBSD.org> |
Assert that the kernel map's system mutex is held in pmap_growkernel().
|
#
e33d37b6 |
|
14-Feb-2003 |
Alan Cox <alc@FreeBSD.org> |
- Add a mutex for synchronizing the use of CMAP/CADDR 1 and 2. - Eliminate small style differences between pmap_zero_page(), pmap_copy_page(), etc.
|
#
939a4397 |
|
12-Feb-2003 |
Peter Wemm <peter@FreeBSD.org> |
Oops. I mis-remembered about the P4 problems. It was 5.0-DP2 that was shipped with DISABLE_PG_G and DISABLE_PSE, not 5.0-REL. *blush* Disable the code - but still leave it there in case it's still lurking.
|
#
521871f1 |
|
12-Feb-2003 |
Peter Wemm <peter@FreeBSD.org> |
Turn off PG_PS and PG_G for Pentium-4 cpus at boot time. This is so that we can stop turning off PG_G and PG_PS globally for releases.
|
#
393a225c |
|
11-Feb-2003 |
Alan Cox <alc@FreeBSD.org> |
Remove kptobj. Instead, use VM_ALLOC_NOOBJ.
|
#
571cd8a1 |
|
07-Feb-2003 |
Alan Cox <alc@FreeBSD.org> |
MF alpha - Synchronize access to the allpmaps list with a mutex.
|
#
e2294003 |
|
06-Feb-2003 |
Peter Wemm <peter@FreeBSD.org> |
Commit some cosmetic changes I had lying around and almost included with another commit. Unwrap a line. Unexpand a pmap_kenter().
|
#
ca380469 |
|
02-Feb-2003 |
Alan Cox <alc@FreeBSD.org> |
- Make allpmaps static. - Use atomic subtract to update the global wired pages count. (See also vm/vm_page.c revision 1.233.) - Assert that the page queue lock is held in pmap_remove_entry().
|
#
d6d92c84 |
|
27-Jan-2003 |
Alan Cox <alc@FreeBSD.org> |
Merge pmap_testbit() and pmap_is_modified(). The latter is the only caller of the former.
|
#
9d5abbdd |
|
01-Jan-2003 |
Jens Schweikhardt <schweikh@FreeBSD.org> |
Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.
|
#
84cdcd85 |
|
27-Dec-2002 |
Alan Cox <alc@FreeBSD.org> |
Assert that the page queues lock is held in pmap_testbit().
|
#
11a2911c |
|
24-Dec-2002 |
Alan Cox <alc@FreeBSD.org> |
- Hold the page queues lock around calls to vm_page_wakeup() and vm_page_flag_clear().
|
#
0ced3981 |
|
14-Dec-2002 |
Alan Cox <alc@FreeBSD.org> |
Add page locking to pmap_mincore(). Submitted (in part) by: tjr@
|
#
f4dcf955 |
|
02-Dec-2002 |
Alan Cox <alc@FreeBSD.org> |
Avoid recursive acquisition of the page queues lock in pmap_unuse_pt(). Approved by: re
|
#
d79ebb60 |
|
01-Dec-2002 |
Alan Cox <alc@FreeBSD.org> |
Hold the page queues lock when calling pmap_unwire_pte_hold() or pmap_remove_pte(). Use vm_page_sleep_if_busy() in _pmap_unwire_pte_hold() so that the page queues lock is released when sleeping. Approved by: re (blanket)
|
#
e6c90801 |
|
30-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Assert that the page queues lock is held in pmap_changebit() and pmap_ts_referenced(). Approved by: re (blanket)
|
#
0d51d232 |
|
30-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Assert that the page queues lock is held in pmap_page_exists_quick(). Approved by: re (blanket)
|
#
ffb30958 |
|
24-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Assert that the page queues lock is held in pmap_remove_pages(). Approved by: re (blanket)
|
#
4817d8e5 |
|
22-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
- Assert that the page queues lock is held in pmap_remove_all(). - Fix a diagnostic message and comment in pmap_remove_all(). - Eliminate excessive white space from pmap_remove_all(). Approved by: re
|
#
eea85e9b |
|
12-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Move pmap_collect() out of the machine-dependent code, rename it to reflect its new location, and add page queue and flag locking. Notes: (1) alpha, i386, and ia64 had identical implementations of pmap_collect() in terms of machine-independent interfaces; (2) sparc64 doesn't require it; (3) powerpc had it as a TODO.
|
#
6372d61e |
|
10-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
- Clear the page's PG_WRITEABLE flag in the i386's pmap_changebit() if we're removing write access from the page's PTEs. - Export pmap_remove_all() on alpha, i386, and ia64. (It's already exported on sparc64.)
|
#
aa8e11b6 |
|
07-Nov-2002 |
Alan Cox <alc@FreeBSD.org> |
Simplify and optimize pmap_object_init_pt(). More specifically, take advantage of the fact that the vm object's list of pages is now ordered to reduce the overhead of finding the desired set of pages to be mapped. (See revision 1.215 of vm/vm_page.c.)
|
#
316ec49a |
|
02-Oct-2002 |
Scott Long <scottl@FreeBSD.org> |
Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates. Reviewed by: jake, peter, jhb
|
#
b29d22f9 |
|
01-Oct-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
The pmap_prefault_pageorder[] array was initialize with wrong values due to a missing comma. I have no idea what trouble, if any, this may have caused. Pointed out by: FlexeLint
|
#
37c84183 |
|
28-Sep-2002 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512
|
#
6508a194 |
|
24-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Retire pmap_pageable(). It's an advisory routine that none of our platforms implements.
|
#
55f7c614 |
|
21-Aug-2002 |
Archie Cobbs <archie@FreeBSD.org> |
Don't use "NULL" when "0" is really meant.
|
#
fe047604 |
|
17-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Simplify the ptphint test in pmap_release_free_page(). In other words, make it just like the test in _pmap_unwire_pte_hold().
|
#
e9ed460a |
|
13-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Remove an unnecessary vm_page_flash() from _pmap_unwire_pte_hold(). Reviewed by: peter
|
#
d837b369 |
|
12-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Convert three instances of vm_page_sleep_busy() into vm_page_sleep_if_busy() with page queue locking.
|
#
d8a0d079 |
|
12-Aug-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Use roundup2() to avoid a problem where pmap_growkernel was unable to extend the kernel VM to the maximum possible address of 4G-4M. PR: i386/22441 Submitted by: Bill Carpenter <carp@world.std.com> Reviewed by: alc
|
#
0da73705 |
|
10-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Remove the setting and clearing of the PG_MAPPED flag. (This flag is obsolete.)
|
#
c679b309 |
|
05-Aug-2002 |
Peter Wemm <peter@FreeBSD.org> |
Revert rev 1.356 and 1.352 (pmap_mapdev hacks). It wasn't worth the pain.
|
#
3f3655b0 |
|
04-Aug-2002 |
Peter Wemm <peter@FreeBSD.org> |
Fix a mistake in 1.352 - I was returning a pointer to the rounded down address. I expect this will fix acpica.
|
#
ea5e5b13 |
|
03-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Request a wired page from vm_page_grab() in _pmap_allocpte().
|
#
b9c51c91 |
|
03-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Ask for a prezeroed page in pmap_pinit() for the page directory page.
|
#
5da2d6a4 |
|
03-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Don't set PG_MAPPED on the page allocated and mapped in _pmap_allocpte(). (Only set this flag if the mapping has a corresponding pv list entry, which this mapping doesn't.)
|
#
8f1586dd |
|
02-Aug-2002 |
Peter Wemm <peter@FreeBSD.org> |
Take advantage of the fact that there is a small 1MB direct mapped region on x86 in between KERNBASE and the kernel load address. pmap_mapdev() can return pointers to this for devices operating in the isa "hole".
|
#
64a1b85e |
|
01-Aug-2002 |
Alan Cox <alc@FreeBSD.org> |
o Lock page queue accesses by vm_page_deactivate().
|
#
239b5b97 |
|
31-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Setting PG_MAPPED and PG_WRITEABLE on pages that are mapped and unmapped by pmap_qenter() and pmap_qremove() is pointless. In fact, it probably leads to unnecessary pmap_page_protect() calls if one of these pages is paged out after unwiring. Note: setting PG_MAPPED asserts that the page's pv list may be non-empty. Since checking the status of the page's pv list isn't any harder than checking this flag, the flag should probably be eliminated. Alternatively, PG_MAPPED could be set by pmap_enter() exclusively rather than various places throughout the kernel.
|
#
bfd28670 |
|
30-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Lock page queue accesses by pmap_release_free_page().
|
#
14f8ceaa |
|
28-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Pass VM_ALLOC_WIRED to vm_page_grab() rather than calling vm_page_wire() in pmap_new_thread(), pmap_pinit(), and vm_proc_new(). o Lock page queue accesses by vm_page_free() in pmap_object_init_pt().
|
#
e344afe7 |
|
20-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Move SWTCH_OPTIM_STATS related code out of cpufunc.h. (This sort of stat gathering is not an x86 cpu feature)
|
#
4aca0b15 |
|
19-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Use vm_page_alloc(... | VM_ALLOC_WIRED) in place of vm_page_wire().
|
#
d08c48b4 |
|
17-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Avoid trying to set PG_G on the first 4MB when we set up the 4MB page. This solves the SMP panic for at least one system. I'd still like to know why my xeon works though. Tested by: bmilekic
|
#
700399bc |
|
14-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Lock page queue accesses by vm_page_wire().
|
#
5c5e3622 |
|
13-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Two invlpg's slipped through that were not protected from I386_CPU Pointed out by: dillon
|
#
96fd5002 |
|
13-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
invlpg() does not work too well on i386 cpus. Add token i386 support back in to the pmap_zero_page* stuff.
|
#
00649044 |
|
13-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Do global shootdowns when switching to/from 4MB pages. I believe we could do a shootdown on a 4MB "page", but this should be safer for now. Noticed by: tegge
|
#
a7b1f16c |
|
13-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Bandaid for SMP. Changing APTDpde without a global shootdown is not safe yet. We used to do a global shootdown here anyway so another day or so shouldn't hurt.
|
#
753492f4 |
|
13-Jul-2002 |
Alan Cox <alc@FreeBSD.org> |
o Lock some page queue accesses, in particular, those by vm_page_unwire().
|
#
fbcf77c2 |
|
12-Jul-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
Re-enable the idle page-zeroing code. Remove all IPIs from the idle page-zeroing code as well as from the general page-zeroing code and use a lazy tlb page invalidation scheme based on a callback made at the end of mi_switch. A number of people came up with this idea at the same time so credit belongs to Peter, John, and Jake as well. Two-way SMP buildworld -j 5 tests (second run, after stabilization) 2282.76 real 2515.17 user 704.22 sys before peter's IPI commit 2266.69 real 2467.50 user 633.77 sys after peter's commit 2232.80 real 2468.99 user 615.89 sys after this commit Reviewed by: peter, jhb Approved by: peter
|
#
f1b665c8 |
|
12-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Revive backed out pmap related changes from Feb 2002. The highlights are: - It actually works this time, honest! - Fine grained TLB shootdowns for SMP on i386. IPI's are very expensive, so try and optimize things where possible. - Introduce ranged shootdowns that can be done as a single IPI. - PG_G support for i386 - Specific-cpu targeted shootdowns. For example, there is no sense in globally purging the TLB cache for where we are stealing a page from the local unshared process on the local cpu. Use pm_active to track this. - Add some instrumentation for the tlb shootdown code. - Rip out SMP code from <machine/cpufunc.h> - Try and fix some very bogus PG_G and PG_PS interactions that were bad enough to cause vm86 bios calls to break. vm86 depended on our existing bugs and this was the cause of the VESA panics last time. - Fix the silly one-line error that caused the 'panic: bad pte' last time. - Fix a couple of other silly one-line errors that should have caused more pain than they did. Some more work is needed: - pmap_{zero,copy}_page[_idle]. These can be done without IPI's if we have a hook in cpu_switch. - The IPI handlers need some cleanup. I have a bogus %ds load that can be avoided. - APTD handling is rather bogus and appears to be a large source of global TLB IPI shootdowns for no really good reason. I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop. I expect to see a bigger difference when there is significant pageout activity or the system otherwise has memory shortages. I have backed out a few optimizations that I had been using over the last few days in order to be a little more conservative. I'll revisit these again over the next few days as the dust settles. New option: DISABLE_PG_G - In case I missed something.
|
#
4a0226c6 |
|
11-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Unexpand a couple of 8-space indents that I added in rev 1.285.
|
#
a58b3a68 |
|
07-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Add a special page zero entry point intended to be called via the single threaded VM pagezero kthread outside of Giant. For some platforms, this is really easy since it can just use the direct mapped region. For others, IPI sending is involved or there are other issues, so grab Giant when needed. We still have preemption issues to deal with, but Alan Cox has an interesting suggestion on how to minimize the problem on x86. Use Luigi's hack for preserving the (lack of) priority. Turn the idle zeroing back on since it can now actually do something useful outside of Giant in many cases.
|
#
64ca7010 |
|
07-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Fix a hideous TLB bug. pmap_unmapdev neglected to remove the device mappings from the page tables, which were mapped with PG_G! We could reuse the page table entry for another mapping (pmap_mapdev) but it would never have cleared any remaining PG_G TLB entries.
|
#
a136efe9 |
|
07-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Collect all the (now equivalent) pmap_new_proc/pmap_dispose_proc/ pmap_swapin_proc/pmap_swapout_proc functions from the MD pmap code and use a single equivalent MI version. There are other cleanups needed still. While here, use the UMA zone hooks to keep a cache of preinitialized proc structures handy, just like the thread system does. This eliminates one dependency on 'struct proc' being persistent even after being freed. There are some comments about things that can be factored out into ctor/dtor functions if it is worth it. For now they are mostly just doing statistics to get a feel of how it is working.
|
#
3a7ef791 |
|
04-Jul-2002 |
Peter Wemm <peter@FreeBSD.org> |
Diff reduction (microoptimization) with another WIP. Move the frame calculation in get_ptbase() to a little later on.
|
#
8692e184 |
|
03-Jul-2002 |
Julian Elischer <julian@FreeBSD.org> |
Don't free pages we never allocated. My eyes opened by: Matt
|
#
0d6735c6 |
|
03-Jul-2002 |
Julian Elischer <julian@FreeBSD.org> |
Slight restatement of the code and remove some unused variables.
|
#
e04d8bf8 |
|
03-Jul-2002 |
Julian Elischer <julian@FreeBSD.org> |
Add comments and slightly rearrange the thread stack assignment code to try to make it less obscure.
|
#
b2adb4b2 |
|
03-Jul-2002 |
Julian Elischer <julian@FreeBSD.org> |
Remove vestiges of old code... These functions are always called on new memory so they can not already be set up, so don't bother testing for that. (This was left over from before we used UMA (which is cool))
|
#
e602ba25 |
|
29-Jun-2002 |
Julian Elischer <julian@FreeBSD.org> |
Part 1 of KSE-III The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. To come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
#
c6d84b4d |
|
27-Jun-2002 |
Andrew R. Reiter <arr@FreeBSD.org> |
Fix for the problem stated below by Tor Egge: (from: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=832566+0+ \ current/freebsd-current) "Too many pages were prefaulted in pmap_object_init_pt, thus the wrong physical page was entered in the pmap for the virtual address where the .dynamic section was supposed to be." Submitted by: tegge Approved by: tegge's patches never fail
|
#
23f09d50 |
|
26-Jun-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Avoid using the 64-bit vm_pindex_t in a few places where 64-bit types are not required, as the overhead is unnecessary: o In the i386 pmap_protect(), `sindex' and `eindex' represent page indices within the 32-bit virtual address space. o In swp_pager_meta_build() and swp_pager_meta_ctl(), use a temporary variable to store the low few bits of a vm_pindex_t that gets used as an array index. o vm_uiomove() uses `osize' and `idx' for page offsets within a map entry. o In vm_object_split(), `idx' is a page offset within a map entry.
|
#
6395da54 |
|
25-Jun-2002 |
Ian Dowse <iedowse@FreeBSD.org> |
Complete the initial set of VM changes required to support full 64-bit file sizes. This step simply addresses the remaining overflows, and does not attempt to optimise performance. The details are: o Use a 64-bit type for the vm_object `size' and the size argument to vm_object_allocate(). o Use the correct type for index variables in dev_pager_getpages(), vm_object_page_clean() and vm_object_page_remove(). o Avoid an overflow in the i386 pmap_object_init_pt().
|
#
18aa2de5 |
|
17-Jun-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
- Introduce the new M_NOVM option which tells uma to only check the currently allocated slabs and bucket caches for free items. It will not go ask the vm for pages. This differs from M_NOWAIT in that it not only doesn't block, it doesn't even ask. - Add a new zcreate option ZONE_VM, that sets the BUCKETCACHE zflag. This tells uma that it should only allocate buckets out of the bucket cache, and not from the VM. It does this by using the M_NOVM option to zalloc when getting a new bucket. This is so that the VM doesn't recursively enter itself while trying to allocate buckets for vm_map_entry zones. If there are already allocated buckets when we get here we'll still use them but otherwise we'll skip it. - Use the ZONE_VM flag on vm map entries and pv entries on x86.
|
#
db17c6fc |
|
29-Apr-2002 |
Peter Wemm <peter@FreeBSD.org> |
Tidy up some loose ends. i386/ia64/alpha - catch up to sparc64/ppc: - replace pmap_kernel() with refs to kernel_pmap - change kernel_pmap pointer to (&kernel_pmap_store) (this is a speedup since ld can set these at compile/link time) all platforms (as suggested by jake): - gc unused pmap_reference - gc unused pmap_destroy - gc unused struct pmap.pm_count (we never used pm_count - we track address space sharing at the vmspace)
|
#
1a87a0da |
|
15-Apr-2002 |
Peter Wemm <peter@FreeBSD.org> |
Pass vm_page_t instead of physical addresses to pmap_zero_page[_area]() and pmap_copy_page(). This gets rid of a couple more physical addresses in upper layers, with the eventual aim of supporting PAE and dealing with the physical addressing mostly within pmap. (We will need either 64 bit physical addresses or page indexes, possibly both depending on the circumstances. Leaving this to pmap itself gives more flexibility.) Reviewed by: jake Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)
|
#
46d0abf3 |
|
20-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
Remove references to vm_zone.h and switch over to the new uma API.
|
#
15fe3067 |
|
20-Mar-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
Remove __P.
|
#
8355f576 |
|
19-Mar-2002 |
Jeff Roberson <jeff@FreeBSD.org> |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@
|
#
f4e18c9a |
|
28-Feb-2002 |
Mike Silbersack <silby@FreeBSD.org> |
Fix a minor swap leak. Previously, the UPAGES/KSTACK area of processes/threads would leak memory at the time that a previously swapped process was terminated. Luckily, the leak was only 12K/proc, so it was unlikely to be a major problem unless you had an undersized swap partition. Submitted by: dillon Reviewed by: silby MFC after: 1 week
|
#
7f3a4093 |
|
27-Feb-2002 |
Mike Silbersack <silby@FreeBSD.org> |
Fix a horribly suboptimal algorithm in the vm_daemon. In order to determine what to page out, the vm_daemon checks reference bits on all pages belonging to all processes. Unfortunately, the algorithm used reacted badly with shared pages; each shared page would be checked once per process sharing it; this caused an O(N^2) growth of tlb invalidations. The algorithm has been changed so that each page will be checked only 16 times. Prior to this change, a fork/sleepbomb of 1300 processes could cause the vm_daemon to take over 60 seconds to complete, effectively freezing the system for that time period. With this change in place, the vm_daemon completes in less than a second. Any system with hundreds of processes sharing pages should benefit from this change. Note that the vm_daemon is only run when the system is under extreme memory pressure. It is likely that many people with loaded systems saw no symptoms of this problem until they reached the point where swapping began. Special thanks go to dillon, peter, and Chuck Cranor, who helped me get up to speed with vm internals. PR: 33542, 20393 Reviewed by: dillon MFC after: 1 week
|
#
d1693e17 |
|
27-Feb-2002 |
Peter Wemm <peter@FreeBSD.org> |
Back out all the pmap related stuff I've touched over the last few days. There is some unresolved badness that has been eluding me, particularly affecting uniprocessor kernels. Turning off PG_G helped (which is a bad sign) but didn't solve it entirely. Userland programs still crashed.
|
#
5c004dc6 |
|
26-Feb-2002 |
Peter Wemm <peter@FreeBSD.org> |
Bandaid for the Uniprocessor kernel exploding. This makes a UP kernel boot and run (and indeed I am committing from it) instead of exploding during the int 0x15 call from inside the atkbd driver to get the keyboard repeat rates.
|
#
b7eeb587 |
|
26-Feb-2002 |
Alfred Perlstein <alfred@FreeBSD.org> |
clarify panic message
|
#
bd1e3a0f |
|
26-Feb-2002 |
Peter Wemm <peter@FreeBSD.org> |
Jake further reduced IPI shootdowns on sparc64 in loops by using ranged shootdowns in a couple of key places. Do the same for i386. This also hides some physical addresses from higher levels and has it use the generic vm_page_t's instead. This will help for PAE down the road. Obtained from: jake (MI code, suggestions for MD part)
|
#
9f34c416 |
|
26-Feb-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
didn't quite undo the last reversion. This gets it.
|
#
08b38b1f |
|
26-Feb-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
revert compatibility fix temporarily (thought it would not break anything leaving it in).
|
#
24e68cb0 |
|
26-Feb-2002 |
Matthew Dillon <dillon@FreeBSD.org> |
Make peter's commit compatible with interrupt-enabled critical_enter() and exit(), which has already solved the problem in regards to deadlocked IPI's.
|
#
6bd95d70 |
|
25-Feb-2002 |
Peter Wemm <peter@FreeBSD.org> |
Work-in-progress commit syncing up pmap cleanups that I have been working on for a while: - fine grained TLB shootdown for SMP on i386 - ranged TLB shootdowns.. eg: specify a range of pages to shoot down with a single IPI, since the IPI is very expensive. Adjust some callers that used to trigger this inside tight loops to do a ranged shootdown at the end instead. - PG_G support for SMP on i386 (options ENABLE_PG_G) - defer PG_G activation till after we decide what we are going to do with PSE and the 4MB pages at the start of the kernel. This should solve some rumored strangeness about stale PG_G entries getting stuck underneath the 4MB pages. - add some instrumentation for the fine TLB shootdown - convert some asm instruction wrappers from functions to inlines. gcc seems to do a fair bit better with this. - [temporarily!] pessimize the tlb shootdown IPI handlers. I will fix this again shortly. This has been working fairly well for me for a while, but I have tweaked it again prior to commit since my last major testing round. The only outstanding problem that I know of is PG_G related, which is why there is an option for it (not on by default for SMP). I have seen buildworld speedups of a few percent (as much as 4 or 5% in one case) but I have *not* accurately measured this - I am a bit sceptical of these numbers.
|
#
98f1484c |
|
20-Feb-2002 |
Peter Wemm <peter@FreeBSD.org> |
Pass me the pointy hat please. Be sure to return a value in a non-void function. I've been running with this buried in the mountains of compiler output for about a month on my desktop.
|
#
6a3e90ef |
|
19-Feb-2002 |
Peter Wemm <peter@FreeBSD.org> |
Some more tidy-up of stray "unsigned" variables instead of p[dt]_entry_t etc.
|
#
ead8168a |
|
05-Jan-2002 |
Peter Wemm <peter@FreeBSD.org> |
Convert a bunch of 1 << PCPU_GET(cpuid) to PCPU_GET(cpumask).
|
#
7ff48af7 |
|
02-Jan-2002 |
Peter Wemm <peter@FreeBSD.org> |
Allow a specific setting for pv entries. This avoids the need to guess (or calculate by hand) the effect of interactions between shpgperproc, physical ram size, maxproc, maxdsiz, etc.
|
#
587bd8bf |
|
31-Dec-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Grrr. The tlb code is strewn over 3 files and I misread it. Revert the last change (it was a NOP), and remove the XXX comments that no longer apply.
|
#
b00dcfdf |
|
31-Dec-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
You know those 'XXX what about SMP' comments in pmap_kenter()? Well, they were right. Fix both kenter() and kremove() for SMP by ensuring that the tlb is flushed on other cpu's. This will directly solve random-corruption panic issues in -stable when it is MFC'd. Better to be safe than sorry, we can optimize this later. Original Suspicion by: peter Maybe MFC: immediately on re's permission
|
#
ff5a52e1 |
|
19-Dec-2001 |
Peter Wemm <peter@FreeBSD.org> |
Replace a bunch of: for (pv = TAILQ_FIRST(&m->md.pv_list); pv; pv = TAILQ_NEXT(pv, pv_list)) { with: TAILQ_FOREACH(pv, &m->md.pv_list, pv_list) {
|
#
c04cbb47 |
|
19-Dec-2001 |
Peter Wemm <peter@FreeBSD.org> |
Fix some whitespace nits, and a minor error that I made in some unused #ifdef DEBUG code (VM_MAXUSER_ADDRESS vs UPT_MAX_ADDRESS).
|
#
0bbc8826 |
|
11-Dec-2001 |
John Baldwin <jhb@FreeBSD.org> |
Overhaul the per-CPU support a bit: - The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)). - All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h. - The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures. - Make the show pcpu ddb command MI with a MD callout to display MD fields. - The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list. - A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list. Tested on: alpha, i386 Reviewed by: peter, jake
|
#
eaef7150 |
|
10-Dec-2001 |
Peter Wemm <peter@FreeBSD.org> |
Delete some leftover code from a bygone age. We don't have an array of IdlePTDs anymore and don't do the PTD[MPPTDI] swapping, etc.
|
#
720c992f |
|
16-Nov-2001 |
Peter Wemm <peter@FreeBSD.org> |
Fix the non-KSTACK_GUARD case. It has been broken since the KSE commit; ptek was not initialized.
|
#
6729cb88 |
|
16-Nov-2001 |
Peter Wemm <peter@FreeBSD.org> |
Start bringing i386/pmap.c into line with cleanups that were done to alpha pmap. In particular - - pd_entry_t and pt_entry_t are now u_int32_t instead of a pointer. This is to enable cleaner PAE and x86-64 support down the track so that we can change the pd_entry_t/pt_entry_t types to 64 bit entities. - Terminate "unsigned *ptep, pte" with extreme prejudice and use the correct pt_entry_t/pd_entry_t types. - Various other cosmetic changes to match cleanups elsewhere. - This eliminates a boatload of casts. - use VM_MAXUSER_ADDRESS in place of UPT_MIN_ADDRESS in a couple of places where we're testing user address space limits. Assuming the page tables start directly after the end of user space is not a safe assumption. There is still more to go.
|
#
ab29f876 |
|
15-Nov-2001 |
Peter Wemm <peter@FreeBSD.org> |
Oops, I accidentally merged a whitespace error from the original commit. (whitespace at end of line in rev 1.264 pmap.c). Fix them all.
|
#
8ad88132 |
|
15-Nov-2001 |
Peter Wemm <peter@FreeBSD.org> |
Converge/fix some debug code (#if 0'ed on alpha, but whatever) - use NPTEPG/NPDEPG instead of magic 1024 (important for PAE) - use pt_entry_t instead of unsigned (important for PAE) - use vm_offset_t instead of unsigned for va's (important for x86-64)
|
#
e258f08a |
|
31-Oct-2001 |
Peter Wemm <peter@FreeBSD.org> |
Skip PG_UNMANAGED pages when we're shooting everything down to try and reclaim pv_entries. PG_UNMANAGED pages don't have pv_entries to reclaim. Reported by: David Xu <davidx@viasoft.com.cn>
|
#
e3026983 |
|
30-Oct-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Don't let pmap_object_init_pt() exhaust all available free pages (allocating pv entries w/ zalloci) when called in a loop due to an madvise(). It is possible to completely exhaust the free page list and cause a system panic when an expected allocation fails.
|
#
23340918 |
|
14-Oct-2001 |
Tor Egge <tegge@FreeBSD.org> |
Reduce the number of TLB shootdowns caused by a call to pmap_qenter() from number of pages mapped to 1. Reviewed by: dillon
|
#
b40ce416 |
|
12-Sep-2001 |
Julian Elischer <julian@FreeBSD.org> |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doozy!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
|
#
43295941 |
|
30-Aug-2001 |
Peter Wemm <peter@FreeBSD.org> |
Do a style cleanup pass for the pmap_{new,dispose,etc}_proc() functions to get them closer to the KSE tree. I will do the other $machine/pmap.c files shortly.
|
#
15dac10b |
|
24-Aug-2001 |
Julian Elischer <julian@FreeBSD.org> |
Add another comment. check for 'teh's this time..
|
#
268bdb43 |
|
24-Aug-2001 |
Peter Wemm <peter@FreeBSD.org> |
Optionize UPAGES for the i386. As part of this I split some of the low level implementation stuff out of machine/globaldata.h to avoid exposing UPAGES to lots more places. The end result is that we can double the kernel stack size with 'options UPAGES=4' etc. This is mainly being done for the benefit of a MFC to RELENG_4 at some point. -current doesn't really need this so much since each interrupt runs on its own kstack.
|
#
219d632c |
|
21-Aug-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
Move most of the kernel submap initialization code, including the timeout callwheel and buffer cache, out of the platform specific areas and into the machine independent area. i386 and alpha adjusted here. Other cpus can be fixed piecemeal. Reviewed by: freebsd-smp, jake
|
#
3d6fde76 |
|
21-Aug-2001 |
Peter Wemm <peter@FreeBSD.org> |
Introduce two new sysctl's.. vm.kvm_size and vm.kvm_free. These are purely informational and can give some advance indications of tuning problems. These are i386 only for now as it seems that the i386 is the only one suffering kvm pressure.
|
#
0b27d710 |
|
26-Jul-2001 |
Peter Wemm <peter@FreeBSD.org> |
Make PMAP_SHPGPERPROC tunable. One shouldn't need to recompile a kernel for this, since it is easy to run into with large systems with lots of shared mmap space. Obtained from: yahoo
|
#
0cddd8f0 |
|
04-Jul-2001 |
Matthew Dillon <dillon@FreeBSD.org> |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
#
1f50e112 |
|
23-May-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
pmap_mapdev needs the vm_mtx; acquire it if not already locked
|
#
9dceb26b |
|
21-May-2001 |
John Baldwin <jhb@FreeBSD.org> |
Sort includes.
|
#
23955314 |
|
18-May-2001 |
Alfred Perlstein <alfred@FreeBSD.org> |
Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
|
#
fb919e4d |
|
01-May-2001 |
Mark Murray <markm@FreeBSD.org> |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
|
#
1005a129 |
|
28-Mar-2001 |
John Baldwin <jhb@FreeBSD.org> |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
#
50e2347e |
|
14-Mar-2001 |
Peter Wemm <peter@FreeBSD.org> |
Kill the 4MB kernel limit dead. [I hope :-)]. For UP, we were using $tmp_stk as a stack from the data section. If the kernel text section grew beyond ~3MB, the data section would be pushed beyond the temporary 4MB P==V mapping. This would cause the trampoline up to high memory to fault. The hack workaround I did was to use all of the page table pages that we already have while preparing the initial P==V mapping, instead of just the first one. For SMP, the AP bootstrap process suffered the same sort of problem and got the same treatment. MFC candidate - this breaks on 4.x just the same.. Thanks to: Richard Todd <rmtodd@ichotolot.servalan.com>
|
#
136d8f42 |
|
06-Mar-2001 |
John Baldwin <jhb@FreeBSD.org> |
Unrevert the pmap_map() changes. They weren't broken on x86. Sense beaten into me by: peter
|
#
4a01ebd4 |
|
06-Mar-2001 |
John Baldwin <jhb@FreeBSD.org> |
Back out the pmap_map() change for now, it isn't completely stable on the i386.
|
#
968950e5 |
|
05-Mar-2001 |
John Baldwin <jhb@FreeBSD.org> |
- Rework pmap_map() to take advantage of direct-mapped segments on supported architectures such as the alpha. This allows us to save on kernel virtual address space, TLB entries, and (on the ia64) VHPT entries. pmap_map() now modifies the passed in virtual address on architectures that do not support direct-mapped segments to point to the next available virtual address. It also returns the actual address that the request was mapped to. - On the IA64 don't use a special zone of PV entries needed for early calls to pmap_kenter() during pmap_init(). This gets us in trouble because we end up trying to use the zone allocator before it is initialized. Instead, with the pmap_map() change, the number of needed PV entries is small enough that we can get by with a static pool that is used until pmap_init() is complete. Submitted by: dfr Debugging help: peter Tested by: me
|
#
f1532aad |
|
22-Feb-2001 |
Peter Wemm <peter@FreeBSD.org> |
Activate USER_LDT by default. The new thread libraries are going to depend on this. The linux ABI emulator tries to use it for some linux binaries too. VM86 had a bigger cost than this and it was made default a while ago. Reviewed by: jhb, imp
|
#
7ad1d369 |
|
29-Jan-2001 |
John Baldwin <jhb@FreeBSD.org> |
Remove unnecessary locking to protect the p_upages_obj and p_addr pointers.
|
#
16fdce53 |
|
24-Jan-2001 |
John Baldwin <jhb@FreeBSD.org> |
- Proc locking. - P_INMEM -> PS_INMEM.
|
#
a3ea6d41 |
|
21-Jan-2001 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
First step towards an MP-safe zone allocator: - have zalloc() and zfree() always lock the vm_zone. - remove zalloci() and zfreei(), which are now redundant. Reviewed by: bmilekic, jasone
|
#
3e899e10 |
|
20-Jan-2001 |
Jake Burkholder <jake@FreeBSD.org> |
Remove the per-cpu pages used for copy and zero-ing pages of memory for SMP; just use the same ones as UP. These weren't used without holding Giant anyway, and the routines that use them would have to be protected from pre-emption to avoid migrating cpus.
|
#
e44a0ea3 |
|
16-Jan-2001 |
Peter Wemm <peter@FreeBSD.org> |
Stop doing runtime checking on i386 cpus for cpu class. The cpu is slow enough as it is, without having to constantly check that it really is an i386 still. It was possible to compile out the conditionals for faster cpus by leaving out 'I386_CPU', but it was not possible to unconditionally compile for the i386. You got the runtime checking whether you wanted it or not. This makes I386_CPU mutually exclusive with the other cpu types, and tidies things up a little in the process. Reviewed by: alfred, markm, phk, benno, jlemon, jhb, jake, grog, msmith, jasone, dcs, des (and a bunch more people who encouraged it)
|
#
ef73ae4b |
|
09-Jan-2001 |
Jake Burkholder <jake@FreeBSD.org> |
Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other then curproc.
|
#
c0c25570 |
|
12-Dec-2000 |
Jake Burkholder <jake@FreeBSD.org> |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags passed to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.
|
#
553629eb |
|
22-Nov-2000 |
Jake Burkholder <jake@FreeBSD.org> |
Protect the following with a lockmgr lock: allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid Reviewed by: jhb Obtained from: BSD/OS and netbsd
|
#
b5d335ad |
|
07-Nov-2000 |
Alfred Perlstein <alfred@FreeBSD.org> |
Protect against an infinite loop when prefaulting pages. This can happen when the vm system maps past the end of an object or tries to map a zero length object, the pmap layer misses the fact that offsets wrap into negative numbers and we get stuck. Found by: Joost Pol aka Nohican <nohican@marcella.niets.org> Submitted by: tegge
|
#
c794ceb5 |
|
17-Oct-2000 |
Paul Saab <ps@FreeBSD.org> |
Implement write combining for crashdumps. This is useful when write caching is disabled on both SCSI and IDE disks where large memory dumps could take up to an hour to complete. Taking an i386 scsi based system with 512MB of ram and timing (in seconds) how long it took to complete a dump, the following results were obtained: Before: After: WCE TIME WCE TIME ------------------ ------------------ 1 141.820972 1 15.600111 0 797.265072 0 65.480465 Obtained from: Yahoo! Reviewed by: peter
|
#
12e8a79c |
|
05-Oct-2000 |
John Baldwin <jhb@FreeBSD.org> |
Replace loadandclear() with atomic_readandclear_int().
|
#
7321545f |
|
22-Sep-2000 |
Paul Saab <ps@FreeBSD.org> |
Remove the NCPU, NAPIC, NBUS, NINTR config options. Make NAPIC, NBUS, NINTR dynamic and set NCPU to a maximum of 16 under SMP. Reviewed by: peter
|
#
300019c4 |
|
19-Sep-2000 |
Eivind Eklund <eivind@FreeBSD.org> |
Better error message when booting an SMP kernel on an UP system.
|
#
0384fff8 |
|
06-Sep-2000 |
Jason Evans <jasone@FreeBSD.org> |
Major update to the way synchronization is done in the kernel. Highlights include: * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) * Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|
#
d9b05734 |
|
16-Aug-2000 |
Tor Egge <tegge@FreeBSD.org> |
Prepare for a cleanup of pmap module API pollution introduced by the suggested fix in PR 12378. Keep track of all existing pmaps independent of existing processes. This allows for a process to temporarily connect to a different address space without the risk of missing an update of the original address space if the kernel grows. pmap_pinit2() is no longer needed on the i386 platform but is left as a stub until the alpha pmap code is updated. PR: 12378
|
#
8b03c8ed |
|
29-May-2000 |
Matthew Dillon <dillon@FreeBSD.org> |
This is a cleanup patch to Peter's new OBJT_PHYS VM object type and sysv shared memory support for it. It implements a new PG_UNMANAGED flag that has slightly different characteristics from PG_FICTITIOUS. A new sysctl, kern.ipc.shm_use_phys has been added to enable the use of physically-backed sysv shared memory rather than swap-backed. Physically backed shm segments are not tracked with PV entries, allowing programs which use a large shm segment as a rendezvous point to operate without eating an insane amount of KVM in the PV entry management. Read: Oracle. Peter's OBJT_PHYS object will also allow us to eventually implement page-table sharing and/or 4MB physical page support for such segments. We're half way there.
|
#
1536418a |
|
29-May-2000 |
Doug Rabson <dfr@FreeBSD.org> |
Brucify the pmap_enter_temporary() changes.
|
#
31891bc2 |
|
28-May-2000 |
Doug Rabson <dfr@FreeBSD.org> |
Add a new pmap entry point, pmap_enter_temporary() to be used during dumps to create temporary page mappings. This replaces the use of CADDR1 which is fairly x86 specific. Reviewed by: dillon
|
#
0385347c |
|
20-May-2000 |
Peter Wemm <peter@FreeBSD.org> |
Implement an optimization of the VM<->pmap API. Pass vm_page_t's directly to various pmap_*() functions instead of looking up the physical address and passing that. In many cases, the first thing the pmap code was doing was going to a lot of trouble to get back the original vm_page_t, or its shadow pv_table entry. Inspired by: John Dyson's 1998 patches. Also: Eliminate pv_table as a separate thing and build it into a machine dependent part of vm_page_t. This eliminates having a separate set of structures that shadow each other in a 1:1 fashion that we often went to a lot of trouble to translate from one to the other. (see above) This happens to save 4 bytes of physical memory for each page in the system. (8 bytes on the Alpha). Eliminate the use of the phys_avail[] array to determine if a page is managed (ie: it has pv_entries etc). Store this information in a flag. Things like device_pager set it because they create vm_page_t's on the fly that do not have pv_entries. This makes it easier to "unmanage" a page of physical memory (this will be taken advantage of in subsequent commits). Add a function to add a new page to the freelist. This could be used for reclaiming the previously wasted pages left over from preloaded loader(8) files. Reviewed by: dillon
|
#
c71d5570 |
|
20-Apr-2000 |
Luoqi Chen <luoqi@FreeBSD.org> |
IO apics are not necessarily page aligned, they are only required to be aligned on 1K boundary. Correct a typo that would cause problem to a second IO apic. Pointed out by: Steve Passe <smp.csn.net>
|
#
db5f635a |
|
16-Mar-2000 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Eliminate the undocumented, experimental, non-delivering and highly dangerous MAX_PERF option.
|
#
311b554b |
|
13-Mar-2000 |
Bruce Evans <bde@FreeBSD.org> |
Disabled the optimization of not doing an invltlb_1pg() when changing pte's from zero. The TLB is supposed to be invalidated when pte's are changed _to_ zero, but this doesn't occur in all cases for global pages (PG_G stops invltlb() from working, and invltlb_1pg() is not used enough). PR: 14141, 16568 Submitted by: dillon
|
#
2996376a |
|
19-Nov-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Use LIST_FOREACH to traverse the allproc list. Submitted by: Jake Burkholder jake@checker.org
|
#
923502ff |
|
29-Oct-1999 |
Poul-Henning Kamp <phk@FreeBSD.org> |
useracc() the prequel: Merge the contents (less some trivial, borderline-silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
1a16554b |
|
11-Sep-1999 |
Peter Wemm <peter@FreeBSD.org> |
Make pmap_mapdev() deal with non-page-aligned requests. Add a corresponding pmap_unmapdev() to release the KVM back to kernel_map.
|
#
c3aac50f |
|
27-Aug-1999 |
Peter Wemm <peter@FreeBSD.org> |
$Id$ -> $FreeBSD$
|
#
7308467d |
|
11-Aug-1999 |
Alan Cox <alc@FreeBSD.org> |
_pmap_allocpte: If the pte page isn't PQ_NONE, panic rather than silently covering up the problem.
|
#
7f8d2279 |
|
09-Aug-1999 |
Alan Cox <alc@FreeBSD.org> |
pmap_remove_pages: Add KASSERT to detect out of range access to the pv_table and report the errant pte before it's overwritten.
|
#
eaf183a8 |
|
31-Jul-1999 |
Alan Cox <alc@FreeBSD.org> |
pmap_object_init_pt: Verify that object != NULL.
|
#
086d0ae1 |
|
30-Jul-1999 |
Alan Cox <alc@FreeBSD.org> |
Add parentheses for clarity. Submitted by: dillon
|
#
d4da2dba |
|
21-Jul-1999 |
Alan Cox <alc@FreeBSD.org> |
Fix the following problem: When creating new processes (or performing exec), the new page directory is initialized too early. The kernel might grow before p_vmspace is initialized for the new process. Since pmap_growkernel doesn't yet know about the new page directory, it isn't updated, and subsequent use causes a failure. The fix is (1) to clear p_vmspace early, to stop pmap_growkernel from stomping on memory, and (2) to defer part of the initialization of new page directories until p_vmspace is initialized. PR: kern/12378 Submitted by: tegge Reviewed by: dfr
|
#
ad8ac923 |
|
08-Jul-1999 |
Kirk McKusick <mckusick@FreeBSD.org> |
These changes appear to give us benefits with both small (32MB) and large (1G) memory machine configurations. I was able to run 'dbench 32' on a 32MB system without bringing the machine to a grinding halt. * buffer cache hash table now dynamically allocated. This will have no effect on memory consumption for smaller systems and will help scale the buffer cache for larger systems. * minor enhancement to pmap_clearbit(). I noticed that all the calls to it used constant arguments. Making it an inline allows the constants to propagate to deeper inlines and should produce better code. * removal of inherent vfs_ioopt support through the emplacement of appropriate #ifdef's, with John's permission. If we do not find a use for it by the end of the year we will remove it entirely. * removal of getnewbufloops* counters & sysctl's - no longer necessary for debugging, getnewbuf() is now optimal. * buffer hash table functions removed from sys/buf.h and localized to vfs_bio.c * VFS_BIO_NEED_DIRTYFLUSH flag and support code added ( bwillwrite() ), allowing processes to block when too many dirty buffers are present in the system. * removal of a softdep test in bdwrite() that is no longer necessary now that bdwrite() no longer attempts to flush dirty buffers. * slight optimization added to bqrelse() - there is no reason to test for available buffer space on B_DELWRI buffers. * addition of reverse-scanning code to vfs_bio_awrite(). vfs_bio_awrite() will attempt to locate clusterable areas in both the forward and reverse direction relative to the offset of the buffer passed to it. This will probably not make much of a difference now, but I believe we will start to rely on it heavily in the future if we decide to shift some of the burden of the clustering closer to the actual I/O initiation. * Removal of the newbufcnt and lastnewbuf counters that Kirk added. They do not fix any race conditions that haven't already been fixed by the gbincore() test done after the only call to getnewbuf(). 
getnewbuf() is a static, so there is no chance of it being misused by other modules. ( Unless Kirk can think of a specific thing that this code fixes. I went through it very carefully and didn't see anything ). * removal of VOP_ISLOCKED() check in flushbufqueues(). I do not think this check is necessary, the buffer should flush properly whether the vnode is locked or not. ( yes? ). * removal of extra arguments passed to getnewbuf() that are not necessary. * missed cluster_wbuild() that had to be a cluster_wbuild_wb() in vfs_cluster.c * vn_write() now calls bwillwrite() *PRIOR* to locking the vnode, which should greatly aid flushing operations in heavy load situations - both the pageout and update daemons will be able to operate more efficiently. * removal of b_usecount. We may add it back in later but for now it is useless. Prior implementations of the buffer cache never had enough buffers for it to be useful, and current implementations which make more buffers available might not benefit relative to the amount of sophistication required to implement a b_usecount. Straight LRU should work just as well, especially when most things are VMIO backed. I expect that (even though John will not like this assumption) directories will become VMIO backed some point soon. Submitted by: Matthew Dillon <dillon@backplane.com> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
|
#
541e0187 |
|
23-Jun-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Do not setup 4M pdir until all APs are up.
|
#
21053753 |
|
08-Jun-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Use kmem_alloc_nofault() rather than kmem_alloc_pageable() to allocate kernel virtual address space for UPAGES.
|
#
31fdd69a |
|
05-Jun-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Fix an accounting problem when prefaulting 4M pages. PR: kern/11948
|
#
eb9d435a |
|
01-Jun-1999 |
Jonathan Lemon <jlemon@FreeBSD.org> |
Unifdef VM86. Reviewed by: silence on -current
|
#
72e51821 |
|
27-May-1999 |
Alan Cox <alc@FreeBSD.org> |
pmap_object_init_pt: The size of vm_object::memq is vm_object::resident_page_count, not vm_object::size.
|
#
88b67a96 |
|
18-May-1999 |
Alan Cox <alc@FreeBSD.org> |
pmap_qremove: Eliminate unnecessary TLB shootdowns.
|
#
5206bca1 |
|
27-Apr-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Enable vmspace sharing on SMP. Major changes are, - %fs register is added to trapframe and saved/restored upon kernel entry/exit. - Per-cpu pages are no longer mapped at the same virtual address. - Each cpu now has a separate gdt selector table. A new segment selector is added to point to per-cpu pages, per-cpu global variables are now accessed through this new selector (%fs). The selectors in gdt table are rearranged for cache line optimization. - fask_vfork is now on as default for both UP and SMP. - Some aio code cleanup. Reviewed by: Alan Cox <alc@cs.rice.edu> John Dyson <dyson@iquest.net> Julian Elischer <julian@whistel.com> Bruce Evans <bde@zeta.org.au> David Greenman <dg@root.com>
|
#
51bb7ba6 |
|
25-Apr-1999 |
Alan Cox <alc@FreeBSD.org> |
pmap_dispose_proc and pmap_copy_page: Conditionally compile 386-specific code. pmap_enter: Eliminate unnecessary TLB shootdowns. pmap_zero_page and pmap_zero_page_area: Use invltlb_1pg instead of duplicating the code.
|
#
11a9f83f |
|
23-Apr-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Make pmap_collect() an official pmap interface.
|
#
270da415 |
|
19-Apr-1999 |
Alan Cox <alc@FreeBSD.org> |
_pmap_unwire_pte_hold and pmap_remove_page: Use pmap_TLB_invalidate instead of invltlb_1pg to eliminate unnecessary IPIs. pmap_remove, pmap_protect and pmap_remove_pages: Use pmap_TLB_invalidate_all instead of invltlb to eliminate unnecessary IPIs. pmap_copy: Use cpu_invltlb instead of invltlb when updating APTDpde. pmap_changebit: Rather than deleting the unused "set bit" option (which may be useful later), make pmap_changebit an inline that is used by the new pmap_clearbit procedure. Collectively, the first three changes reduce the number of TLB shootdown IPIs by 1/3 for a kernel compile.
|
#
53134efb |
|
09-Apr-1999 |
Alan Cox <alc@FreeBSD.org> |
pmap_remove_pte: Use "loadandclear" to update the pte. pmap_changebit and pmap_ts_referenced: Switch to pmap_TLB_invalidate from invltlb.
|
#
4ffd949e |
|
06-Apr-1999 |
Mike Smith <msmith@FreeBSD.org> |
mem.c Split out ioctl handler a little more cleanly, add memory range attribute handling for both kernel and user-space consumers. pmap.c Remove obsolete P6 MTRR-related code. i686_mem.c Map generic memory-range attribute interface to the P6 MTRR model.
|
#
47b9dbd4 |
|
05-Apr-1999 |
Alan Cox <alc@FreeBSD.org> |
Two changes to pmap_remove_all: 1. Switch to pmap_TLB_invalidate from invltlb, eliminating a full TLB flush where a single-page flush suffices. (Also, this eliminates some unnecessary IPIs.) 2. Use "loadandclear" to update the pte, eliminating a race condition on SMPs. Change #2 should be committed to -STABLE.
|
#
8d17e694 |
|
05-Apr-1999 |
Julian Elischer <julian@FreeBSD.org> |
Catch a case spotted by Tor where files mmapped could leave garbage in the unallocated parts of the last page when the file ended on a frag but not a page boundary. Delimited by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF, in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c ufs/ufs/ufs_readwrite.c kern/vfs_bio.c Submitted by: Matt Dillon <dillon@freebsd.org> Reviewed by: Alan Cox <alc@freebsd.org>
|
#
087e80a9 |
|
02-Apr-1999 |
Alan Cox <alc@FreeBSD.org> |
Put in place the infrastructure for improved UP and SMP TLB management. In particular, replace the unused field pmap::pm_flag by pmap::pm_active, which is a bit mask representing which processors have the pmap activated. (Thus, it is a simple Boolean on UPs.) Also, eliminate an unnecessary memory reference from cpu_switch() in swtch.s. Assisted by: John S. Dyson <dyson@iquest.net> Tested by: Luoqi Chen <luoqi@watermarkgroup.com>, Poul-Henning Kamp <phk@critter.freebsd.dk>
|
#
10e77073 |
|
13-Mar-1999 |
Alan Cox <alc@FreeBSD.org> |
pmap_qenter/pmap_qremove: Use the pmap_kenter/pmap_kremove inline functions instead of duplicating them. pmap_remove_all: Eliminate an unused (but initialized) variable. pmap_ts_reference: Change the implementation. The new implementation is much smaller and simpler, but functionally identical. (Reviewed by "John S. Dyson" <dyson@iquest.net>.)
|
#
901671c0 |
|
05-Mar-1999 |
Alan Cox <alc@FreeBSD.org> |
Fix an SMP-only TLB invalidation bug. Specifically, disable a TLB invalidation optimization that won't work given the limitations of our current SMP support. This patch should be applied to -stable ASAP. Thanks to John Capo <jc@irbs.com>, Steve Kargl <sgk@troutmask.apl.washington.edu>, and Chuck Robey <chuckr@mat.net> for testing.
|
#
b1028ad1 |
|
19-Feb-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper. Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillon <dillon@apollo.backplane.com>
|
#
0a5e03dd |
|
27-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
7dbf82dc |
|
23-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
Change all manual settings of vm_page_t->dirty = VM_PAGE_BITS_ALL to use the vm_page_dirty() inline. The inline can thus do sanity checks ( or not ) over all cases.
|
#
1c7c3c6a |
|
21-Jan-1999 |
Matthew Dillon <dillon@FreeBSD.org> |
This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
|
#
fd8d7e38 |
|
11-Jan-1999 |
Eivind Eklund <eivind@FreeBSD.org> |
Silence warnings by removing unused convenience function and globalizing debugging functions.
|
#
c197d61d |
|
09-Jan-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Oops --<, replace 1.216 with a version that actually checks pv_entries (and was tested for a month or two in production). Noticed by: Stephen McKay Stephen also suggested removing the complication altogether. I don't do it as it would be a backout of a large part of 1.190 (from 1998/03/16)...
|
#
fcf37ac3 |
|
08-Jan-1999 |
Luoqi Chen <luoqi@FreeBSD.org> |
Allocate kernel page table object (kptobj) before any kmem_alloc calls. On a system with a large amount of ram (e.g. 2G), allocation of per-page data structures (512K physical pages) could easily bust the initial kernel page table (36M), and growth of kernel page table requires kptobj.
|
#
6143bceb |
|
07-Jan-1999 |
Dmitrij Tejblum <dt@FreeBSD.org> |
Make pmap_ts_referenced check more than 1 pv_entry. (One should be careful when moving elements to the tail of a list in a loop...)
|
#
f1d19042 |
|
07-Dec-1998 |
Archie Cobbs <archie@FreeBSD.org> |
The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.
|
#
18830dba |
|
26-Nov-1998 |
Tor Egge <tegge@FreeBSD.org> |
Don't forget to update the pmap associated with aio daemons when adding new page directory entries for a growing kernel virtual address space.
|
#
38cc2d93 |
|
24-Nov-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Move the declaration of PPro_vmtrr from the header file to pmap.c, replacing the one in the header file with a definition. This makes it easier to work with tools that grok ANSI C only.
|
#
1916cd29 |
|
07-Nov-1998 |
Mike Smith <msmith@FreeBSD.org> |
Enable 686 class optimisations for all 686-class processors, not just the Pentium Pro. This resolves the "Dog slow SMP" issue for Pentium II systems.
|
#
73007561 |
|
28-Oct-1998 |
David Greenman <dg@FreeBSD.org> |
Added a second argument, "activate" to the vm_page_unwire() call so that the caller can select either inactive or active queue to put the page on.
|
#
9b827155 |
|
21-Oct-1998 |
David Greenman <dg@FreeBSD.org> |
Decrement the now unused page table page's wire_count prior to freeing it. It will soon be required that pages have a zero wire_count when being freed.
|
#
3bfe64f9 |
|
06-Sep-1998 |
Tor Egge <tegge@FreeBSD.org> |
Don't go below the low water mark of free pages due to optional prefaulting of pages. PR: 2431
|
#
569d43a2 |
|
04-Sep-1998 |
Andrey A. Chernov <ache@FreeBSD.org> |
PAGE_WAKEUP -> vm_page_wakeup
|
#
1fcee469 |
|
23-Aug-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed printf format errors.
|
#
69ed480f |
|
15-Aug-1998 |
Bruce Evans <bde@FreeBSD.org> |
pmap.c: Cast pointers to (vm_offset_t) instead of to (u_long) (as before) or to (uintptr_t)(void *) (as would be more correct). Don't cast vm_offset_t's to (u_long) just to do arithmetic on them. mp_machdep.c: Cast pointers to (uintptr_t) instead of to (u_long). Don't forget to cast pointers to (void *) first or to recover from integral possible integral promotions, although this is too much work for machine-dependent code. vm code generally avoids warnings for pointer vs long size mismatches by using vm_offset_t to represent pointers; pmap.c often uses plain `unsigned int' instead of vm_offset_t and didn't use u_long elsewhere, but this style was messed up by code apparently imported from mp_machdep.c.
|
#
767dfb80 |
|
11-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed printf format errors.
|
#
3d1af38b |
|
11-Jul-1998 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Don't disable pmap_setdevram() which isn't called, but which could be, but instead disable pmap_setvidram() which is called, but probably shouldn't be. PR: 7227, 7240
|
#
ac1e407b |
|
11-Jul-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed printf format errors.
|
#
cf2819cc |
|
21-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Make flushing dirty pages work correctly on filesystems that unexpectedly do not complete writes even with sync I/O requests. This should help the behavior of mmaped files when using softupdates (and perhaps in other circumstances also.)
|
#
58067a99 |
|
19-May-1998 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Make the size of the msgbuf (dmesg) a "normal" option.
|
#
12d311d0 |
|
18-May-1998 |
Tor Egge <tegge@FreeBSD.org> |
Back out part of revision 1.198 commit (clearing kernel stack pages). By request from David Greenman <dg@root.com>
|
#
5931a9c2 |
|
17-May-1998 |
Tor Egge <tegge@FreeBSD.org> |
For SMP, use prv_PPAGE1/prv_PMAP1 instead of PADDR1/PMAP1. get_ptbase and pmap_pte_quick no longer generate IPIs. This should reduce the number of IPIs during heavy paging.
|
#
cf4b29e4 |
|
17-May-1998 |
Tor Egge <tegge@FreeBSD.org> |
Clear kernel stack pages before usage. Correct panic message in pmap_zero_page (s/CMAP /CMAP2 /).
|
#
424edf1b |
|
15-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Disable the auto-Write Combining setup for the pmap code. This worked on a couple of machines of mine, but appears to cause problems on others.
|
#
fcf1880f |
|
11-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Change some tests from CPU_CLASS686 to CPU_686 as appropriate, and also correct a serious omission that would cause process failures due to forgetting an invltlb-type operation. This was just a transcription problem.
|
#
5498a452 |
|
10-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Support better performance with P6 architectures and in SMP mode. Unnecessary TLB flushes removed. More efficient page zeroing on P6 (modify page only if non-zero.)
|
#
f0175db1 |
|
10-May-1998 |
John Dyson <dyson@FreeBSD.org> |
Attempt to set write combining mode for graphics devices.
|
#
78a81826 |
|
19-Apr-1998 |
Bruce Evans <bde@FreeBSD.org> |
Support compiling with gcc -pedantic (don't use a bogus, null cast).
|
#
c1087c13 |
|
15-Apr-1998 |
Bruce Evans <bde@FreeBSD.org> |
Support compiling with `gcc -ansi'.
|
#
55caa497 |
|
06-Apr-1998 |
Peter Wemm <peter@FreeBSD.org> |
Bogus casts
|
#
bef608bd |
|
15-Mar-1998 |
John Dyson <dyson@FreeBSD.org> |
Some VM improvements, including elimination of a lot of Sig-11 problems. Tor Egge and others have helped with various VM bugs lately, but don't blame him -- blame me!!!
pmap.c:
1) Create an object for kernel page table allocations. This fixes a bogus allocation method previously used for such, by grabbing pages from the kernel object, using bogus pindexes. (This was a code cleanup, and perhaps a minor system stability issue.)
2) Pre-set the modify and accessed bits when prudent. This will decrease bus traffic under certain circumstances.
vfs_bio.c, vfs_cluster.c:
3) Rather than calculating the beginning virtual byte offset multiple times, stick the offset into the buffer header, so that the calculated offset can be reused. (Long long multiplies are often expensive, and this is a probably unmeasurable performance improvement, and code cleanup.)
vfs_bio.c:
4) Handle write recursion more intelligently (but not perfectly) so that it is less likely to cause a system panic, and is also much more robust.
5) getblk incorrectly wrote out blocks that are incorrectly sized. The problem is fixed, and blocks are written out ONLY when B_DELWRI is true.
6) Check that already constituted buffers have fully valid pages. If not, then make sure that the B_CACHE bit is not set. (This was a major source of Sig-11 type problems.)
7) Fix a potential system deadlock due to an incorrectly specified sleep priority while waiting for a buffer write operation. The change that I made opens the system up to serious problems, and we need to examine the issue of process sleep priorities.
vfs_cluster.c, vfs_bio.c:
8) Make clustered reads work more correctly (and more completely) when buffers are already constituted, but not fully valid. (This was another system reliability issue.)
vfs_subr.c, ffs_inode.c:
9) Create a vtruncbuf function, which is used by filesystems that can truncate files. The vinvalbuf forced a file sync type operation, while vtruncbuf only invalidates the buffers past the new end of file, and also invalidates the appropriate pages. (This was a system reliability and performance issue.)
10) Modify FFS to use vtruncbuf.
vm_object.c:
11) Make the object rundown mechanism for OBJT_VNODE type objects work more correctly. Included in that fix, create pager entries for the OBJT_DEAD pager type, so that paging requests that might slip in during race conditions are properly handled. (This was a system reliability issue.)
vm_page.c:
12) Make some of the page validation routines be a little less picky about arguments passed to them. Also, support page invalidation changing the object generation count so that we handle generation counts a little more robustly.
vm_pageout.c:
13) Further reduce pageout daemon activity when the system doesn't need help from it. There should be no additional performance decrease even when the pageout daemon is running. (This was a significant performance issue.)
vnode_pager.c:
14) Teach the vnode pager to handle race conditions during vnode deallocations.
|
#
005092bb |
|
09-Mar-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Turn "PMAP_SHPGPERPROC" into a new-style option, add it to LINT, and document it there.
|
#
8f9110f6 |
|
07-Mar-1998 |
John Dyson <dyson@FreeBSD.org> |
This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated.
1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid.
2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them.
3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state.
4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances.
5) Remove a gratuitous buffer wakeup in vfs_vmio_release.
6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version.
7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation.
8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed.
9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals.
10) Correct and clean up spec_getpages.
11) Implement a fully functional nfs_getpages, nfs_putpages.
12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.)
13) Properly support MS_INVALIDATE on NFS.
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean.
15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (Use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads.
16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.)
17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.)
18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
|
#
ffc82b0a |
|
28-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
1) Use a more consistent page wait methodology.
2) Do not unnecessarily force page blocking when paging pages out.
3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.)
4) Enable vfs_ioopt=1 as a default.
5) Fix and enable the page prezeroing in SMP mode.
All in all, SMP systems especially should show a significant improvement in "snappiness."
|
#
045b6fef |
|
12-Feb-1998 |
Bruce Evans <bde@FreeBSD.org> |
Fixed initialization of the 4MB page. Kernels larger than about 2.75MB (from _btext to _end) crashed in pmap_bootstrap(). Smaller kernels worked accidentally.
|
#
b2651781 |
|
10-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Fix warning after previous staticization.
|
#
303b270b |
|
08-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Staticize.
|
#
0b08f5f7 |
|
05-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Back out DIAGNOSTIC changes.
|
#
95461b45 |
|
04-Feb-1998 |
John Dyson <dyson@FreeBSD.org> |
1) Start using a cleaner and more consistent page allocator instead of the various ad-hoc schemes.
2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3) When appropriate, set the PG_A or PG_M bits a priori to both avoid some processor errata, and to minimize redundant processor updating of page tables.
4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed.
5) Streamline read-only to read-write page mappings.
6) For pmap_copy_page, don't enable write mapping for source page.
7) Correct and clean up pmap_incore.
8) Cluster initial kern_exec paging.
9) Removal of some minor lint from kern_malloc.
10) Correct some ioopt code.
11) Remove some dead code from the MI swapout routine.
12) Correct vm_object_deallocate (to remove backing_object ref.)
13) Fix dead object handling, which had problems under heavy memory load.
14) Add minor vm_page_lookup improvements.
15) Some pages are not in objects, so make sure that vm_page.c can properly support such pages.
16) Add some more page deficit handling.
17) Some minor code readability improvements.
|
#
47cfdb16 |
|
04-Feb-1998 |
Eivind Eklund <eivind@FreeBSD.org> |
Turn DIAGNOSTIC into a new-style option.
|
#
44429dc4 |
|
03-Feb-1998 |
Bruce Evans <bde@FreeBSD.org> |
Converted DISABLE_PSE to a new-style option. Fixed some formatting in options.i386.
|
#
eaf13dd7 |
|
31-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.
|
#
2d8acc0f |
|
22-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
VM level code cleanups.
1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous.
2) vm_maps don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.)
13) Significantly shrink, clean up, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
|
#
47221757 |
|
17-Jan-1998 |
John Dyson <dyson@FreeBSD.org> |
Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean up some loose ends in swap pager memory management. The system should be much more stable, but all subtle bugs aren't fixed yet.
|
#
841fc368 |
|
22-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Correct my previous fix for the UPAGES problem.
|
#
adbf9b6f |
|
21-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
Hopefully fix the problem with the TLB not being updated correctly. Problem tracked down by bde@freebsd.org, but this is an attempted efficient fix.
|
#
82566551 |
|
13-Dec-1997 |
John Dyson <dyson@FreeBSD.org> |
After one of my analysis passes to evaluate methods for SMP TLB mgmt, I noticed some major enhancements available for UP situations. The number of UP TLB flushes is decreased very significantly by these changes. Since a TLB flush appears to cost a minimum of approximately 80 cycles, this is a "nice" enhancement, equivalent to eliminating between 40 and 160 instructions per TLB flush. Changes include making sure that kernel threads all use the same PTD, and eliminating unneeded PTD switches at context switch time.
|
#
96a73b40 |
|
20-Nov-1997 |
Bruce Evans <bde@FreeBSD.org> |
Moved some extern declarations to header files (unused ones to /dev/null).
|
#
31e52254 |
|
07-Nov-1997 |
Tor Egge <tegge@FreeBSD.org> |
Use UPAGES when setting up private pages for SMP (which includes idle stack).
|
#
0abc78a6 |
|
07-Nov-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Rename some local variables to avoid shadowing other local variables. Found by: -Wshadow
|
#
4a11ca4e |
|
07-Nov-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused
|
#
55b211e3 |
|
28-Oct-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused #includes.
|
#
d80130d3 |
|
26-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Check to see if the pv_limits are initialized before checking them.
|
#
fe3e6985 |
|
25-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Change the initial amount of memory allocated for pv_entries to be proportional to the amount of system memory. Also, clean-up some of the new pv_entry mgmt code.
|
#
0c8029e9 |
|
24-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Somehow an error crept in during the previous commit.
|
#
5985940e |
|
24-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Support garbage collecting the pmap pv entries. The management doesn't happen until the system would have nearly failed anyway, so no significant overhead is added. This helps large systems with lots of processes.
|
#
0a80f406 |
|
24-Oct-1997 |
John Dyson <dyson@FreeBSD.org> |
Decrease the initial allocation for the zone allocations.
|
#
55166637 |
|
11-Oct-1997 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Distribute and staticize a lot of the malloc M_* types. Substantial input from: bde
|
#
a65247e1 |
|
20-Sep-1997 |
John Dyson <dyson@FreeBSD.org> |
Add support for more than 1 page of idle process stack on SMP systems.
|
#
5b05023a |
|
06-Sep-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix an intermittent problem during SMP code operation. Not all of the idle page table directories for all of the processors were being updated during kernel grow operations. The problem appears to be gone now.
|
#
9a3b3e8b |
|
26-Aug-1997 |
Peter Wemm <peter@FreeBSD.org> |
Clean up the SMP AP bootstrap and eliminate the wretched idle procs.
- We now have enough per-cpu idle context, the real idle loop has been revived (cpu's halt now with nothing to do).
- Some preliminary support for running some operations outside the global lock (eg: zeroing "free but not yet zeroed pages") is present but appears to cause problems. Off by default.
- The smp_active sysctl now behaves differently. It's merely a 'true/false' option. Setting smp_active to zero causes the AP's to halt in the idle loop and stop scheduling processes.
- Bootstrap is a lot safer. Instead of sharing a statically compiled in stack a number of times (which has caused lots of problems) and then abandoning it, we use the idle context to boot the AP's directly. This should help >2 cpu support since the bootlock stuff was in doubt.
- Print physical apic id in traps.. helps identify private pages getting out of sync. (You don't want to know how much hair I tore out with this!)
More cleanup to follow, this is more of a checkpoint than a 'finished' thing.
|
#
10a1aa05 |
|
25-Aug-1997 |
Bruce Evans <bde@FreeBSD.org> |
Finished (?) support for DISABLE_PSE option. 2-3MB of kernel vm was sometimes wasted. Fixed type mismatches for functions with vm_prot_t's as args. vm_prot_t is u_char, so the prototypes should have used promoteof(u_char) to match the old-style function definitions. They use just vm_prot_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change vm_prot_t to u_int, to optimize for time instead of space. Removed a stale comment.
|
#
d3d1eb99 |
|
06-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix the DDB breakpoint code when using the 4MB page support.
|
#
f1c1c5b5 |
|
06-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
More vm_zone cleanup. The sysctl now accounts for items better, and counts the number of allocations.
|
#
0d65e566 |
|
05-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
Another attempt at cleaning up the new memory allocator.
|
#
b79933eb |
|
05-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix some bugs, document vm_zone better. Add copyright to vm_zone.h. Use the new zone code in pmap.c so that we can get rid of the ugly ad-hoc allocations in pmap.c.
|
#
b25b051b |
|
04-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
Modify pmap to use our new memory allocator.
|
#
f6363c84 |
|
04-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
Slightly reorder some operations so that the main processor gets global mappings early on.
|
#
de5858ab |
|
04-Aug-1997 |
John Dyson <dyson@FreeBSD.org> |
Remove the PMAP_PVLIST conditionals in pmap.*, and another unneeded define.
|
#
322d7a88 |
|
20-Jul-1997 |
John Dyson <dyson@FreeBSD.org> |
Fix a crash that has manifested itself while running X after the 4MB page upgrades.
|
#
e31521c3 |
|
20-Jul-1997 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused #includes.
|
#
78342719 |
|
17-Jul-1997 |
John Dyson <dyson@FreeBSD.org> |
Hopefully fix a few problems that could cause hangs in SMP mode. 1) Make sure that the region mapped by a 4MB page is properly aligned. 2) Don't turn on the PG_G flag in locore for SMP. I plan to do that later in startup anyway. 3) Make sure the 2nd processor has PSE enabled, so that 4MB pages don't hose it. We don't use PG_G yet on SMP -- there is work to be done to make that work correctly. It isn't that important anyway...
|
#
0a0a85b3 |
|
16-Jul-1997 |
John Dyson <dyson@FreeBSD.org> |
Add support for 4MB pages. This includes the .text and .data parts of the kernel, and also most of the dynamic parts of the kernel. Additionally, 4MB pages will be allocated for display buffers as appropriate (only.) The 4MB support for SMP isn't complete, but doesn't interfere with operation either.
|
#
a4ec81c7 |
|
25-Jun-1997 |
Tor Egge <tegge@FreeBSD.org> |
Allow kernel configuration file to override PMAP_SHPGPERPROC. The default value (200) is too low in some environments, causing a fatal "panic: get_pv_entry: cannot get a pv_entry_t". The same panic might still occur due to temporary shortage of free physical memory (cf. PR i386/2431).
|
#
b3196e4b |
|
22-Jun-1997 |
Peter Wemm <peter@FreeBSD.org> |
Preliminary support for per-cpu data pages. This eliminates a lot of #ifdef SMP type code. Things like _curproc reside in a data page that is unique on each cpu, eliminating the expensive macros like: #define curproc (SMPcurproc[cpunumber()]) There are some unresolved bootstrap and address space sharing issues at present, but Steve is waiting on this for other work. There is still some strictly temporary code present that isn't exactly pretty. This is part of a larger change that has run into some bumps, this part is standalone so it should be safe. The temporary code goes away when the full idle cpu support is finished. Reviewed by: fsmp, dyson
|
#
a8baaafd |
|
28-May-1997 |
Steve Passe <fsmp@FreeBSD.org> |
Code such as apic_base[APIC_ID] converted to lapic__id. Changes to pmap.c for lapic_t lapic and ioapic_t ioapic pointers; currently equal to apic_base and io_apic_base, they will stand alone with the private page mapping.
|
#
288e2230 |
|
28-May-1997 |
Peter Wemm <peter@FreeBSD.org> |
Remove no longer needed opt_smp.h includes.
|
#
a8a74574 |
|
26-Apr-1997 |
Peter Wemm <peter@FreeBSD.org> |
Whoops.. We forgot to turn off the 4MB Virtual==Physical mapping at address zero from bootstrap in the non-SMP case. Noticed by: bde
|
#
477a642c |
|
26-Apr-1997 |
Peter Wemm <peter@FreeBSD.org> |
Man the liferafts! Here comes the long awaited SMP -> -current merge! There are various options documented in i386/conf/LINT, there is more to come over the next few days. The kernel should run pretty much "as before" without the options to activate SMP mode. There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment. This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
|
#
aec17d50 |
|
12-Apr-1997 |
John Dyson <dyson@FreeBSD.org> |
The pmap code was too generous in the allocation of kva space for the pv entries. This problem has become obvious due to the increase in the size of the pv entries. We need to create a more intelligent policy for pv entry management eventually. Submitted by: David Greenman <dg@freebsd.org>
|
#
5856e12e |
|
12-Apr-1997 |
John Dyson <dyson@FreeBSD.org> |
Fully implement vfork. Vfork is now much much faster than even our fork. (On my machine, fork is about 240usecs, vfork is 78usecs.) Implement rfork(!RFPROC !RFMEM), which allows a thread to divorce its memory from the other threads of a group. Implement rfork(!RFPROC RFCFDG), which closes all file descriptors, eliminating possible existing shares with other threads/processes. Implement rfork(!RFPROC RFFDG), which divorces the file descriptors for a thread from the rest of the group. Fix the case where a thread does an exec. It is almost nonsense for a thread to modify the other threads address space by an exec, so we now automatically divorce the address space before modifying it.
|
#
a2a1c95c |
|
07-Apr-1997 |
Peter Wemm <peter@FreeBSD.org> |
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!) Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit. Context switch the tss_esp0 pointer in the common tss. This is a lot simpler than switching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset. The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS). Simplify the pmap code to manage process contexts; we no longer have to double map the UPAGES, which simplifies and should measurably speed up fork(). The following parts came from John Dyson: Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out. Move the upages object (upobj) from the vmspace to the proc structure. Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
|
#
6875d254 |
|
22-Feb-1997 |
Peter Wemm <peter@FreeBSD.org> |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
996c772f |
|
09-Feb-1997 |
John Dyson <dyson@FreeBSD.org> |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
3def4913 |
|
30-Jan-1997 |
David Greenman <dg@FreeBSD.org> |
Removed unnecessary PG_N flag from device memory mappings. This is handled by the CPU/chipset already and was apparently triggering a hardware bug that causes strange parity errors.
|
#
1130b656 |
|
14-Jan-1997 |
Jordan K. Hubbard <jkh@FreeBSD.org> |
Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
b447ce90 |
|
11-Jan-1997 |
John Dyson <dyson@FreeBSD.org> |
When we changed pmap_protect to support adding the writeable attribute to a page range, we forgot to set the PG_WRITEABLE flag in the vm_page_t. This fixes that problem.
|
#
9b5a5d81 |
|
11-Jan-1997 |
John Dyson <dyson@FreeBSD.org> |
Prepare better for multi-platform by eliminating another required pmap routine (pmap_is_referenced.) Upper level recoded to use pmap_ts_referenced.
|
#
f486bc1d |
|
28-Dec-1996 |
John Dyson <dyson@FreeBSD.org> |
Allow pmap_protect to increase permissions. This mod can eliminate the need for unnecessary vm_faults. Submitted by: Alan Cox <alc@cs.rice.edu>
|
#
d22671dc |
|
10-Nov-1996 |
John Dyson <dyson@FreeBSD.org> |
Support the PG_G flag on Pentium-Pro processors. This pretty much eliminates the unnecessary unmapping of the kernel during context switches and during invtlb...
|
#
1d6ccf9c |
|
07-Nov-1996 |
Joerg Wunsch <joerg@FreeBSD.org> |
Fix the message buffer mapping. This actually allows increasing the message buffer size in <sys/msgbuf.h>. Reviewed by: davidg,joerg Submitted by: bde
|
#
5c2a644a |
|
02-Nov-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix a problem with running down processes that have left wired mappings with mlock. This problem only occurred because of the quick unmap code not respecting the wired-ness of pages in the process. In the future, we need to eliminate the dependency intrinsic to the design of the code that wired pages actually be mapped. It is kind-of bogus not to have wired pages mapped, but it is also a weakness for the code to fall flat because of a missing page. This should fix a problem that Tor Egge has been having, and also should be included into 2.2-RELEASE.
|
#
845c4ec4 |
|
22-Oct-1996 |
John Dyson <dyson@FreeBSD.org> |
Account for the UPAGES in the same way as before moving the MD code from vm_glue into pmap.c. Now RSS should appear to be the same as before.
|
#
675878e7 |
|
14-Oct-1996 |
John Dyson <dyson@FreeBSD.org> |
Move much of the machine dependent code from vm_glue.c into pmap.c. Along with the improved organization, small proc fork performance is now about 5%-10% faster.
|
#
8f3a9a1b |
|
12-Oct-1996 |
John Dyson <dyson@FreeBSD.org> |
Minor optimization for final rundown of a pmap.
|
#
9d3fbbb5 |
|
12-Oct-1996 |
John Dyson <dyson@FreeBSD.org> |
Performance optimizations. One of which was meant to go in before the previous snap. Specifically, kern_exit and kern_exec now makes a call into the pmap module to do a very fast removal of pages from the address space. Additionally, the pmap module now updates the PG_MAPPED and PG_WRITABLE flags. This is an optional optimization, but helpful on the X86.
|
#
da2186af |
|
12-Oct-1996 |
Bruce Evans <bde@FreeBSD.org> |
Cleaned up:
- fixed a sloppy common-style declaration.
- removed an unused macro.
- moved once-used macros to the one file where they are used.
- removed unused forward struct declarations.
- removed __pure.
- declared inline functions as inline in their prototype as well as in their definition (gcc unfortunately allows the prototype to be inconsistent).
- staticized.
|
#
c20b324b |
|
09-Oct-1996 |
Bruce Evans <bde@FreeBSD.org> |
Put I*86_CPU defines in opt_cpu.h.
|
#
27e9b35e |
|
28-Sep-1996 |
John Dyson <dyson@FreeBSD.org> |
Essentially rename pmap_update to be invltlb. It is a very machine dependent operation, and not really a correct name. invltlb and invlpg are more descriptive, and in the case of invlpg, a real opcode. Additionally, fix the tlb management code for 386 machines.
|
#
f53687f7 |
|
28-Sep-1996 |
Bruce Evans <bde@FreeBSD.org> |
Restored my change in rev.1.119 which was clobbered by the previous commit.
|
#
9299d3c4 |
|
27-Sep-1996 |
John Dyson <dyson@FreeBSD.org> |
Move pmap_update_1pg to cpufunc.h. Additionally, use the invlpg opcode instead of the nasty looking .byte directives. There are some other minor micro-level code improvements to pmap.c
|
#
c254b6e8 |
|
13-Sep-1996 |
Bruce Evans <bde@FreeBSD.org> |
Made debugging code (pmap_pvdump()) compile again so that I can test LINT. I don't know if it actually works.
|
#
dba940b4 |
|
11-Sep-1996 |
John Dyson <dyson@FreeBSD.org> |
Primarily a fix so that pages are properly tracked for being modified. Pages that are removed by the pageout daemon were the worst affected. Additionally, numerous minor cleanups, including better handling of busy page table pages. This commit fixes the worst of the pmap problems recently introduced.
|
#
690db31d |
|
10-Sep-1996 |
John Dyson <dyson@FreeBSD.org> |
A minor fix to the new pmap code. This might not fix the global problems with the last major pmap commits.
|
#
5070c7f8 |
|
08-Sep-1996 |
John Dyson <dyson@FreeBSD.org> |
Addition of page coloring support. Various levels of coloring are afforded. The default level works with minimal overhead, but one can also enable full, efficient use of a 512K cache. (Parameters can be generated to support arbitrary cache sizes also.)
|
#
b8e251a5 |
|
08-Sep-1996 |
John Dyson <dyson@FreeBSD.org> |
Improve the scalability of certain pmap operations.
|
#
67bf6868 |
|
29-Jul-1996 |
John Dyson <dyson@FreeBSD.org> |
Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but a 'strange' problem with a 486 laptop could not be tracked down. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.
|
#
78d43461 |
|
29-Jul-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix a problem with a DEBUG section of code.
|
#
b7fb3572 |
|
28-Jul-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix an error in statement order in pmap_remove_pages, remove the pmap pte hint (for now), and general code cleanup.
|
#
da54aa7f |
|
28-Jul-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix a problem where pmap update was not being done for kernel_pmap. Also remove some (currently) gratuitous tests for PG_V... This bug could have caused various anomalous (temporary) behavior.
|
#
4f4d35ed |
|
26-Jul-1996 |
John Dyson <dyson@FreeBSD.org> |
This commit is meant to solve a couple of VM system problems or performance issues. 1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines. 2) Removal of some *evil* PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations. 3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)). 4) Changed pv queue manipulations/structures to be TAILQ's. 5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance *significantly*. DG helped and came up with most of the solution for this one. 6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation. 7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.
|
#
73571d2d |
|
12-Jul-1996 |
Bruce Evans <bde@FreeBSD.org> |
Removed "optimization" using gcc's builtin memcpy instead of bcopy. There is little difference now since the amount copied is large, and bcopy will become much faster on some machines.
|
#
f4346724 |
|
25-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
When page table pages were removed from process address space, the resident page stats were not being decremented. This mod corrects that problem.
|
#
cb87c9be |
|
24-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Limit the scan for preloading pte's to the end of an object.
|
#
9b2b0822 |
|
17-Jun-1996 |
Bruce Evans <bde@FreeBSD.org> |
Removed unused #includes of <i386/isa/icu.h> and <i386/isa/isa.h>. icu.h is only used by the icu support modules and by a few drivers that know too much about the icu (most only use it to convert `n' to `IRQn'). isa.h is only used by ioconf.c and by a few drivers that know too much about isa addresses (a few have to, because config is deficient).
|
#
ef743ce6 |
|
16-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Several bugfixes/improvements: 1) Make it much less likely to miss a wakeup in vm_page_free_wakeup 2) Create a new entry point into pmap: pmap_ts_referenced, eliminates the need to scan the pv lists twice in many cases. Perhaps there is a lot more to do here to work on minimizing pv list manipulation 3) Minor improvements to vm_pageout including the use of pmap_ts_ref. 4) Major changes and code improvement to pmap. This code has had several serious bugs in page table page manipulation. In order to simplify the problem, and hopefully solve it once and for all, page table pages are no longer "managed" with the pv list stuff. Page table pages are only (mapped and held/wired) or (free and unused) now. Page table pages are never inactive, active or cached. These changes have probably fixed the hold count problems, but if they haven't, then the code is simpler anyway for future bugfixing. 5) The pmap code has been sorely in need of re-organization, and I have taken a first (of probably many) steps. Please tell me if you have any ideas.
|
#
419702a4 |
|
12-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix a very significant cnt.v_wire_count leak in vm_page.c, and some minor leaks in pmap.c. Bruce Evans made me aware of this problem.
|
#
c23670e2 |
|
11-Jun-1996 |
Gary Palmer <gpalmer@FreeBSD.org> |
Clean up -Wunused warnings. Reviewed by: bde
|
#
886d3e11 |
|
08-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Adjust the threshold for blocking on movement of pages from the cache queue in vm_fault. Move the PG_BUSY in vm_fault to the correct place. Remove redundant/unnecessary code in pmap.c. Properly block on rundown of page table pages, if they are busy. I think that the VM system is in pretty good shape now, and the following individuals (among others, in no particular order) have helped with this recent bunch of bugs, thanks! If I left anyone out, I apologize! Stephen McKay, Stephen Hocking, Eric J. Chet, Dan O'Brien, James Raynard, Marc Fournier.
|
#
475dca82 |
|
06-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix a bug in the pmap_object_init_pt routine where pages weren't taken from the cache queue before being mapped into the process.
|
#
3ccd871c |
|
05-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
I missed a case of the page table page dirty-bit fix.
|
#
6b6f0008 |
|
04-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Keep page-table pages from ever being sensed as dirty. This should fix some problems with the page-table page management code, since it can't deal with the notion of page-table pages being paged out or in transit. Also, clean up some stylistic issues per some suggestions from Stephen McKay.
|
#
3943b4ea |
|
02-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Don't carry the modified or referenced bits through to the child process during pmap_copy. This minimizes unnecessary swapping or creation of swap space. If there is a hold_count flaw for page-table pages, clear the page before freeing it to lessen the chance of a system crash -- this is a robustness thing only, NOT a fix.
|
#
c2b39c99 |
|
01-Jun-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix the problem with pmap_copy that breaks X in small memory machines. Also close some windows that are opened up by page table allocations. The prefaulting code no longer uses hold counts, but now uses the busy flag for synchronization.
|
#
f35329ac |
|
30-May-1996 |
John Dyson <dyson@FreeBSD.org> |
This commit is dual-purpose, to fix more of the pageout daemon queue corruption problems, and to apply Gary Palmer's code cleanups. David Greenman helped with these problems also. There is still a hang problem using X in small memory machines.
|
#
25695129 |
|
28-May-1996 |
John Dyson <dyson@FreeBSD.org> |
The wrong address (pindex) was being used for the page table directory. No negative side effects right now, but just a clean-up.
|
#
cd61f6c5 |
|
22-May-1996 |
Peter Wemm <peter@FreeBSD.org> |
Fix harmless warning: pmap_nw_modified was not having its arg cast to pt_entry_t like the others inside the DIAGNOSTIC code.
|
#
93d52b3c |
|
21-May-1996 |
John Dyson <dyson@FreeBSD.org> |
A serious error in pmap.c(pmap_remove) is corrected by this. When comparing the PTD pointers, they needed to be masked by PG_FRAME, and they weren't. Also, the "improved" non-386 code wasn't really an improvement, so I simplified and fixed the code. This might have caused some of the panics caused by the VM megacommit.
|
#
ed48f831 |
|
20-May-1996 |
John Dyson <dyson@FreeBSD.org> |
To quote Stephen McKay: pmap_copy is a complex NOP at this moment :-). With this fix from Stephen, we are getting the target fork performance that I have been trying to attain: P5-166, before the mega-commit: 700-800usecs, after: 600usecs, with Stephen's fix: 500usecs!!! Also, this could be the solution of some strange panic problems... Reviewed by: dyson@freebsd.org Submitted by: Stephen McKay <syssgm@devetir.qld.gov.au>
|
#
867a482d |
|
19-May-1996 |
John Dyson <dyson@FreeBSD.org> |
Initial support for mincore and madvise. Both are almost fully supported, except madvise does not page in with MADV_WILLNEED, and MADV_DONTNEED doesn't force dirty pages out.
|
#
b18bfc3d |
|
17-May-1996 |
John Dyson <dyson@FreeBSD.org> |
This set of commits to the VM system does the following, and contains contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me: More usage of the TAILQ macros. Additional minor fix to queue.h. Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm. Significant revamp of the pmap/fork code: 1) PTE's and UPAGES's are NO LONGER in the process's map. 2) PTE's and UPAGES's reside in their own objects. 3) TOTAL elimination of recursive page table pagefaults. 4) The page directory now resides in the PTE object. 5) Implemented pmap_copy, thereby speeding up fork time. 6) Changed the pv entries so that the head is a pointer and not an entire entry. 7) Significant cleanup of pmap_protect, and pmap_remove. 8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module. 9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed. Performance and code cleanups in vm_map: 1) Improved and simplified allocation of map entries. 2) Improved vm_map_copy code. 3) Corrected some minor problems in the simplify code. Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh. Improved the speed of and simplified kmem_malloc. Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existent condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.) Minor reorg of source to perhaps improve locality of ref.
|
#
aa8de40a |
|
03-May-1996 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.
|
#
5084d10d |
|
02-May-1996 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Move atdevbase out of locore.s and into machdep.c Macroize locore.s' page table setup even more, now it's almost readable. Rename PG_U to PG_A (so that I can...) Rename PG_u to PG_U. "PG_u" was just too ugly... Remove some unused vars in pmap.c Remove PG_KR and PG_KW Remove SSIZE Remove SINCR Remove BTOPKERNBASE This concludes my spring cleaning, modulo any bug fixes for messes I have made on the way. (Funny to be back here in pmap.c, that's where my first significant contribution to 386BSD was... :-)
|
#
e911eafc |
|
02-May-1996 |
Poul-Henning Kamp <phk@FreeBSD.org> |
removed: CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei() ptei() kvtopte() ptetov() ispt() ptetoav() &c &c new: NPDEPG Major macro cleanup.
|
#
7847cec1 |
|
21-Apr-1996 |
John Dyson <dyson@FreeBSD.org> |
This fixes a troubling oversight in some of the pmap code enhancements. One of the manifestations of the problem includes the -4 RSS problem in ps. Reviewed by: dyson Submitted by: Stephen McKay <syssgm@devetir.qld.gov.au>
|
#
07b10591 |
|
06-Apr-1996 |
John Dyson <dyson@FreeBSD.org> |
Major cleanups for the pmap code.
|
#
909e5e0e |
|
31-Mar-1996 |
David Greenman <dg@FreeBSD.org> |
Change if/goto into a while loop.
|
#
4e489ec4 |
|
27-Mar-1996 |
John Dyson <dyson@FreeBSD.org> |
Remove a now unnecessary prototype from pmap.c. Also remove now unnecessary vm_fault's of page table pages in trap.c.
|
#
208bfdc9 |
|
27-Mar-1996 |
John Dyson <dyson@FreeBSD.org> |
Significant code cleanup, and some performance improvement. Also, mlock will now work properly without killing the system.
|
#
3ce8e60f |
|
12-Mar-1996 |
John Dyson <dyson@FreeBSD.org> |
Make sure that we pmap_update AFTER modifying the page table entries. The P6 can do a serious job of reordering code, and our stuff could execute incorrectly.
|
#
6ad13830 |
|
10-Mar-1996 |
Jeffrey Hsu <hsu@FreeBSD.org> |
For Lite2: proc LIST changes. Reviewed by: david & bde
|
#
874308f7 |
|
10-Mar-1996 |
John Dyson <dyson@FreeBSD.org> |
Improved efficiency in pmap_remove, and also remove some of the pmap_update optimizations that were probably incorrect.
|
#
9212ebc6 |
|
09-Mar-1996 |
John Dyson <dyson@FreeBSD.org> |
Correct some new and older lurking bugs. Hold count wasn't being handled correctly. Fix some incorrect code that was included to improve performance. Significantly simplify the pmap_use_pt and pmap_unuse_pt subroutines. Add some more diagnostic code.
|
#
d6673cba |
|
24-Feb-1996 |
John Dyson <dyson@FreeBSD.org> |
Re-insert a missing pmap_remove operation.
|
#
3eb77c83 |
|
24-Feb-1996 |
John Dyson <dyson@FreeBSD.org> |
Fix a problem with tracking the modified bit. Eliminate the ugly inline-asm code, and speed up the page-table-page tracking.
|
#
267173e7 |
|
04-Feb-1996 |
David Greenman <dg@FreeBSD.org> |
Rewrote cpu_fork so that it doesn't use pmap_activate, and removed pmap_activate since it's not used anymore. Changed cpu_fork so that it uses one line of inline assembly rather than calling mvesp() to get the current stack pointer. Removed mvesp() since it is no longer being used.
|
#
1a46737f |
|
19-Jan-1996 |
Peter Wemm <peter@FreeBSD.org> |
Some trivial fixes to get it to compile again, plus some new lint: - cpuclass should be cpu_class - CPUCLASS_I386 should be CPUCLASS_386 (^^ those only show up if you compile for i386) - two missing prototypes on new functions - one missing static
|
#
bd7e5f99 |
|
18-Jan-1996 |
John Dyson <dyson@FreeBSD.org> |
Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.
|
#
e3900906 |
|
22-Dec-1995 |
Bruce Evans <bde@FreeBSD.org> |
Staticized code that was hidden by `#ifdef DEBUG'.
|
#
f2c6b65b |
|
17-Dec-1995 |
Bruce Evans <bde@FreeBSD.org> |
Fixed 1TB filesize changes. Some pindexes had bogus names and types but worked because vm_pindex_t is indistinguishable from vm_offset_t.
|
#
c3741af9 |
|
14-Dec-1995 |
Bruce Evans <bde@FreeBSD.org> |
Added a prototype. Merged prototype lists.
|
#
a316d390 |
|
10-Dec-1995 |
John Dyson <dyson@FreeBSD.org> |
Changes to support 1Tb filesizes. Pages are now named by an (object,index) pair instead of (object,offset) pair.
|
#
87b91157 |
|
10-Dec-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Staticize and cleanup. Remove a TON of #includes from machdep.
|
#
efeaf95a |
|
06-Dec-1995 |
David Greenman <dg@FreeBSD.org> |
Untangled the vm.h include file spaghetti.
|
#
8ae2aed2 |
|
03-Dec-1995 |
Bruce Evans <bde@FreeBSD.org> |
Completed function declarations and/or added prototypes.
|
#
07658526 |
|
19-Nov-1995 |
Poul-Henning Kamp <phk@FreeBSD.org> |
Remove unused vars.
|
#
63017f04 |
|
22-Oct-1995 |
David Greenman <dg@FreeBSD.org> |
Remove PG_W bit setting in some cases where it should not be set. Submitted by: John Dyson <dyson>
|
#
b596ee8d |
|
22-Oct-1995 |
David Greenman <dg@FreeBSD.org> |
More improvements to the logic for modify-bit checking. Removed pmap_prefault() code as we don't plan to use it at this point in time. Submitted by: John Dyson <dyson>
|
#
6928ec33 |
|
21-Oct-1995 |
David Greenman <dg@FreeBSD.org> |
Simplified some expressions.
|
#
0937c08c |
|
15-Sep-1995 |
David Greenman <dg@FreeBSD.org> |
1) Killed 'BSDVM_COMPAT'. 2) Killed i386pagesperpage as it is not used by anything. 3) Fixed benign miscalculations in pmap_bootstrap(). 4) Moved allocation of ISA DMA memory to machdep.c. 5) Removed bogus vm_map_find()'s in pmap_init() - the entire range was already allocated by kmem_init(). 6) Added some comments. virtual_avail is still miscalculated (NKPT*NBPG too large), but fixing this properly requires moving the variable initialization into locore.s. Some other day.
|
#
6b837e5d |
|
28-Jul-1995 |
David Greenman <dg@FreeBSD.org> |
Fixed bug I introduced with the memory-size code rewrite that broke floppy DMA buffers...use avail_start not "first". Removed duplicate (and wrong) declaration of phys_avail[]. Submitted by: Bruce Evans, but fixed differently by me.
|
#
24a1cce3 |
|
13-Jul-1995 |
David Greenman <dg@FreeBSD.org> |
NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!! Much needed overhaul of the VM system. Included in this first round of changes: 1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers". 2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, an un_pager structure union was created in the object to contain these items. 3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed. 4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug. 5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong.
We now use tsleep and wakeup directly in the lock routines, for instance. 6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain. 7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance. 8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed. 9) Some almost useless debugging code removed. 10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology. 11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended. 12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although it will need some additional decision code to do this, of course). 13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE.
14) (dyson) Sharing maps have been removed. Their marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes) TODO: 1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size. 2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness. 3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind. 4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems. 5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).
|
#
9b2e5354 |
|
30-May-1995 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Remove trailing whitespace.
|
#
b2b795f0 |
|
11-May-1995 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Fix -Wformat warnings from LINT kernel.
|
#
fcb5be87 |
|
08-Apr-1995 |
David Greenman <dg@FreeBSD.org> |
Cosmetic changes.
|
#
7b0047e2 |
|
30-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Made pmap_testbit a static function.
|
#
880d1d84 |
|
26-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Changed pmap_changebit() into a static function as it always should have been. Submitted by: John Dyson
|
#
b5e8ce9f |
|
16-Mar-1995 |
Bruce Evans <bde@FreeBSD.org> |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
90c47808 |
|
10-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Removed unnecessary routines vm_get_pmap() and vm_put_pmap(). kmem_alloc() returns zero filled memory, so no need to explicitly bzero() it.
|
#
fde2cdc4 |
|
01-Mar-1995 |
David Greenman <dg@FreeBSD.org> |
Various changes from John and myself that do the following: New functions created - vm_object_pip_wakeup and pagedaemon_wakeup that are used to reduce the actual number of wakeups. New function vm_page_protect which is used in conjunction with some new page flags to reduce the number of calls to pmap_page_protect. Minor changes to reduce unnecessary spl nesting. Rewrote vm_page_alloc() to improve readability. Various other mostly cosmetic changes.
|
#
550f8550 |
|
25-Feb-1995 |
Bruce Evans <bde@FreeBSD.org> |
Replace all remaining instances of `i386/include' by `machine' and fix nearby #include inconsistencies.
|
#
7389231d |
|
14-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Killed the pmap_use_pt and pmap_unuse_pt prototypes as they are now in machine/pmap.h.
|
#
87bc4e69 |
|
02-Feb-1995 |
David Greenman <dg@FreeBSD.org> |
Mostly cosmetic changes. Use KERNBASE instead of UPT_MAX_ADDRESS in some comparisons as it is more correct (we want the kernel page tables included). Reorganized some of the expressions for efficiency. Fixed the new pmap_prefault() routine - it would sometimes pick up the wrong page if the page in the shadow was present but the page in object was paged out. The routine remains unused and commented out, however. Explicitly free zero reference count page tables (rather than waiting for the pagedaemon to do it). Submitted by: John Dyson
|
#
7a642ab5 |
|
26-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Fix from Doug Rabson for a panic related to not initializing the kernel's PTD. Submitted by: John Dyson
|
#
d3a10e2c |
|
25-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Comment out pmap_prefault for the time being (perhaps until after 2.1). The object_init_pt routine is still enabled and used, however, and this is where most of the 'pre-faulting' performance improvement comes from.
|
#
3edf89fe |
|
25-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Make sure that the pages being 'pre-faulted' are currently on a queue.
|
#
6100ed1d |
|
25-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Be a bit less fast and loose about setting non-cacheability of pages.
|
#
fbdfe8ac |
|
24-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Changed buffer allocation policy (machdep.c) Moved various pmap 'bit' test/set functions back into real functions; gcc generates better code at the expense of more of it. (pmap.c) Fixed a deadlock problem with pv entry allocations (pmap.c) Added a new, optional function 'pmap_prefault' that does clustered page table preloading (pmap.c) Changed the way that page tables are held onto (trap.c). Submitted by: John Dyson
|
#
7082ec8a |
|
15-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Fixed some page table reference count problems; these changes may not be complete, but should be closer to correct than before.
|
#
92c38536 |
|
13-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
Add missing object_lock/unlock.
|
#
0d94caff |
|
09-Jan-1995 |
David Greenman <dg@FreeBSD.org> |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D. The majority of the merged VM/cache work is by John Dyson. The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme. vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering. vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff. vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption. vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up. vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme. pmap.c vm_map.c Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs. vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping. proc.h Fixed the problem that the p_lock flag was not being cleared on a fork. swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore. machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme. machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers. Submitted by: John Dyson and David Greenman
|
#
94101b52 |
|
18-Dec-1994 |
David Greenman <dg@FreeBSD.org> |
Move page_unhold's in pmap_object_init_pt down one line to guard against a potential race condition.
|
#
931bde7f |
|
17-Dec-1994 |
David Greenman <dg@FreeBSD.org> |
Check for PG_FAKE too in pmap_object_init_pt.
|
#
3fb3086e |
|
08-Oct-1994 |
Poul-Henning Kamp <phk@FreeBSD.org> |
db_disasm.c: Unused var zapped. pmap.c: tons of unused vars zapped, various other warnings silenced. trap.c: unused vars zapped. vm_machdep.c: A wrong argument, which by chance did the right thing, was corrected.
|
#
df9ab304 |
|
16-Sep-1994 |
David Greenman <dg@FreeBSD.org> |
Removed inclusion of pio.h and cpufunc.h (cpufunc.h is included from systm.h). Merged functionality of pio.h into cpufunc.h. Cleaned up some related code.
|
#
9cbeeedd |
|
03-Sep-1994 |
David Greenman <dg@FreeBSD.org> |
Added pmap_mapdev() function to map device memory.
|
#
2c7a40c7 |
|
01-Sep-1994 |
David Greenman <dg@FreeBSD.org> |
Removed all vestiges of tlbflush(). Replaced them with calls to pmap_update(). Made pmap_update an inline assembly function.
|
#
f23b4c91 |
|
18-Aug-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Fix up some sloppy coding practices: - Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above. NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
f540b106 |
|
12-Aug-1994 |
Garrett Wollman <wollman@FreeBSD.org> |
Change all #includes to follow the current Berkeley style. Some of these ``changes'' are actually not changes at all, but CVS sometimes has trouble telling the difference. This also includes support for second-directory compiles. This is not quite complete yet, as `config' doesn't yet do the right thing. You can still make it work trivially, however, by doing the following: rm /sys/compile mkdir /usr/obj/sys/compile ln -s M-. /sys/compile cd /sys/i386/conf config MYKERNEL cd ../../compile/MYKERNEL ln -s /sys @ rm machine ln -s @/i386/include machine make depend make
|
#
8339815f |
|
07-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Made pmap_kenter "TLB safe". ...and then removed all the pmap_updates that are no longer needed because of this.
|
#
a481f200 |
|
07-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Provide support for the upcoming merged VM/buffer cache, and fix a few bugs that have not yet manifested themselves. Submitted by: John Dyson
|
#
c87801fe |
|
06-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Fixed various prototype problems with the pmap functions and the subsequent problems that fixing them caused.
|
#
16f62314 |
|
06-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Incorporated post-1.1.5 work from John Dyson. This includes performance improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/pmap_kremove. These routines allow fast mapping of pages for those architectures that have "normal" MMUs. Also included is a fix to the pageout daemon to properly check a queue end condition. Submitted by: John Dyson
|
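The pmap_qenter/pmap_qremove idea above can be sketched in plain userspace C: map a contiguous run of kernel virtual pages in one pass and invalidate the TLB once for the whole run, rather than once per page. Everything here (`toy_qenter`, the 16-entry `pte` array, the `tlb_flushes` counter) is a hypothetical illustration, not the FreeBSD code.

```c
#include <stddef.h>
#include <stdint.h>

#define PG_V 0x1u /* "valid" bit, as in an i386 page table entry */

/* Toy page table and flush counter; names are illustrative only. */
static uint32_t pte[16];
static int tlb_flushes;

/* Map count page frames at consecutive virtual page slots starting at
 * va_idx, then invalidate once for the whole run. */
static void toy_qenter(size_t va_idx, const uint32_t *frames, size_t count)
{
    for (size_t i = 0; i < count; i++)
        pte[va_idx + i] = (frames[i] << 12) | PG_V;
    tlb_flushes++; /* one flush for the batch, not one per page */
}

/* Unmap the same run, again with a single invalidation. */
static void toy_qremove(size_t va_idx, size_t count)
{
    for (size_t i = 0; i < count; i++)
        pte[va_idx + i] = 0;
    tlb_flushes++;
}
```

A per-page pmap_enter-style loop would pay one invalidation per page; the batched form pays one per run, which is the speedup the commit describes.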
#
d23d07ef |
|
02-Aug-1994 |
David Greenman <dg@FreeBSD.org> |
Merged in post-1.1.5 work done by myself and John Dyson. This includes: me: 1) TLB flush optimization that effectively eliminates half of all TLB flushes. This works by flushing the TLB only when a page is "present" in memory (i.e. the valid bit is set in the page table entry). See section 5.3.5 of the Intel 386 Programmer's Reference Manual. 2) The handling of "CMAP" has been improved to catch attempts at multiple simultaneous use. John: 1) Added pmap_qenter/pmap_qremove functions for fast mapping of pages into the kernel. This is for future optimizations and support for the upcoming merged VM/buffer cache. Reviewed by: John Dyson
|
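The valid-bit heuristic in item 1 can be sketched in a few lines of C. This is a userspace illustration under assumed names (PG_V, set_pte, a tlb_flushes counter standing in for the real invalidation), not the kernel code: if the old PTE was not marked present, the TLB cannot hold a stale translation for it, so the flush is skipped.

```c
#include <stdint.h>

#define PG_V 0x1u /* "present"/valid bit in the page table entry */

static int tlb_flushes; /* stand-in for real TLB invalidations */

/* Install a new PTE value.  Flush only if the old entry was valid;
 * an invalid entry can never have been cached by the TLB. */
static uint32_t set_pte(uint32_t *ptep, uint32_t newpte)
{
    uint32_t old = *ptep;
    *ptep = newpte;
    if (old & PG_V)
        tlb_flushes++;
    return old;
}
```

Mapping a fresh page (old entry invalid) costs no flush; only replacing or removing a live mapping does, which is where the "half of all TLB flushes" saving comes from.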
#
26f9a767 |
|
25-May-1994 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
bb508919 |
|
01-May-1994 |
David Greenman <dg@FreeBSD.org> |
Removed some tlbflush optimizations, as some of them were bogus and led to strange behavior.
|
#
0e195446 |
|
20-Apr-1994 |
David Greenman <dg@FreeBSD.org> |
Bug fixes and performance improvements from John Dyson and myself: 1) check va before clearing the page clean flag. Not doing so was causing the vnode pager error 5 messages when paging from NFS. (pmap.c) 2) put back interrupt protection in idle_loop. Bruce didn't think it was necessary, John insists that it is (and I agree). (swtch.s) 3) various improvements to the clustering code (vm_machdep.c). It's now enabled/used by default. 4) bad disk blocks are now handled properly when doing clustered IOs. (wd.c, vm_machdep.c) 5) bogus bad block handling fixed in wd.c. 6) algorithm improvements to the pageout/pagescan daemons. It's amazing how well 4MB machines work now.
|
#
f690bbac |
|
14-Apr-1994 |
David Greenman <dg@FreeBSD.org> |
Changes from John Dyson and myself: 1) Removed all instances of disable_intr()/enable_intr() and changed them back to splimp/splx. The previous method was done to improve performance, but Bruce's recent changes to inline spl* have made this unnecessary. 2) Cleaned up vm_machdep.c considerably. Probably fixed a few bugs, too. 3) Added a new mechanism for collecting page statistics - now done by a new system process "pagescan". Previously this was done by the pageout daemon, but this proved to be impractical. 4) Improved the page usage statistics gathering mechanism - performance is much improved on small-memory machines. 5) Modified mbuf.h to enable support for an external free routine when using mbuf clusters. Added appropriate glue in various places to allow this to work. 6) Adapted a suggested change to the NFS code from Yuval Yurom to take advantage of #5. 7) Added fault/swap statistics support.
|
#
6b4ac811 |
|
29-Mar-1994 |
David Greenman <dg@FreeBSD.org> |
New routine "pmap_kenter", designed to take advantage of the special case of the kernel pmap.
|
#
943a66f3 |
|
14-Mar-1994 |
David Greenman <dg@FreeBSD.org> |
Performance improvements from John Dyson. 1) A new mechanism has been added to prevent pages from being paged out called "vm_page_hold". Similar to vm_page_wire, but much lower overhead. 2) Scheduling algorithm has been changed to improve interactive performance. 3) Paging algorithm improved. 4) Some vnode and swap pager bugs fixed.
|
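The hold/wire distinction above can be sketched with a toy page structure. The field and function names here are illustrative only (the real struct vm_page and its locking are far more involved): a hold is a cheap, short-term pin that the pageout daemon must respect, alongside the heavier long-term wire count.

```c
#include <stdbool.h>

/* Toy page; the real struct vm_page is much richer than this. */
struct toy_page {
    int hold_count; /* short-term "don't page this out" pin */
    int wire_count; /* long-term wiring, heavier to set up */
};

static void toy_page_hold(struct toy_page *p)   { p->hold_count++; }
static void toy_page_unhold(struct toy_page *p) { p->hold_count--; }

/* The pageout daemon may reclaim a page only if it is neither held
 * nor wired. */
static bool toy_page_reclaimable(const struct toy_page *p)
{
    return p->hold_count == 0 && p->wire_count == 0;
}
```

The win described in the commit is that bumping a counter like this is far cheaper than the bookkeeping of a full wire, while still keeping the page safe from pageout.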
#
04f18356 |
|
07-Mar-1994 |
David Greenman <dg@FreeBSD.org> |
1) "Pre-faulting" of pages into the process address space. Eliminates vm_fault overhead on process startup and on mmap-referenced data for in-memory pages. (process startup time using in-memory segments *much* faster) 2) Even more efficient pmap code. Code partially cleaned up. More comments yet to follow. (generally more efficient pte management) 3) Pageout clustering (in addition to the FreeBSD V1.1 pagein clustering). (much faster paging performance on non-write-behind disk subsystems, slightly faster performance on other systems) 4) Slightly changed vm_pageout code for more efficiency and better statistics. Also, resist swapout a little more. (less likely to page out a recently used page) 5) Slight improvement to page table page trap efficiency. (generally faster system VM fault performance) 6) Defer creation of the unnamed anonymous region pager until needed. (speeds up shared memory bss creation) 7) Remove a possible deadlock from swap_pager initialization. 8) Enhanced procfs to provide "vminfo" about vm objects and user pmaps. 9) Increased MCLSHIFT/MCLBYTES from 2K to 4K to improve net & socket performance and to prepare for things to come. John Dyson dyson@implode.root.com David Greenman davidg@root.com
|
#
2c194b2e |
|
13-Feb-1994 |
David Greenman <dg@FreeBSD.org> |
Fixed bug in handling of COW - the original code was bogus and it was only accidental that it worked. Also, don't cache non-managed pages.
|
#
43ef94a9 |
|
09-Feb-1994 |
David Greenman <dg@FreeBSD.org> |
Patch from John Dyson: a pv chain was being traversed while interrupts were fully enabled in pmap_remove_all ... this is bogus, and has been fixed in pmap.c. (sorry for adding the splimp)
|
#
98446d4e |
|
07-Feb-1994 |
David Greenman <dg@FreeBSD.org> |
Fixes from John Dyson to fix out-of-memory hangs and other problems (such as increased swap space usage) related to (incorrectly) paging out the page tables.
|
#
8f64d25d |
|
31-Jan-1994 |
David Greenman <dg@FreeBSD.org> |
Added a four-pattern memory test routine that is run at startup.
|
#
ec120393 |
|
30-Jan-1994 |
David Greenman <dg@FreeBSD.org> |
VM system performance improvements from John Dyson and myself. The following is a summary: 1) increased object cache back up to a more reasonable value. 2) removed old & bogus cruft from machdep.c (clearseg, copyseg, physcopyseg, etc). 3) inlined many functions in pmap.c 4) changed "load_cr3(rcr3())" into tlbflush() and made tlbflush inline assembly. 5) changed the way that modified pages are tracked - now vm_page struct is kept updated directly - no more scanning page tables. 6) removed lots of unnecessary spl's 7) removed old unused functions from pmap.c 8) removed all use of page_size, page_shift, page_mask variables - replaced with PAGE_ constants. 9) moved trunc/round_page, atop, ptoa, out of vm_param.h and into i386/include/param.h, and optimized them. 10) numerous changes to sys/vm/ swap_pager, vnode_pager, pageout, fault code to improve performance. LRU algorithm modified to be more effective, read ahead/behind values tuned for better performance, etc, etc...
|
#
5d7fe66e |
|
26-Jan-1994 |
David Greenman <dg@FreeBSD.org> |
Made pmap_is_managed a static inline function.
|
#
d64f660f |
|
17-Jan-1994 |
David Greenman <dg@FreeBSD.org> |
Improvements mostly from John Dyson, with a little bit from me. * Removed pmap_is_wired * added extra cli/sti protection in idle (swtch.s) * slight code improvement in trap.c * added lots of comments * improved paging and other algorithms in VM system
|
#
7f8cb368 |
|
14-Jan-1994 |
David Greenman <dg@FreeBSD.org> |
"New" VM system from John Dyson & myself. For a run-down of the major changes, see the log of any effected file in the sys/vm directory (swap_pager.c for instance).
|
#
aaf08d94 |
|
18-Dec-1993 |
Garrett Wollman <wollman@FreeBSD.org> |
Make everything compile with -Wtraditional. Make it easier to distribute a binary link-kit. Make all non-optional options (pagers, procfs) standard, and update LINT to reflect new symtab requirements. NB: -Wtraditional will henceforth be forgotten. This editing pass was primarily intended to detect any constructions where the old code might have been relying on traditional C semantics or syntax. These were all fixed, and the result of fixing some of them means that -Wall is now a realistic possibility within a few weeks.
|
#
6aa5e701 |
|
13-Dec-1993 |
David Greenman <dg@FreeBSD.org> |
added some panics to catch the condition where pmap_pte returns null - indicating that the page table page is non-resident.
|
#
381fe1aa |
|
24-Nov-1993 |
Garrett Wollman <wollman@FreeBSD.org> |
Make the LINT kernel compile with -W -Wreturn-type -Wcomment -Werror, and add same (sans -Werror) to Makefile for future compilations.
|
#
0967373e |
|
12-Nov-1993 |
David Greenman <dg@FreeBSD.org> |
First steps in rewriting locore.s, and making info useful when the machine panics. i386/i386/locore.s: 1) got rid of most .set directives that were being used like #define's, and replaced them with appropriate #define's in the appropriate header files (accessed via genassym). 2) added comments to header inclusions, global definitions, and global variables 3) replaced some hardcoded constants with cpp defines (such as PDESIZE and others) 4) aligned all comments to the same column to make them easier to read 5) moved macro definitions for ENTRY, ALIGN, NOP, etc. to /sys/i386/include/asmacros.h 6) added #ifdef BDE_DEBUGGER around all of Bruce's debugger code 7) added new global '_KERNend' to store last location+1 of the kernel 8) cleaned up zeroing of bss so that only bss is zeroed 9) fixed zeroing of page tables so that it really does zero them all - not just those that follow the bss 10) rewrote page table initialization code so that it 1) works correctly and 2) write-protects the kernel text by default 11) properly initialized the kernel page directory, upages, p0stack PT, and page tables. The previous scheme was more than a bit screwy. 12) changed allocation of the virtual area of the IO hole so that it is fixed at KERNBASE + 0xa0000. The previous scheme put it right after the kernel page tables and then later expected it to be at KERNBASE + 0xa0000 13) changed multiple bogus settings of user read/write on various areas of kernel VM - including the IO hole; we should never be accessing the IO hole in user mode through the kernel page tables 14) split kernel support routines such as bcopy, bzero, copyin, copyout, etc. into a separate file 'support.s' 15) split swtch and related routines into a separate 'swtch.s' 16) split routines related to traps, syscalls, and interrupts into a separate file 'exception.s' 17) removed some unused global variables from locore that got inserted by Garrett when he pulled them out of some .h files.
i386/isa/icu.s: 1) clean up global variable declarations 2) move in declaration of astpending and netisr i386/i386/pmap.c: 1) fix calculation of virtual_avail. It previously was calculated to be right in the middle of the kernel page tables - not a good place to start allocating kernel VM. 2) properly allocate kernel page dir/tables etc. out of the kernel map - previously only took out 2 pages. i386/i386/machdep.c: 1) modify boot() to print a warning that the system will reboot in PANIC_REBOOT_WAIT_TIME seconds, and let the user abort with a key on the console. The machine will wait forever if a key is typed before the reboot. The default is 15 seconds, but it can be set to 0 to mean don't wait at all, -1 to mean wait forever, or any positive value to wait for that many seconds. 2) print "Rebooting..." just before doing it. kern/subr_prf.c: 1) remove PANICWAIT as it is deprecated by the change to machdep.c i386/i386/trap.c: 1) add a table of trap type strings and use it to print a real trap/panic message rather than just a number. Lots of work to be done here, but this is the first step. Symbolic traceback is in the TODO. i386/i386/Makefile.i386: 1) add support to build support.s, exception.s and swtch.s ...and various changes to various header files to make all of the above happen.
|
#
960173b9 |
|
15-Oct-1993 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
genassym.c: Remove NKMEMCLUSTERS, it is no longer defined or used. locore.s: Fix the comment on PTDpde and APTDpde to be pde instead of pte. Add a new equation for calculating the location of Sysmap. Remove Bill's old #ifdef garbage for counting up memory; that stuff will never be made to work and was just cluttering up the file. Add code that places the PTD, page table pages, and kernel stack below the 640K ISA hole if there is room for it, otherwise put this stuff all at 1MB. This fixes the 28K bogusity in the boot blocks, which can now go away! Fix the calculation of where first is to be dependent on NKPDE so that we can skip over the above-mentioned areas. The 28K thing is now 44K in size due to the increase in kernel virtual memory space, but since we no longer have to worry about that, this is no big deal. Use if NNPX > 0 instead of ifdef NPX for floating point code. machdep.c: Change the calculation for the buffer cache to be 20% of all memory above 2MB and add back the upper limit of 2/5 of VM_KMEM_SIZE so that we do not eat ALL of the kernel memory space on large-memory machines; note that this will not even come into effect unless you have more than 32MB. The current buffer cache limit is 6.7MB due to this calculation. It seems that we were erroneously allocating bufpages pages for buffer_map. buffer_map is UNUSED in this implementation of the buffer cache, but since the map is referenced in several if statements, a quick fix was to simply allocate 1 vm page (but no real memory) to it. pmap.h: Remove rcsid, don't want them in the kernel files! Removed some cruft inside an #ifdef DEBUGx that caused compiler errors if you were compiling this for debug. Use the #defines for PD_SHIFT and PG_SHIFT in place of constants. trap.c: Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code.
Remove a now completely invalid check for a maximum virtual address; the virtual address now ends at 0xFFFFFFFF so there is no more MAX!! (Thanks David, I completely missed that one!) vm_machdep.c: Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code. Replace several 0xFE00000 constants with KERNBASE
|
#
a27df782 |
|
12-Oct-1993 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
KPTDI_LAST renamed to KPTDI
|
#
9aa17d68 |
|
12-Oct-1993 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Eliminate definition of I386_PAGE_SIZE and use NBPG instead Replace 0xFE000000 constants with KERNBASE Use new definition NKPDE in place of a first-last+1 calculation.
|
#
b145f751 |
|
30-Sep-1993 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
This is a fix for the 32K DMA buffer region that was not accounted for, it relocates it to be after the BIOS memory hole instead of right below the 640K limit. THANK YOU CHRIS!!! From: <cgd@postgres.Berkeley.EDU> Date: Wed, 29 Sep 93 18:49:58 -0700 basically, reserve a new 32k space right after firstaddr, and put the buffer space there... the diffs are below, and are in ~cgd/sys/i386/i386 (in machdep.c) on freefall. i obviously can't test them, so if some of you would look the diffs over and try them out...
|
#
b4f05987 |
|
08-Sep-1993 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Changed the pg("ptdi> %x") to a printf and then a panic, since we are going to panic shortly after this anyway. Destroys less state, and keeps the machine from waiting for someone to smash the return key a few times before it panics!
|
#
26931201 |
|
27-Jul-1993 |
David Greenman <dg@FreeBSD.org> |
* Applied fixes from Bruce Evans to fix COW bugs, >1MB kernel loading, profiling, and various protection checks that cause security holes and system crashes. * Changed min/max/bcmp/ffs/strlen to be static inline functions - included from cpufunc.h via systm.h. This change improves performance in many parts of the kernel - up to 5% in the networking layer alone. Note that this requires systm.h to be included in any file that uses these functions, otherwise it won't be able to find them during the load. * Fixed an incorrect call to splx() in if_is.c * Fixed a bogus variable assignment to splx() in if_ed.c
|
#
5b81b6b3 |
|
12-Jun-1993 |
Rodney W. Grimes <rgrimes@FreeBSD.org> |
Initial import, 0.1 + pk 0.2.4-B1
|