#
327785 |
|
10-Jan-2018 |
markj |
MFC r325530 (jeff), r325566 (kib), r325588 (kib): Replace many instances of VM_WAIT with blocking page allocation flags.
|
#
321513 |
|
26-Jul-2017 |
kib |
MFC r321247: Add pctrie_init() and vm_radix_init() to initialize generic pctrie and vm_radix trie.
|
#
318716 |
|
23-May-2017 |
markj |
MFC r308474, r308691, r309203, r309365, r309703, r309898, r310720, r308489, r308706: Add PQ_LAUNDRY and remove PG_CACHED pages.
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
259107 |
|
08-Dec-2013 |
alc |
Eliminate a redundant parameter to vm_radix_replace().
Improve the wording of the comment describing vm_radix_replace().
Reviewed by: attilio MFC after: 6 weeks Sponsored by: EMC / Isilon Storage Division
|
#
254719 |
|
23-Aug-2013 |
alc |
Addendum to r254141: The call to vm_radix_insert() in vm_page_cache() can reclaim the last preexisting cached page in the object, resulting in a call to vdrop(). Detect this scenario so that the vnode's hold count is correctly maintained. Otherwise, we panic.
Reported by: scottl Tested by: pho Discussed with: attilio, jeff, kib
|
#
254141 |
|
09-Aug-2013 |
attilio |
On all the architectures, avoid to preallocate the physical memory for nodes used in vm_radix. On architectures supporting direct mapping, also avoid to pre-allocate the KVA for such nodes.
In order to do so make the operations derived from vm_radix_insert() to fail and handle all the deriving failure of those.
vm_radix-wise introduce a new function called vm_radix_replace(), which can replace a leaf node, already present, with a new one, and take into account the possibility, during vm_radix_insert() allocation, that the operations on the radix trie can recurse. This means that if operations in vm_radix_insert() recursed vm_radix_insert() will start from scratch again.
Sponsored by: EMC / Isilon storage division Reviewed by: alc (older version) Reviewed by: jeff Tested by: pho, scottl
|
#
248449 |
|
17-Mar-2013 |
attilio |
Sync back vmcontention branch into HEAD: Replace the per-object resident and cached pages splay tree with a path-compressed multi-digit radix trie. Along with this, switch also the x86-specific handling of idle page tables to using the radix trie.
This change is supposed to do the following: - Allowing the acquisition of read locking for lookup operations of the resident/cached pages collections as the per-vm_page_t splay iterators are now removed. - Increase the scalability of the operations on the page collections.
The radix trie does rely on the consumers locking to ensure atomicity of its operations. In order to avoid deadlocks the bisection nodes are pre-allocated in the UMA zone. This can be done safely because the algorithm needs at maximum one new node per insert which means the maximum number of the desired nodes is the number of available physical frames themselves. However, not all the times a new bisection node is really needed.
The radix trie implements path-compression because UFS indirect blocks can lead to several objects with a very sparse trie, increasing the number of levels to usually scan. It also helps in the nodes pre-fetching by introducing the single node per-insert property.
This code is not generalized (yet) because of the possible loss of performance by having much of the sizes in play configurable. However, efforts to make this code more general and then reusable in further different consumers might be really done.
The only KPI change is the removal of the function vm_page_splay() which is now reaped. The only KBI change, instead, is the removal of the left/right iterators from struct vm_page, which are now reaped.
Further technical notes broken into mealpieces can be retrieved from the svn branch: http://svn.freebsd.org/base/user/attilio/vmcontention/
Sponsored by: EMC / Isilon storage division In collaboration with: alc, jeff Tested by: flo, pho, jhb, davide Tested by: ian (arm) Tested by: andreast (powerpc)
|
#
248448 |
|
17-Mar-2013 |
attilio |
Commit new file FreeBSD tags.
Sponsored by: EMC / Isilon storage division
|
#
248428 |
|
17-Mar-2013 |
alc |
Simplify the interface to vm_radix_insert() by eliminating the parameter "index". The content of a radix tree leaf, or at least its "key", is not opaque to the other radix tree operations. Specifically, they know how to extract the "key" from a leaf. So, eliminating the parameter "index" isn't breaking the abstraction. Moreover, eliminating the parameter "index" effectively prevents the caller from passing an inconsistent "index" and leaf to vm_radix_insert().
Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division
|
#
248110 |
|
09-Mar-2013 |
attilio |
Merge back vmc-playground into vmcontention. vm_radix.{c, h} and _vm_radix.h are copied straight from the branch to preserve history.
|
#
247742 |
|
03-Mar-2013 |
attilio |
Remove the boot-time cache support and rely on UMA boot-time slab cache for allocating the nodes before to have the possibility to carve directly from the UMA subsystem.
Sponsored by: EMC / Isilon storage division Reviewed by: alc
|
#
246794 |
|
14-Feb-2013 |
attilio |
The radix preallocation pages can overfow the biggestone segment, so use a different scheme for preallocation: reserve few KB of nodes to be used to cater page allocations before the memory can be efficiently pre-allocated by UMA.
This at all effects remove boot_pages further carving and along with this modifies to the boot_pages allocation system and necessity to initialize the UMA zone before pmap_init().
Reported by: pho, jhb
|
#
246726 |
|
12-Feb-2013 |
attilio |
Implement a new algorithm for managing the radix trie which also includes path-compression. This greatly helps with sparsely populated tries, where an uncompressed trie may end up by having a lot of intermediate nodes for very little leaves.
The new algorithm introduces 2 main concepts: the node level and the node owner. Every node represents a branch point where the leaves share the key up to the level specified in the node-level (current level excluded, of course). Such key partly shared is the one contained in the owner. Of course, the root branch is exempted to keep a valid owner, because theoretically all the keys are contained in the space designed by the root branch node. The search algorithm seems very intuitive and that is where one should start reading to understand the full approach.
In the end, the algorithm ends up by demanding only one node per insert and this is not necessary in all the cases. To stay safe, we basically preallocate as many nodes as the number of physical pages are in the system, using uma_preallocate(). However, this raises 2 concerns: * As pmap_init() needs to kmem_alloc(), the nodes must be pre-allocated when vm_radix_init() is currently called, which is much before UMA is fully initialized. This means that uma_prealloc() will dig into the UMA_BOOT_PAGES pool of pages, which is often not enough to keep track of such large allocations. In order to fix this, change a bit the concept of UMA_BOOT_PAGES and vm.boot_pages. More specifically make the UMA_BOOT_PAGES an initial "value" as long as vm.boot_pages and extend the boot_pages physical area by as many bytes as needed with the information returned by vm_radix_allocphys_size(). * A small amount of pages will be held in per-cpu buckets and won't be accessible from curcpu, so the vm_radix_node_get() could really panic when the pre-allocation pool is close to be exhausted. In theory we could pre-allocate more pages than the number of physical frames to satisfy such request, but as many insert would happen without a node allocation anyway, I think it is safe to assume that the over-allocation is already compensating for such problem. On the field testing can stand me correct, of course. This could be further helped by the case where we allow a single-page insert to not require a complete root node.
The use of pre-allocation gets rid all the non-direct mapping trickery and introduced lock recursion allowance for vm_page_free_queue.
The nodes children are reduced in number from 32 -> 16 and from 16 -> 8 (for respectively 64 bits and 32 bits architectures). This would make the children to fit into cacheline for amd64 case, for example, and in general spawn less cacheline, which may be helpful in lookup_ge() case. Also, path-compression cames to help in cases where there are many levels, making the fallouts of such change less hurting.
Sponsored by: EMC / Isilon storage division Reviewed by: jeff (partially) Tested by: flo
|
#
246430 |
|
06-Feb-2013 |
attilio |
Cleanup vm_radix KPI: - Avoid the return value for vm_radix_insert() - Name the functions argument per-style(9) - Avoid to get and return opaque objects but use vm_page_t as vm_radix is thought to not really be general code but to cater specifically page cache and resident cache.
|
#
246423 |
|
06-Feb-2013 |
attilio |
Avoid a namespace pollution in vm_object.h by defining separately the structure for vm_radix implementation.
|
#
245254 |
|
10-Jan-2013 |
attilio |
Remove vm_radix_lookupn() and its usage in the kernel.
|
#
238395 |
|
12-Jul-2012 |
attilio |
Style.
|
#
238394 |
|
12-Jul-2012 |
attilio |
Remove unused iterating functions.
|
#
238269 |
|
08-Jul-2012 |
attilio |
- Move VM_RADIX_STACK in vm_object.c because it is the only consumer - Import the check for the return value of vm_radix_lookup() directly in the while removing the need to use a spourious check.
|
#
238245 |
|
08-Jul-2012 |
attilio |
- Split the cached and resident pages tree into 2 distinct ones. This makes the RED/BLACK support go away and simplifies a lot vmradix functions used here. This happens because with patricia trie support the trie will be little enough that keeping 2 diffetnt will be efficient too. - Reduce differences with head, in places like backing scan where the optimizazions used shuffled the code a little bit around.
Tested by: flo, Andrea Barberio
|
#
236728 |
|
07-Jun-2012 |
attilio |
Create a sub-branch for saving a temporary working version of vmcontention and keep doing experiments.
|
#
236401 |
|
01-Jun-2012 |
attilio |
MFC
|
#
235349 |
|
12-May-2012 |
attilio |
- Fix a bug where lookupn can wrap up looking for the pages to scan, returning a non correct very low address again. - Stub out vm_lookup_foreach as it is not used
|
#
232326 |
|
29-Feb-2012 |
attilio |
- Exclude vm_radix_shrink() from the interface but retain the code still as it can be useful. - Make most of the interface private as it is unnecessary public right now. This will help in making nodes changing with arch and still avoid namespace pollution.
|
#
230750 |
|
29-Jan-2012 |
attilio |
Fix a bug in vm_radix_leaf() where the shifting start address can wrap-up at some point. This bug is triggered very easilly by indirect blocks in UFS which grow negative resulting in very high counts.
In collabouration with: flo
|
#
228314 |
|
06-Dec-2011 |
attilio |
Use atomics for rn_count on leaf node because RED operations happen without the VM_OBJECT_LOCK held, thus can be concurrent with BLACK ones. However, also use a write memory barrier in order to not reorder the operation of decrementing rn_count in respect fetching the pointer.
Discussed with: jeff
|
#
228312 |
|
06-Dec-2011 |
attilio |
- Make rn_count 32-bits as it will naturally pad for 32-bit arches - Avoid to use atomic to manipulate it at level0 because it seems unneeded and introduces a bug on big-endian architectures where only the top half (2 bits) of the double-words are written (as sparc64, for example, doesn't support atomics at 16-bits) heading to a wrong handling of rn_count.
Reported by: flo, andreast Found by: marius No answer by: jeff
|
#
228210 |
|
02-Dec-2011 |
attilio |
MFC
|
#
227544 |
|
15-Nov-2011 |
attilio |
Fix compilation for userland: - Use CTASSERT() only in the kernel. - the root pointer is required by struct vm_object which is accessible (maybe incorrectly?) by userland.
|
#
226983 |
|
01-Nov-2011 |
jeff |
- Add some convenience inlines. - Update the copyright.
|
#
226981 |
|
01-Nov-2011 |
attilio |
Add kernel protection to the header file for vmradix. Likely this file needs some more restructuration (and we should make a lot of macros private to radix implementation) but leave them as they are so far because we may enrich the KPI much further.
|
#
226980 |
|
01-Nov-2011 |
attilio |
vm_object_terminate() doesn't actually free the pages in the splay tree. Reclaim all the nodes related to the radix tree for a specified vm_object when calling vm_object_terminate() via the newly added interface vm_radix_reclaim_nodes(). The function is recursive, but we have a well-defined maximum depth, thus the amount of necessary stack can be easilly calculated.
Reported by: alc Discussed and reviewed by: jeff
|
#
226952 |
|
30-Oct-2011 |
jeff |
- Extract part of vm_radix_lookupn() into a function that just finds the first leaf page in a specified range. This permits us to make many search & operate functions without much code duplication. - Make a generic iterator for radix items.
|
#
226930 |
|
30-Oct-2011 |
jeff |
- Support two types of nodes, red and black, within the same radix tree. Black nodes support standard active pages and red nodes support cached pages. Red nodes may be removed without the object lock but will not collapse unused tree nodes. Red nodes may not be directly inserted, instead a new function is supplied to convert between black and red. - Handle cached pages and active pages in the same loop in vm_object_split, vm_object_backing_scan, and vm_object_terminate. - Retire the splay page handling as the ifdefs are too difficult to maintain. - Slightly optimize the vm_radix_lookupn() function.
|
#
226876 |
|
28-Oct-2011 |
jeff |
- Use a single uintptr_t for the root of the radix node that encodes the height and a pointer so that the update to the root is atomic. This permits safe lookups in parallel with tree expansion. Shrinking the space requirements is a small bonus.
|
#
226873 |
|
28-Oct-2011 |
attilio |
Use an UMA zone for the radix node. This avoids the problem to check for the kernel_map/kmem_map recursion because it uses direct mapping provided by amd64 to avoid object and map search and recursion.
Probabilly all the others architectures using UMA_MD_SMALL_ALLOC are also fixed by this, but other remains, where the most notable case is i386. For it a solution has still to be determined. A way to do this would be to have a reserved map just for radix node and mark all accesses to its lock to be witness safe, but that would still be unoptimal due to the large amount of virtual address space needed to cater the whole tree.
|
#
226646 |
|
22-Oct-2011 |
jeff |
- Implement vm_radix_lookup_le(). - Fix vm_radix_lookupn() when max == -1 by making the end parameter inclusive.
|
#
226645 |
|
22-Oct-2011 |
attilio |
Check in an intial implementation of radix tree implementation to replace the vm object pages splay.
TODO: - Handle differently the negative keys for having smaller depth index nodes (negative keys caming from indirect blocks) - Fix the get_node() by having support for a low reserved objects directly from UMA - Implement the lookup_le and re-enable VM_NRESERVELEVEL = 1 - Try to rework the superpage splay of idle pages and the cache splay for every vm object in order to regain space on vm_page structure - Verify performance and improve them (likely by having consumers to deal with several ranges of pages manually?)
Obtained from: jeff, Mayur Shardul (GSoC 2009)
|