#
8689d750 |
|
22-Jan-2024 |
Lukas Bulwahn <lukas.bulwahn@gmail.com> |
maple_tree: avoid duplicate variable init in mast_spanning_rebalance() The local variables r_tmp and l_tmp in mast_spanning_rebalance() are already initialized at its declaration; there is no need to assign the value again. Remove the duplicate initialization of {r,l}_tmp. No functional change. Due to common compiler optimizations, also no change to object code. This issue was identified with clang-analyzer's dead stores analysis. Link: https://lkml.kernel.org/r/20240122102000.29558-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e755c43e |
|
09-Jan-2024 |
Sidhartha Kumar <sidhartha.kumar@oracle.com> |
maple_tree: fix comment describing mas_node_count_gfp() The function description comment for mas_node_count_gfp() mistakingly refers to the function as mas_node_count(). Change it to refer to the correct function. Link: https://lkml.kernel.org/r/20240109223119.162357-1-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
9b6713cc |
|
17-Feb-2024 |
Chuck Lever <chuck.lever@oracle.com> |
maple_tree: Add mtree_alloc_cyclic() I need a cyclic allocator for the simple_offset implementation in fs/libfs.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/170820144179.6328.12838600511394432325.stgit@91.116.238.104.host.secureserver.net Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
7e552dcd |
|
15-Dec-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: avoid checking other gaps after getting the largest gap The last range stored in maple tree is typically quite large. By checking if it exceeds the sum of the remaining ranges in that node, it is possible to avoid checking all other gaps. Running the maple tree test suite in user mode almost always results in a near 100% hit rate for this optimization. Link: https://lkml.kernel.org/r/20231215074632.82045-1-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d5f6057c |
|
09-Dec-2023 |
Randy Dunlap <rdunlap@infradead.org> |
maple_tree: fix typos/spellos etc Fix typos/grammar and spellos in documentation. Link: https://lkml.kernel.org/r/20231210063839.29967-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
5143eecd |
|
13-Dec-2023 |
Andrew Morton <akpm@linux-foundation.org> |
lib/maple_tree.c: fix build error due to hotfix alteration Commit 0de56e38b307 ("maple_tree: use maple state end for write operations") was broken by a later patch "maple_tree: do not preallocate nodes for slot stores". But the later patch was scheduled ahead of 0de56e38b307, for 6.7-rc. This fixlet undoes the damage. Fixes: 0de56e38b307 ("maple_tree: use maple state end for write operations") Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d9d9bd97 |
|
09-Nov-2023 |
Levi Yun <ppbuk5246@gmail.com> |
maple_tree: change return type of mas_split_final_node as void. mas_split_final_node() always returns true and its return value is never checked. Change return type to void. Link: https://lkml.kernel.org/r/20231109160821.16248-2-ppbuk5246@gmail.com Signed-off-by: Levi Yun <ppbuk5246@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
330018fe |
|
20-Nov-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: simplify mas_leaf_set_meta() Now it seems that the incoming 'end' is already pointing to the last item, so we can simplify this function, considering only whether the last slot is being used. This has passed the maple tree test suite. Link: https://lkml.kernel.org/r/20231120070937.35481-6-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
026b935c |
|
20-Nov-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: delete one of the two identical checks There are two identical checks, delete one of them. Link: https://lkml.kernel.org/r/20231120070937.35481-5-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c5e94121 |
|
20-Nov-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: remove an unused parameter for ma_meta_end() The parameter maple_type is not used, so remove it. Link: https://lkml.kernel.org/r/20231120070937.35481-4-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
3f05fcde |
|
20-Nov-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: avoid ascending when mas->min is also the parent's minimum When the child node is the first child of its parent node, mas->min does not need to be updated. This can reduce the number of ascending times in some cases. Link: https://lkml.kernel.org/r/20231120070937.35481-3-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
2e783f0c |
|
20-Nov-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: move the check forward to avoid static check warning Patch series "Some cleanups of maple tree", v2. These are some small cleanups of maple tree. This patch (of 5): Put the check for gap before its reference to avoid Smatch static check warnings. This is not a bug, it's just a validation program. Even with this change, Smatch may still generate warnings because MT_BUG_ON() doesn't necessarily stop the program. It may require fixing Smatch itself to avoid these warnings. Link: https://lkml.kernel.org/r/20231120070937.35481-1-zhangpeng.00@bytedance.com Link: https://lkml.kernel.org/r/20231120070937.35481-2-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: http://lists.infradead.org/pipermail/maple-tree/2023-November/003046.html Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d1fefa3d |
|
27-Oct-2023 |
Jiapeng Chong <jiapeng.chong@linux.alibaba.com> |
maple_tree: remove unused function The function are defined in the maple_tree.c file, but not called elsewhere, so delete the unused function. lib/maple_tree.c:689:29: warning: unused function 'mas_pivot'. Link: https://lkml.kernel.org/r/20231027084944.24888-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7064 Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a3c63c8c |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: mtree_range_walk() clean up mtree_range_walk() needed to be updated to avoid checking if there was a pivot value. On closer examination, the code could avoid setting min or max in certain scenarios. The commit removes the extra check for pivot[offset] before setting max and only sets max when necessary. It also only sets min if it is necessary by checking offset 0 prior to the loop (as it has always done). The commit also drops a dead node check since the end of the node will return the array size when the last slot is occupied (by a potential reuse in a dead node). The data will be discarded later if the node is marked dead. Benchmarking these changes results in an increase in performance of 5.45% using the BENCH_WALK in the maple tree test code. Link: https://lkml.kernel.org/r/20231101171629.3612299-13-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
24662dec |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: don't find node end in mtree_lookup_walk() Since the pivot being set is now reliable, the optimized loop no longer needs to find the node end. The redundant check for a dead node can also be avoided as there is no danger of using the wrong pivot since the results will be thrown out in the case of a dead node by the later check. This patch also adds a benchmark test for the function to the maple tree test framework. The benchmark shows an average increase performance of 5.98% over 3 runs with this commit. Link: https://lkml.kernel.org/r/20231101171629.3612299-12-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0de56e38 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use maple state end for write operations ma_wr_state was previously tracking the end of the node for writing. Since the implementation of the ma_state end tracking, this is duplicated work. This patch removes the maple write state tracking of the end of the node and uses the maple state end instead. Link: https://lkml.kernel.org/r/20231101171629.3612299-11-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
9a40d45c |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: remove mas_searchable() Now that the status of the maple state is outside of the node, the mas_searchable() function can be dropped for easier open-coding of what is going on. Link: https://lkml.kernel.org/r/20231101171629.3612299-10-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
067311d3 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: separate ma_state node from status The maple tree node is overloaded to keep status as well as the active node. This, unfortunately, results in a re-walk on underflow or overflow. Since the maple state has room, the status can be placed in its own enum in the structure. Once an underflow/overflow is detected, certain modes can restore the status to active and others may need to re-walk just that one node to see the entry. The status being an enum has the benefit of detecting unhandled status in switch statements. [Liam.Howlett@oracle.com: fix comments about MAS_*] Link: https://lkml.kernel.org/r/20231106154124.614247-1-Liam.Howlett@oracle.com [Liam.Howlett@oracle.com: update forking to separate maple state and node] Link: https://lkml.kernel.org/r/20231106154551.615042-1-Liam.Howlett@oracle.com [Liam.Howlett@oracle.com: fix mas_prev() state separation code] Link: https://lkml.kernel.org/r/20231207193319.4025462-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20231101171629.3612299-9-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
271f61a8 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: clean up inlines for some functions There are a few functions which were inlined but are somewhat too large to inline, so remove the inline key word. There are also several very small functions which are used in critical code sections which gcc was not inlining, so make this more strict and use __always_line for these functions. Link: https://lkml.kernel.org/r/20231101171629.3612299-8-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
1f41ef12 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use cached node end in mas_destroy() The node end is set during the walk, so use the resulting end instead of re-fetching it. Link: https://lkml.kernel.org/r/20231101171629.3612299-7-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e9c52d89 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use cached node end in mas_next() When looking for the next entry, don't recalculate the node end as it is now tracked in the maple state. Link: https://lkml.kernel.org/r/20231101171629.3612299-6-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
31c532a8 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add end of node tracking to the maple state Analysis of the mas_for_each() iteration showed that there is a significant time spent finding the end of a node. This time can be greatly reduced if the end of the node is cached in the maple state. Care must be taken to update & invalidate as necessary. Link: https://lkml.kernel.org/r/20231101171629.3612299-5-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
f7a59018 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: make mas_erase() more robust mas_erase() may not deal correctly with all maple states. Make the function more robust by ensuring the state is in one of the two acceptable states. Link: https://lkml.kernel.org/r/20231101171629.3612299-3-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
37a8ab24 |
|
01-Nov-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: remove unnecessary default labels from switch statements Patch series "maple_tree: iterator state changes". These patches have some general cleanup and a change to separate the maple state status tracking from the maple state node. The maple state status change allows for walks to continue from previous places when the status needs to be recorded to make logical sense for the next call to the maple state. For instance, it allows for prev/next to function in a way that better resembles the linked list. It also allows switch statements to be used to detect missed states during compile, and the addition of fast-path "active" state is cleaner as an enum. While making the status change, perf showed some very small (one line) functions that were not inlined even with the inline key word. Making these small functions __always_inline is less expensive according to perf. As part of that change, some inlines have been dropped from larger functions. Perf also showed that the commonly used mas_for_each() iterator was spending a lot of time finding the end of the node. This series introduces caching of the end of the node in the maple state (and updating it during writes). This caching along with the inline changes yielded at 23.25% improvement on the BENCH_MAS_FOR_EACH maple tree test framework benchmark. I've also included a change to mtree_range_walk and mtree_lookup_walk to take advantage of Peng's change [1] to the initial pivot setup. mmtests did not produce any significant gains. [1] https://lore.kernel.org/all/20230711035444.526-1-zhangpeng.00@bytedance.com/T/#u This patch (of 12): Removing the default types from the switch statements will cause compile warnings on missing cases. Link: https://lkml.kernel.org/r/20231101171629.3612299-2-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
8e50d32c |
|
26-Oct-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: preserve the tree attributes when destroying maple tree When destroying maple tree, preserve its attributes and then turn it into an empty tree. This allows it to be reused without needing to be reinitialized. Link: https://lkml.kernel.org/r/20231027033845.90608-10-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fd32e4e9 |
|
26-Oct-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: introduce interfaces __mt_dup() and mtree_dup() Introduce interfaces __mt_dup() and mtree_dup(), which are used to duplicate a maple tree. They duplicate a maple tree in Depth-First Search (DFS) pre-order traversal. It uses memcopy() to copy nodes in the source tree and allocate new child nodes in non-leaf nodes. The new node is exactly the same as the source node except for all the addresses stored in it. It will be faster than traversing all elements in the source tree and inserting them one by one into the new tree. The time complexity of these two functions is O(n). The difference between __mt_dup() and mtree_dup() is that mtree_dup() handles locks internally. Analysis of the average time complexity of this algorithm: For simplicity, let's assume that the maximum branching factor of all non-leaf nodes is 16 (in allocation mode, it is 10), and the tree is a full tree. Under the given conditions, if there is a maple tree with n elements, the number of its leaves is n/16. From bottom to top, the number of nodes in each level is 1/16 of the number of nodes in the level below. So the total number of nodes in the entire tree is given by the sum of n/16 + n/16^2 + n/16^3 + ... + 1. This is a geometric series, and it has log(n) terms with base 16. According to the formula for the sum of a geometric series, the sum of this series can be calculated as (n-1)/15. Each node has only one parent node pointer, which can be considered as an edge. In total, there are (n-1)/15-1 edges. This algorithm consists of two operations: 1. Traversing all nodes in DFS order. 2. For each node, making a copy and performing necessary modifications to create a new node. For the first part, DFS traversal will visit each edge twice. Let T(ascend) represent the cost of taking one step downwards, and T(descend) represent the cost of taking one step upwards. And both of them are constants (although mas_ascend() may not be, as it contains a loop, but here we ignore it and treat it as a constant). So the time spent on the first part can be represented as ((n-1)/15-1) * (T(ascend) + T(descend)). For the second part, each node will be copied, and the cost of copying a node is denoted as T(copy_node). For each non-leaf node, it is necessary to reallocate all child nodes, and the cost of this operation is denoted as T(dup_alloc). The behavior behind memory allocation is complex and not specific to the maple tree operation. Here, we assume that the time required for a single allocation is constant. Since the size of a node is fixed, both of these symbols are also constants. We can calculate that the time spent on the second part is ((n-1)/15) * T(copy_node) + ((n-1)/15 - n/16) * T(dup_alloc). Adding both parts together, the total time spent by the algorithm can be represented as: ((n-1)/15) * (T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)) - n/16 * T(dup_alloc) - (T(ascend) + T(descend)) Let C1 = T(ascend) + T(descend) + T(copy_node) + T(dup_alloc) Let C2 = T(dup_alloc) Let C3 = T(ascend) + T(descend) Finally, the expression can be simplified as: ((16 * C1 - 15 * C2) / (15 * 16)) * n - (C1 / 15 + C3). This is a linear function, so the average time complexity is O(n). Link: https://lkml.kernel.org/r/20231027033845.90608-4-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
4f2267b5 |
|
26-Oct-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: add mt_free_one() and mt_attr() helpers Patch series "Introduce __mt_dup() to improve the performance of fork()", v7. This series introduces __mt_dup() to improve the performance of fork(). During the duplication process of mmap, all VMAs are traversed and inserted one by one into the new maple tree, causing the maple tree to be rebalanced multiple times. Balancing the maple tree is a costly operation. To duplicate VMAs more efficiently, mtree_dup() and __mt_dup() are introduced for the maple tree. They can efficiently duplicate a maple tree. Here are some algorithmic details about {mtree,__mt}_dup(). We perform a DFS pre-order traversal of all nodes in the source maple tree. During this process, we fully copy the nodes from the source tree to the new tree. This involves memory allocation, and when encountering a new node, if it is a non-leaf node, all its child nodes are allocated at once. This idea was originally from Liam R. Howlett's Maple Tree Work email, and I added some of my own ideas to implement it. Some previous discussions can be found in [1]. For a more detailed analysis of the algorithm, please refer to the logs for patch [3/10] and patch [10/10]. There is a "spawn" in byte-unixbench[2], which can be used to test the performance of fork(). I modified it slightly to make it work with different number of VMAs. Below are the test results. The first row shows the number of VMAs. The second and third rows show the number of fork() calls per ten seconds, corresponding to next-20231006 and the this patchset, respectively. The test results were obtained with CPU binding to avoid scheduler load balancing that could cause unstable results. There are still some fluctuations in the test results, but at least they are better than the original performance. 21 121 221 421 821 1621 3221 6421 12821 25621 51221 112100 76261 54227 34035 20195 11112 6017 3161 1606 802 393 114558 83067 65008 45824 28751 16072 8922 4747 2436 1233 599 2.19% 8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42% Thanks to Liam and Matthew for the review. This patch (of 10): Add two helpers: 1. mt_free_one(), used to free a maple node. 2. mt_attr(), used to obtain the attributes of maple tree. Link: https://lkml.kernel.org/r/20231027033845.90608-1-zhangpeng.00@bytedance.com Link: https://lkml.kernel.org/r/20231027033845.90608-2-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
4249f13c |
|
13-Dec-2023 |
Sidhartha Kumar <sidhartha.kumar@oracle.com> |
maple_tree: do not preallocate nodes for slot stores mas_preallocate() defaults to requesting 1 node for preallocation and then ,depending on the type of store, will update the request variable. There isn't a check for a slot store type, so slot stores are preallocating the default 1 node. Slot stores do not require any additional nodes, so add a check for the slot store case that will bypass node_count_gfp(). Update the tests to reflect that slot stores do not require allocations. User visible effects of this bug include increased memory usage from the unneeded node that was allocated. Link: https://lkml.kernel.org/r/20231213205058.386589-1-sidhartha.kumar@oracle.com Fixes: 0b8bb544b1a7 ("maple_tree: update mas_preallocate() testing") Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: <stable@vger.kernel.org> [6.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
099d7439 |
|
12-Oct-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add GFP_KERNEL to allocations in mas_expected_entries() Users complained about OOM errors during fork without triggering compaction. This can be fixed by modifying the flags used in mas_expected_entries() so that the compaction will be triggered in low memory situations. Since mas_expected_entries() is only used during fork, the extra argument does not need to be passed through. Additionally, the two test_maple_tree test cases and one benchmark test were altered to use the correct locking type so that allocations would not trigger sleeping and thus fail. Testing was completed with lockdep atomic sleep detection. The additional locking change requires rwsem support additions to the tools/ directory through the use of pthreads pthread_rwlock_t. With this change test_maple_tree works in userspace, as a module, and in-kernel. Users may notice that the system gave up early on attempting to start new processes instead of attempting to reclaim memory. Link: https://lkml.kernel.org/r/20230915093243epcms1p46fa00bbac1ab7b7dca94acb66c44c456@epcms1p4 Link: https://lkml.kernel.org/r/20231012155233.2272446-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: <jason.sim@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a8091f03 |
|
21-Sep-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states When updating the maple tree iterator to avoid rewalks, an issue was introduced when shifting beyond the limits. This can be seen by trying to go to the previous address of 0, which would set the maple node to MAS_NONE and keep the range as the last entry. Subsequent calls to mas_find() would then search upwards from mas->last and skip the value at mas->index/mas->last. This showed up as a bug in mprotect which skips the actual VMA at the current range after attempting to go to the previous VMA from 0. Since MAS_NONE may already be set when searching for a value that isn't contained within a node, changing the handling of MAS_NONE in mas_find() would make the code more complicated and error prone. Furthermore, there was no way to tell which limit was hit, and thus which action to take (next or the entry at the current range). This solution is to add two states to track what happened with the previous iterator action. This allows for the expected behaviour of the next command to return the correct item (either the item at the range requested, or the next/previous). Tests are also added and updated accordingly. Link: https://lkml.kernel.org/r/20230921181236.509072-3-Liam.Howlett@oracle.com Link: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b Link: https://lore.kernel.org/linux-mm/20230921181236.509072-1-Liam.Howlett@oracle.com/ Fixes: 39193685d585 ("maple_tree: try harder to keep active node with mas_prev()") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Pedro Falcato <pedro.falcato@gmail.com> Closes: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b Closes: https://bugs.archlinux.org/task/79656 Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
432af5c9 |
|
18-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: clean up mas_wr_append() Avoid setting the variables until necessary, and actually use the variables where applicable. Introducing a variable for the slots array avoids spanning multiple lines. Add the missing argument to the documentation. Use the node type when setting the metadata instead of blindly assuming the type. Finally, add a trace point to the function for successful store. Link: https://lkml.kernel.org/r/20230819004356.1454718-3-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
530f745c |
|
03-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: replace data before marking dead in split and spanning store Reorder the operations for split and spanning stores so that new data is placed in the tree prior to marking the old data as dead. This will limit re-walks on dead data to just once instead of a retry loop. The order of operations is as follows: Create the new data, put the new data in place, mark the top node of the old data as dead. Then repair parent links in the reused nodes through all levels of the tree, following the new nodes downwards. Finally walk the top dead node looking for nodes that are no longer used, or subtrees that should be destroyed (marked dead throughout then freed), follow the partially used nodes downwards to discover other dead nodes and subtrees. Link: https://lkml.kernel.org/r/20230804165951.2661157-7-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
068bafca |
|
03-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: change mas_adopt_children() parent usage All calls to mas_adopt_children() currently pass the parent as the node in the maple state. Allow for the parent pointer that is passed in to be used instead. Link: https://lkml.kernel.org/r/20230804165951.2661157-6-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
4ffc2ee2 |
|
03-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: introduce mas_tree_parent() definition Add a definition to shorten long code lines and clarify what the code is doing. Use the new definition to get the maple tree parent pointer from the maple state where possible. Link: https://lkml.kernel.org/r/20230804165951.2661157-5-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
1238f6a2 |
|
03-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: introduce mas_put_in_tree() mas_replace() has a single user that takes a flag which is now always true. Replace this function with mas_put_in_tree() to better align with mas_replace_node(). Inline the remaining logic into the only caller; mas_wmb_replace(). Link: https://lkml.kernel.org/r/20230804165951.2661157-4-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
72bcf4aa |
|
03-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: reorder replacement of nodes to avoid live lock Replacing nodes may cause a live lock-up if CPU resources are saturated by write operations on the tree by continuously retrying on dead nodes. To avoid the continuous retry scenario, ensure the new node is inserted into the tree prior to marking the old data as dead. This will define a window where old and new data is swapped. When reusing lower level nodes, ensure the parent pointer is updated after the parent is marked dead. This ensures that the child is still reachable from the top of the tree, but walking up to a dead node will result in a single retry that will start a fresh walk from the top down through the new node. Link: https://lkml.kernel.org/r/20230804165951.2661157-3-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
83d97f62 |
|
03-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add hex output to maple_arange64 dump Patch series "maple_tree: Change replacement strategy". The maple tree marks nodes dead as soon as they are going to be replaced. This could be problematic when used in the RCU context since the writer may be starved of CPU time by the readers. This patch set addresses the issue by switching the data replacement strategy to one that will only mark data as dead once the new data is available. This series changes the ordering of the node replacement so that the new data is live before the old data is marked 'dead'. When readers hit 'dead' nodes, they will restart from the top of the tree and end up in the new data. In more complex scenarios, the replacement strategy means a subtree is built and graphed into the tree leaving some nodes to point to the old parent. The view of tasks into the old data will either remain with the old data, or see the new data once the old data is marked 'dead'. Iterators will see the 'dead' node and restart on their own and switch to the new data. There is no risk of the reader seeing old data in these cases. The 'dead' subtree of data is then fully marked dead, but reused nodes will still point to the dead nodes until the parent pointer is updated. Walking up to a 'dead' node will cause a re-walk from the top of the tree and enter the new data area where old data is not reachable. Once the parent pointers are fully up to date in the active data, the 'dead' subtree is iterated to collect entirely 'dead' subtrees, and dead nodes (nodes that partially contained reused data). This patch (of 6): When dumping the tree, honour formatting request to output hex for the maple node type arange64. Link: https://lkml.kernel.org/r/20230804165951.2661157-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230804165951.2661157-2-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fec29364 |
|
24-Jul-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: reduce resets during store setup mas_prealloc() may walk partially down the tree before finding that a split or spanning store is needed. When the write occurs, relax the logic on resetting the walk so that partial walks will not restart, but walks that have gone too far (a store that affects beyond the current node) should be restarted. Link: https://lkml.kernel.org/r/20230724183157.3939892-15-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
17983dc6 |
|
24-Jul-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: refine mas_preallocate() node calculations Calculate the number of nodes based on the pending write action instead of assuming the worst case. This addresses a performance regression introduced in platforms that have longer allocation timing. Link: https://lkml.kernel.org/r/20230724183157.3939892-14-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a7496ad5 |
|
24-Jul-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: move mas_wr_end_piv() below mas_wr_extend_null() Relocate it and call mas_wr_extend_null() from within mas_wr_end_piv(). Extending the NULL may affect the end pivot value so call mas_wr_endtend_null() from within mas_wr_end_piv() to keep it all together. Link: https://lkml.kernel.org/r/20230724183157.3939892-12-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c108df76 |
|
24-Jul-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: adjust node allocation on mas_rebalance() mas_rebalance() is called to rebalance an insufficient node into a single node or two sufficient nodes. The preallocation estimate is always too many in this case as the height of the tree will never grow and there is no possibility to have a three way split in this case, so revise the node allocation count. Link: https://lkml.kernel.org/r/20230724183157.3939892-9-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
da089254 |
|
24-Jul-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: re-introduce entry to mas_preallocate() arguments The current preallocation strategy is to preallocate the absolute worst-case allocation for a tree modification. The entry (or NULL) is needed to know how many nodes are needed to write to the tree. Start by adding the argument to the mas_preallocate() definition. Link: https://lkml.kernel.org/r/20230724183157.3939892-8-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
19a462f0 |
|
14-Jul-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: Be more strict about locking Use lockdep to check the write path in the maple tree holds the lock in write mode. Introduce mt_write_lock_is_held() to check if the lock is held for writing. Update the necessary checks for rcu_dereference_protected() to use the new write lock check. Link: https://lkml.kernel.org/r/20230714195551.894800-5-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oliver Sang <oliver.sang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
4ae6944d |
|
15-Jul-2023 |
Mike Rapoport (IBM) <rppt@kernel.org> |
maple_tree: mtree_insert: fix typo in kernel-doc description of GFP flags Replace FGP_FLAGS with GFP_FLAGS Link: https://lkml.kernel.org/r/20230715084038.987955-1-rppt@kernel.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
4445e582 |
|
15-Jul-2023 |
Mike Rapoport (IBM) <rppt@kernel.org> |
maple_tree: mtree_insert*: fix typo in kernel-doc description Replace "Insert and entry at a give index" with "Insert an entry at a given index" Link: https://lkml.kernel.org/r/20230715143920.994812-1-rppt@kernel.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
6783bd4b |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: drop mas_first_entry() The internal function mas_first_entry() is no longer used, so drop it. Link: https://lkml.kernel.org/r/20230711035444.526-9-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
29b2681f |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: replace mas_logical_pivot() with mas_safe_pivot() Replace mas_logical_pivot() with mas_safe_pivot() and drop mas_logical_pivot() since it won't be used anymore. We can do this since now all nodes will have node limit pivot (if it is not full node). Link: https://lkml.kernel.org/r/20230711035444.526-8-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a489539e |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: update mt_validate() Instead of using mas_first_entry() to find the leftmost leaf, use a simple loop instead. Remove an unneeded check for root node. To make the error message more accurate, check pivots first and then slots, because checking slots depend on the node limit pivot to break the loop. Link: https://lkml.kernel.org/r/20230711035444.526-7-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
33af39d0 |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: make mas_validate_limits() check root node and node limit Update mas_validate_limits() to check root node, check node limit pivot if there is enough room for it to exist and check data_end. Remove the check for child existence as it is done in mas_validate_child_slot(). Link: https://lkml.kernel.org/r/20230711035444.526-6-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e93fda5a |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: fix mas_validate_child_slot() to check last missed slot Don't break the loop before checking the last slot. Also here check if non-leaf nodes are missing children. Link: https://lkml.kernel.org/r/20230711035444.526-5-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
f8e5eac8 |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: make mas_validate_gaps() to check metadata Make mas_validate_gaps() check whether the offset in the metadata points to the largest gap. By the way, simplify this function. Add the verification that gaps beyond the node limit are zero. Link: https://lkml.kernel.org/r/20230711035444.526-4-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d695c30a |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: don't use MAPLE_ARANGE64_META_MAX to indicate no gap Patch series "Improve the validation for maple tree and some cleanup", v2. This patch (of 7): Do not use a special offset to indicate that there is no gap. When there is no gap, offset can point to any valid slots because its gap is 0. Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com Link: https://lkml.kernel.org/r/20230711035444.526-3-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
64891ba3 |
|
28-Jun-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: add a fast path case in mas_wr_slot_store() When expanding a range in two directions, only partially overwriting the previous and next ranges, the number of entries will not be increased, so we can just update the pivots as a fast path. However, it may introduce potential risks in RCU mode, because it updates two pivots. We only enable it in non-RCU mode. Link: https://lkml.kernel.org/r/20230628073657.75314-5-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
23e9dde0 |
|
28-Jun-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: optimize mas_wr_append(), also improve duplicating VMAs When the new range can be completely covered by the original last range without touching the boundaries on both sides, two new entries can be appended to the end as a fast path. We update the original last pivot at the end, and the newly appended two entries will not be accessed before this, so it is also safe in RCU mode. This is useful for sequential insertion, which is what we do in dup_mmap(). Enabling BENCH_FORK in test_maple_tree and just running bench_forking() gives the following time-consuming numbers: before: after: 17,874.83 msec 15,738.38 msec It shows about a 12% performance improvement for duplicating VMAs. Link: https://lkml.kernel.org/r/20230628073657.75314-4-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fad9c80e |
|
23-May-2023 |
Thomas Gleixner <tglx@linutronix.de> |
maple_tree: fix a few documentation issues The documentation of mt_next() claims that it starts the search at the provided index. That's incorrect as it starts the search after the provided index. The documentation of mt_find() is slightly confusing. "Handles locking" is not really helpful as it does not explain how the "locking" works. Also the documentation of index talks about a range, while in reality the index is updated on a succesful search to the index of the found entry plus one. Fix similar issues for mt_find_after() and mt_prev(). Reword the confusing "Note: Will not return the zero entry." comment on mt_for_each() and document @__index correctly. Link: https://lkml.kernel.org/r/87ttw2n556.ffs@tglx Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
cfeb6ae8 |
|
18-Aug-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: disable mas_wr_append() when other readers are possible The current implementation of append may cause duplicate data and/or incorrect ranges to be returned to a reader during an update. Although this has not been reported or seen, disable the append write operation while the tree is in rcu mode out of an abundance of caution. During the analysis of the mas_next_slot() the following was artificially created by separating the writer and reader code: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last slot write is to start of slot store current contents in slot overwrite old end pivot mas_next_slot(): read end metadata read old end pivot return with incorrect range store new value Alternatively: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last lost write to end of slot store value mas_next_slot(): read end metadata read old end pivot read new end pivot return with incorrect range set old end pivot There may be other accesses that are not safe since we are now updating both metadata and pointers, so disabling append if there could be rcu readers is the safest action. Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
3c769fd8 |
|
10-Jul-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: set the node limit when creating a new root node Set the node limit of the root node so that the last pivot of all nodes is the node limit (if the node is not full). This patch also fixes a bug in mas_rev_awalk(). Effectively, always setting a maximum makes mas_logical_pivot() behave as mas_safe_pivot(). Without this fix, it is possible that very small tasks would fail to find the correct gap. Although this has not been observed with real tasks, it has been reported to happen in m68k nommu running the maple tree tests. Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20230711035444.526-2-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
7a03ae39 |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: simplify and clean up mas_wr_node_store() Simplify and clean up mas_wr_node_store(), remove unnecessary code. Link: https://lkml.kernel.org/r/20230524031247.65949-10-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e6d1ffd6 |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: rework mas_wr_slot_store() to be cleaner and more efficient. Get whether the two gaps to be overwritten are empty to avoid calling mas_update_gap() all the time. Also clean up the code and add comments. Link: https://lkml.kernel.org/r/20230524031247.65949-9-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
2e1da329 |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: add comments and some minor cleanups to mas_wr_append() Add comment for mas_wr_append(), move mas_update_gap() into mas_wr_append(), and other cleanups to make mas_wr_modify() cleaner. Link: https://lkml.kernel.org/r/20230524031247.65949-8-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c6fc9e4a |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: add mas_wr_new_end() to calculate new_end accurately The previous new_end calculation is inaccurate, because it assumes that two new pivots must be added (this is inaccurate), and sometimes it will miss the fast path and enter the slow path. Add mas_wr_new_end() to accurately calculate new_end to make the conditions for entering the fast path more accurate. Link: https://lkml.kernel.org/r/20230524031247.65949-7-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
8c995a63 |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: make the code symmetrical in mas_wr_extend_null() Just make the code symmetrical to improve readability. Link: https://lkml.kernel.org/r/20230524031247.65949-6-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
bc147f0f |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: simplify mas_is_span_wr() Make the code for detecting spanning writes more concise. Link: https://lkml.kernel.org/r/20230524031247.65949-5-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
14c4b5ab |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: fix the arguments to __must_hold() Fix the arguments to __must_hold() to make sparse work. Link: https://lkml.kernel.org/r/20230524031247.65949-4-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c2aa6f53 |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: drop mas_{rev_}alloc() and mas_fill_gap() mas_{rev_}alloc() and mas_fill_gap() are no longer used, delete them. Link: https://lkml.kernel.org/r/20230524031247.65949-3-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
52371677 |
|
23-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: rework mtree_alloc_{range,rrange}() Patch series "Clean ups for maple tree", v4. Some clean ups, mainly to make the code of maple tree more concise. This patchset has passed the self-test. This patch (of 10): Use mas_empty_area{_rev}() to refactor mtree_alloc_{range,rrange}() Link: https://lkml.kernel.org/r/20230524031247.65949-2-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
6b23a290 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: clear up index and last setting in single entry tree When there is a single entry tree (range of 0-0 pointing to an entry), then ensure the limit is either 0-0 or 1-oo, depending on where the user walks. Ensure the correct node setting as well; either MAS_ROOT or MAS_NONE. Link: https://lkml.kernel.org/r/20230518145544.1722059-33-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
6b9e93e0 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add mas_prev_range() and mas_find_range_rev interface Some users of the maple tree may want to move to the previous range regardless of the value stored there. Add this interface as well as the 'find' variant to support walking to the first value, then iterating over the previous ranges. Link: https://lkml.kernel.org/r/20230518145544.1722059-32-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
dd9a8513 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: introduce mas_prev_slot() interface Sometimes the user needs to revert to the previous slot, regardless of if it is empty or not. Add an interface to go to the previous slot. Since there can't be two consecutive NULLs in the tree, the mas_prev() function can be implemented by calling mas_prev_slot() a maximum of 2 times. Change the underlying interface to use mas_prev_slot() to align the code. Link: https://lkml.kernel.org/r/20230518145544.1722059-31-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
de6e386c |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: relocate mas_rewalk() and mas_rewalk_if_dead() These functions need to move for future use. Link: https://lkml.kernel.org/r/20230518145544.1722059-30-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
6169b553 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add mas_next_range() and mas_find_range() interfaces Some users of the maple tree may want to move to the next range in the tree, even if it stores a NULL. This family of function provides that functionality by advancing one slot at a time and returning the result, while mas_contiguous() will iterate over the range and stop on encountering the first NULL. Link: https://lkml.kernel.org/r/20230518145544.1722059-29-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fff4a58c |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: introduce mas_next_slot() interface Sometimes, during a tree walk, the user needs the next slot regardless of if it is empty or not. Add an interface to get the next slot. Since there are no consecutive NULLs allowed in the tree, the mas_next() function can only advance two slots at most. So use the new mas_next_slot() interface to align both implementations. Use this method for mas_find() as well. Link: https://lkml.kernel.org/r/20230518145544.1722059-28-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
ba997212 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: revise limit checks in mas_empty_area{_rev}() Since the maple tree is inclusive in range, ensure that a range of 1 (min = max) works for searching for a gap in either direction, and make sure the size is at least 1 but not larger than the delta between min and max. This commit also updates the testing. Unfortunately there isn't a way to safely update the tests and code without a test failure. Link: https://lkml.kernel.org/r/20230518145544.1722059-26-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
39193685 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: try harder to keep active node with mas_prev() Keep a reference to the node when possible with mas_prev(). This will avoid re-walking the tree. In keeping a reference to the node, keep the last/index accurate to the range being referenced. This means the limit may be within the range, but the range may extend outside of the limit. Also fix the single entry tree to respect the range (of 0), or set the node to MAS_NONE in the case of shifting beyond 0. Link: https://lkml.kernel.org/r/20230518145544.1722059-25-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
ca80f610 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: try harder to keep active node after mas_next() Clean up the mas_next() call to try and keep a node reference when possible. This will avoid re-walking the tree in most cases. Also clean up the single entry tree handling to ensure index/last are consistent with what one would expect. (returning NULL with limit of 1-oo). Link: https://lkml.kernel.org/r/20230518145544.1722059-24-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d0411860 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: mas_start() reset depth on dead node When a dead node is detected, the depth has already been set to 1 so reset it to 0. Link: https://lkml.kernel.org/r/20230518145544.1722059-22-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
23e734ec |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: remove unnecessary check from mas_destroy() mas_destroy currently checks if mas->node is MAS_START prior to calling mas_start(), but this is unnecessary as mas_start() will do nothing if the node is anything but MAS_START. Link: https://lkml.kernel.org/r/20230518145544.1722059-21-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
acd4de60 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: return error on mte_pivots() out of range Rename mte_pivots() to mas_pivots() and pass through the ma_state to set the error code to -EIO when the offset is out of range for the node type. Change the WARN_ON() to MAS_WARN_ON() to log the maple state. Link: https://lkml.kernel.org/r/20230518145544.1722059-16-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
bec1b51e |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use MAS_BUG_ON() prior to calling mas_meta_gap() Replace the call to BUG_ON() in mas_meta_gap() with calls before the function call MAS_BUG_ON() to get more information on error condition. Link: https://lkml.kernel.org/r/20230518145544.1722059-15-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
1c414c6a |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use MAS_WR_BUG_ON() in mas_store_prealloc() mas_store_prealloc() should never fail, but if it does due to internal tree issues then get as much debug information as possible prior to crashing the kernel. Link: https://lkml.kernel.org/r/20230518145544.1722059-14-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
4bbd1748 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use MAS_BUG_ON() from mas_topiary_range() In the even of trying to remove data from a leaf node by use of mas_topiary_range(), log the maple state. Link: https://lkml.kernel.org/r/20230518145544.1722059-13-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
5950ada9 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use MAS_BUG_ON() in mas_set_height() Use MAS_BUG_ON() instead of MT_BUG_ON() to get the maple state information. In the unlikely event of a tree height of > 31, try to increase the probability of useful information being logged. Link: https://lkml.kernel.org/r/20230518145544.1722059-12-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
bf96715e |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: use MAS_BUG_ON() when setting a leaf node as a parent Use MAS_BUG_ON() to dump the maple state and tree in the unlikely event of an issue. Link: https://lkml.kernel.org/r/20230518145544.1722059-11-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e6d6792a |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@Oracle.com> |
maple_tree: convert debug code to use MT_WARN_ON() and MAS_WARN_ON() Using MT_WARN_ON() allows for the removal of if statements before logging. Using MAS_WARN_ON() will provide more information when issues are encountered. Link: https://lkml.kernel.org/r/20230518145544.1722059-10-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0d7c52bb |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@Oracle.com> |
maple_tree: convert BUG_ON() to MT_BUG_ON() Use MT_BUG_ON() to get more information when running with MAPLE_TREE_DEBUG enabled. Link: https://lkml.kernel.org/r/20230518145544.1722059-8-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
f0a1f866 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add debug BUG_ON and WARN_ON variants Add debug macros to dump the maple state and/or the tree for both warning and bug_on calls. Link: https://lkml.kernel.org/r/20230518145544.1722059-7-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
89f499f3 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@Oracle.com> |
maple_tree: add format option to mt_dump() Allow different formatting strings to be used when dumping the tree. Currently supports hex and decimal. Link: https://lkml.kernel.org/r/20230518145544.1722059-6-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c3eb787e |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: clean up mas_dfs_postorder() Convert loop type to ensure all variables are set to make the compiler happy, and use the mas_is_none() function instead of explicitly checking the node in the maple state. Link: https://lkml.kernel.org/r/20230518145544.1722059-5-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
633769c9 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: avoid unnecessary ascending The maple tree node limits are implied by the parent. When walking up the tree, the limit may not be known until a slot that does not have implied limits are encountered. However, if the node is the left-most or right-most node, the walking up to find that limit can be skipped. This commit also fixes the debug/testing code that was not setting the limit on walking down the tree as that optimization is not compatible with this change. Link: https://lkml.kernel.org/r/20230518145544.1722059-4-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
afc754c6 |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@Oracle.com> |
maple_tree: clean up mas_parent_enum() and rename to mas_parent_type() mas_parent_enum() is a simple wrapper for mte_parent_enum() which is only called from that wrapper. Remove the wrapper and inline mte_parent_enum() into mas_parent_enum(). At the same time, clean up the bit masking of the root pointer since it cannot be set by the time the bit masking occurs. Change the check on the root bit to a WARN_ON(), and fix the verification code to not trigger the WARN_ON() before checking if the node is root. Align the name to mas_parent_type() since mas_node_type() exists already. Link: https://lkml.kernel.org/r/20230518145544.1722059-3-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: David Binderman <dcb314@hotmail.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
5729e06c |
|
18-May-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: fix static analyser cppcheck issue Patch series "Maple tree mas_{next,prev}_range() and cleanup", v4. This patchset contains a number of clean ups to the code to make it more usable (next/prev range), the addition of debug output formatting, the addition of printing the maple state information in the WARN_ON/BUG_ON code. There is also work done here to keep nodes active during iterations to reduce the necessity of re-walking the tree. Finally, there is a new interface added to move to the next or previous range in the tree, even if it is empty. The organisation of the patches is as follows: 0001-0004 - Small clean ups 0005-0018 - Additional debug options and WARN_ON/BUG_ON changes 0019 - Test module __init and __exit addition 0020-0021 - More functional clean ups 0022-0026 - Changes to keep nodes active 0027-0034 - Add new mas_{prev,next}_range() 0035 - Use new mas_{prev,next}_range() in mmap_region() This patch (of 35): Static analyser of the maple tree code noticed that the split variable is being used to dereference into an array prior to checking the variable itself. Fix this issue by changing the order of the statement to check the variable first. Link: https://lkml.kernel.org/r/20230518145544.1722059-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230518145544.1722059-2-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: David Binderman <dcb314@hotmail.com> Reviewed-by: Peng Zhang<zhangpeng.00@bytedance.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
cd00dd25 |
|
05-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: fix potential out-of-bounds access in mas_wr_end_piv() Check the write offset end bounds before using it as the offset into the pivot array. This avoids a possible out-of-bounds access on the pivot array if the write extends to the last slot in the node, in which case the node maximum should be used as the end pivot. akpm: this doesn't affect any current callers, but new users of mapletree may encounter this problem if backported into earlier kernels, so let's fix it in -stable kernels in case of this. Link: https://lkml.kernel.org/r/20230506024752.2550-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0257d990 |
|
05-May-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: make maple state reusable after mas_empty_area() Make mas->min and mas->max point to a node range instead of a leaf entry range. This allows mas to still be usable after mas_empty_area() returns. Users would get unexpected results from other operations on the maple state after calling the affected function. For example, x86 MAP_32BIT mmap() acts as if there is no suitable gap when there should be one. Link: https://lkml.kernel.org/r/20230505145829.74574-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reported-by: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Reported-by: Tad <support@spotco.us> Reported-by: Michael Keyes <mgkeyes@vigovproductions.net> Link: https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/ Link: https://lore.kernel.org/linux-mm/e6108286ac025c268964a7ead3aab9899f9bc6e9.camel@spotco.us/ Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
29ad6bb3 |
|
19-Apr-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: fix allocation in mas_sparse_area() In the case of reverse allocation, mas->index and mas->last do not point to the correct allocation range, which will cause users to get incorrect allocation results, so fix it. If the user does not use it in a specific way, this bug will not be triggered. This is a bug, but only VMA uses it now, the way VMA is used now will not trigger it. There is a possibility that a user will trigger it in the future. Also re-check whether the size is still satisfied after the lower bound was increased, which is a corner case and is incorrect in previous versions. Link: https://lkml.kernel.org/r/20230419093625.99201-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fb20e99a |
|
10-Apr-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: use correct variable type in sizeof The type of variable pointed to by pivs is unsigned long, but the type used in sizeof is a pointer type. Change it to unsigned long. This change has no runtime effect, as sizeof(ul) == sizeof(ul *). Link: https://lkml.kernel.org/r/20230411023513.15227-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reported-by: David Binderman <dcb314@hotmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
97f7e094 |
|
14-Mar-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: simplify mas_wr_node_walk() Simplify code of mas_wr_node_walk() without changing functionality, and improve readability. Remove some special judgments. Instead of dynamically recording the min and max in the loop, get the final min and max directly at the end. Link: https://lkml.kernel.org/r/20230314124203.91572-3-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
5c63a7c3 |
|
01-Mar-2023 |
Danilo Krummrich <dakr@redhat.com> |
maple_tree: export symbol mas_preallocate() Fix missing EXPORT_SYMBOL_GPL() statement for mas_preallocate(). It isn't actually used by anything yet, but mas_preallocate() is part of the maple tree's 'Advanced API'. All other functions of this API are exported already. Link: https://lkml.kernel.org/r/20230302011035.4928-1-dakr@redhat.com Signed-off-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
06e8fd99 |
|
14-Apr-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: fix mas_empty_area() search The internal function of mas_awalk() was incorrectly skipping the last entry in a node, which could potentially be NULL. This is only a problem for the left-most node in the tree - otherwise that NULL would not exist. Fix mas_awalk() by using the metadata to obtain the end of the node for the loop and the logical pivot as apposed to the raw pivot value. Link: https://lkml.kernel.org/r/20230414145728.4067069-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fad8e429 |
|
14-Apr-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: make maple state reusable after mas_empty_area_rev() Stop using maple state min/max for the range by passing through pointers for those values. This will allow the maple state to be reused without resetting. Also add some logic to fail out early on searching with invalid arguments. Link: https://lkml.kernel.org/r/20230414145728.4067069-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
1f5f12ec |
|
10-Apr-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: fix a potential memory leak, OOB access, or other unpredictable bug In mas_alloc_nodes(), "node->node_count = 0" means to initialize the node_count field of the new node, but the node may not be a new node. It may be a node that existed before and node_count has a value, setting it to 0 will cause a memory leak. At this time, mas->alloc->total will be greater than the actual number of nodes in the linked list, which may cause many other errors. For example, out-of-bounds access in mas_pop_node(), and mas_pop_node() may return addresses that should not be used. Fix it by initializing node_count only for new nodes. Also, by the way, an if-else statement was removed to simplify the code. Link: https://lkml.kernel.org/r/20230411041005.26205-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c45ea315 |
|
14-Mar-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: fix a potential concurrency bug in RCU mode There is a concurrency bug that may cause the wrong value to be loaded when a CPU is modifying the maple tree. CPU1: mtree_insert_range() mas_insert() mas_store_root() ... mas_root_expand() ... rcu_assign_pointer(mas->tree->ma_root, mte_mk_root(mas->node)); ma_set_meta(node, maple_leaf_64, 0, slot); <---IP CPU2: mtree_load() mtree_lookup_walk() ma_data_end(); When CPU1 is about to execute the instruction pointed to by IP, the ma_data_end() executed by CPU2 may return the wrong end position, which will cause the value loaded by mtree_load() to be wrong. An example of triggering the bug: Add mdelay(100) between rcu_assign_pointer() and ma_set_meta() in mas_root_expand(). static DEFINE_MTREE(tree); int work(void *p) { unsigned long val; for (int i = 0 ; i< 30; ++i) { val = (unsigned long)mtree_load(&tree, 8); mdelay(5); pr_info("%lu",val); } return 0; } mt_init_flags(&tree, MT_FLAGS_USE_RCU); mtree_insert(&tree, 0, (void*)12345, GFP_KERNEL); run_thread(work) mtree_insert(&tree, 1, (void*)56789, GFP_KERNEL); In RCU mode, mtree_load() should always return the value before or after the data structure is modified, and in this example mtree_load(&tree, 8) may return 56789 which is not expected, it should always return NULL. Fix it by put ma_set_meta() before rcu_assign_pointer(). Link: https://lkml.kernel.org/r/20230314124203.91572-4-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
ec07967d |
|
14-Mar-2023 |
Peng Zhang <zhangpeng.00@bytedance.com> |
maple_tree: fix get wrong data_end in mtree_lookup_walk() if (likely(offset > end)) max = pivots[offset]; The above code should be changed to if (likely(offset < end)), which is correct. This affects the correctness of ma_data_end(). Now it seems that the final result will not be wrong, but it is best to change it. This patch does not change the code as above, because it simplifies the code by the way. Link: https://lkml.kernel.org/r/20230314124203.91572-1-zhangpeng.00@bytedance.com Link: https://lkml.kernel.org/r/20230314124203.91572-2-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
790e1fa8 |
|
27-Feb-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add RCU lock checking to rcu callback functions Dereferencing RCU objects within the RCU callback without the RCU check has caused lockdep to complain. Fix the RCU dereferencing by using the RCU callback lock to ensure the operation is safe. Also stop creating a new lock to use for dereferencing during destruction of the tree or subtree. Instead, pass through a pointer to the tree that has the lock that is held for RCU dereferencing checking. It also does not make sense to use the maple state in the freeing scenario as the tree walk is a special case where the tree no longer has the normal encodings and parent pointers. Link: https://lkml.kernel.org/r/20230227173632.3292573-8-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0a2b18d9 |
|
27-Feb-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: add smp_rmb() to dead node detection Add an smp_rmb() before reading the parent pointer to ensure that anything read from the node prior to the parent pointer hasn't been reordered ahead of this check. The is necessary for RCU mode. Link: https://lkml.kernel.org/r/20230227173632.3292573-7-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c13af03d |
|
27-Feb-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: fix write memory barrier of nodes once dead for RCU mode During the development of the maple tree, the strategy of freeing multiple nodes changed and, in the process, the pivots were reused to store pointers to dead nodes. To ensure the readers see accurate pivots, the writers need to mark the nodes as dead and call smp_wmb() to ensure any readers can identify the node as dead before using the pivot values. There were two places where the old method of marking the node as dead without smp_wmb() were being used, which resulted in RCU readers seeing the wrong pivot value before seeing the node was dead. Fix this race condition by using mte_set_node_dead() which has the smp_wmb() call to ensure the race is closed. Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being freed are marked as dead to ensure there are no other call paths besides the two updated paths. This is necessary for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-6-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
8372f4d8 |
|
27-Feb-2023 |
Liam Howlett <Liam.Howlett@oracle.com> |
maple_tree: remove extra smp_wmb() from mas_dead_leaves() The call to mte_set_dead_node() before the smp_wmb() already calls smp_wmb() so this is not needed. This is an optimization for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-5-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
2e5b4921 |
|
27-Feb-2023 |
Liam Howlett <Liam.Howlett@oracle.com> |
maple_tree: fix freeing of nodes in rcu mode The walk to destroy the nodes was not always setting the node type and would result in a destroy method potentially using the values as nodes. Avoid this by setting the correct node types. This is necessary for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-4-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a7b92d59 |
|
27-Feb-2023 |
Liam Howlett <Liam.Howlett@oracle.com> |
maple_tree: detect dead nodes in mas_start() When initially starting a search, the root node may already be in the process of being replaced in RCU mode. Detect and restart the walk if this is the case. This is necessary for RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-3-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
39d0bd86 |
|
27-Feb-2023 |
Liam Howlett <Liam.Howlett@oracle.com> |
maple_tree: be more cautious about dead nodes Patch series "Fix VMA tree modification under mmap read lock". Syzbot reported a BUG_ON in mm/mmap.c which was found to be caused by an inconsistency between threads walking the VMA maple tree. The inconsistency is caused by the page fault handler modifying the maple tree while holding the mmap_lock for read. This only happens for stack VMAs. We had thought this was safe as it only modifies a single pivot in the tree. Unfortunately, syzbot constructed a test case where the stack had no guard page and grew the stack to abut the next VMA. This causes us to delete the NULL entry between the two VMAs and rewrite the node. We considered several options for fixing this, including dropping the mmap_lock, then reacquiring it for write; and relaxing the definition of the tree to permit a zero-length NULL entry in the node. We decided the best option was to backport some of the RCU patches from -next, which solve the problem by allocating a new node and RCU-freeing the old node. Since the problem exists in 6.1, we preferred a solution which is similar to the one we intended to merge next merge window. These patches have been in -next since next-20230301, and have received intensive testing in Android as part of the RCU page fault patchset. They were also sent as part of the "Per-VMA locks" v4 patch series. Patches 1 to 7 are bug fixes for RCU mode of the tree and patch 8 enables RCU mode for the tree. Performance v6.3-rc3 vs patched v6.3-rc3: Running these changes through mmtests showed there was a 15-20% performance decrease in will-it-scale/brk1-processes. This tests creating and inserting a single VMA repeatedly through the brk interface and isn't representative of any real world applications. This patch (of 8): ma_pivots() and ma_data_end() may be called with a dead node. Ensure to that the node isn't dead before using the returned values. This is necessary for RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230327185532.2354250-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230227173632.3292573-1-surenb@google.com Link: https://lkml.kernel.org/r/20230227173632.3292573-2-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjun Roy <arjunroy@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Li <chriscli@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: freak07 <michalechner92@googlemail.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Oskolkov <posk@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Shakeel Butt <shakeelb@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0fa99fdf |
|
07-Mar-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: fix mas_skip_node() end slot detection Patch series "Fix mas_skip_node() for mas_empty_area()", v2. mas_empty_area() was incorrectly returning an error when there was room. The issue was tracked down to mas_skip_node() using the incorrect end-of-slot count. Instead of using the nodes hard limit, the limit of data should be used. mas_skip_node() was also setting the min and max to that of the child node, which was unnecessary. Within these limits being set, there was also a bug that corrupted the maple state's max if the offset was set to the maximum node pivot. The bug was without consequence unless there was a sufficient gap in the next child node which would cause an error to be returned. This patch set fixes these errors by removing the limit setting from mas_skip_node() and uses the mas_data_end() for slot limits, and adds tests for all failures discovered. This patch (of 2): mas_skip_node() is used to move the maple state to the node with a higher limit. It does this by walking up the tree and increasing the slot count. Since slot count may not be able to be increased, it may need to walk up multiple times to find room to walk right to a higher limit node. The limit of slots that was being used was the node limit and not the last location of data in the node. This would cause the maple state to be shifted outside actual data and enter an error state, thus returning -EBUSY. The result of the incorrect error state means that mas_awalk() would return an error instead of finding the allocation space. The fix is to use mas_data_end() in mas_skip_node() to detect the nodes data end point and continue walking the tree up until it is safe to move to a node with a higher limit. The walk up the tree also sets the maple state limits so remove the buggy code from mas_skip_node(). Setting the limits had the unfortunate side effect of triggering another bug if the parent node was full and the there was no suitable gap in the second last child, but room in the next child. mas_skip_node() may also be passed a maple state in an error state from mas_anode_descend() when no allocations are available. Return on such an error state immediately. Link: https://lkml.kernel.org/r/20230307180247.2220303-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230307180247.2220303-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Snild Dolkow <snild@sony.com> Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/ Tested-by: Snild Dolkow <snild@sony.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
44081c77 |
|
14-Feb-2023 |
Arnd Bergmann <arnd@arndb.de> |
maple_tree: reduce stack usage with gcc-9 and earlier gcc-10 changed the way inlining works to be less aggressive, but older versions run into an oversized stack frame warning whenever CONFIG_KASAN_STACK is enabled, as that forces variables from inlined callees to be non-overlapping: lib/maple_tree.c: In function 'mas_wr_bnode': lib/maple_tree.c:4320:1: error: the frame size of 1424 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Change the annotations on mas_store_b_node() and mas_commit_b_node() to explicitly forbid inlining in this configuration, which is the same behavior that newer versions already have. Link: https://lkml.kernel.org/r/20230214103030.1051950-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
17dc622c |
|
20-Jan-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: fix mas_prev() and mas_find() state handling When mas_prev() does not find anything, set the state to MAS_NONE. Handle the MAS_NONE in mas_find() like a MAS_START. Link: https://lkml.kernel.org/r/20230120162650.984577-7-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: <syzbot+502859d610c661e56545@syzkaller.appspotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
1202700c |
|
20-Jan-2023 |
Liam R. Howlett <Liam.Howlett@oracle.com> |
maple_tree: fix handle of invalidated state in mas_wr_store_setup() If an invalidated maple state is encountered during write, reset the maple state to MAS_START. This will result in a re-walk of the tree to the correct location for the write. Link: https://lore.kernel.org/all/20230107020126.1627-1-sj@kernel.org/ Link: https://lkml.kernel.org/r/20230120162650.984577-6-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
50e81c82 |
|
20-Jan-2023 |
Liam R. Howlett <Liam.Howlett@Oracle.com> |
maple_tree: reduce user error potential When iterating, a user may operate on the tree and cause the maple state to be altered and left in an unintuitive state. Detect this scenario and correct it by setting to the limit and invalidating the state. Link: https://lkml.kernel.org/r/20230120162650.984577-4-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
65be6f05 |
|
20-Jan-2023 |
Liam R. Howlett <Liam.Howlett@Oracle.com> |
maple_tree: fix potential rcu issue Ensure the node isn't dead after reading the node end. Link: https://lkml.kernel.org/r/20230120162650.984577-3-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
f942b0f0 |
|
11-Jan-2023 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: fix comment of mte_destroy_walk The parameter name of maple tree is mt, make the comment be mt instead of mn, and the separator between the parameter name and the description to be : instead of -. Link: https://lkml.kernel.org/r/20230111135348.803181-1-vernon2gm@gmail.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c5d5546e |
|
10-Jan-2023 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: remove the parameter entry of mas_preallocate The parameter entry of mas_preallocate is not used, so drop it. Link: https://lkml.kernel.org/r/20230110154211.1758562-1-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
541e06b7 |
|
05-Jan-2023 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk() Preallocations are common in the VMA code to avoid allocating under certain locking conditions. The preallocations must also cover the worst-case scenario. Removing the GFP_ZERO flag from the kmem_cache_alloc() (and bulk variant) calls will reduce the amount of time spent zeroing memory that may not be used. Only zero out the necessary area to keep track of the allocations in the maple state. Zero the entire node prior to using it in the tree. This required internal changes to node counting on allocation, so the test code is also updated. This restores some micro-benchmark performance: up to +9% in mmtests mmap1 by my testing +10% to +20% in mmap, mmapaddr, mmapmany tests reported by Red Hat Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636 Link: https://lkml.kernel.org/r/20230105160427.2988454-1-Liam.Howlett@oracle.com Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Reported-by: Jirka Hladky <jhladky@redhat.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e11cb683 |
|
20-Dec-2022 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: refine mab_calc_split function Invert the conditional judgment of the mid_split, to focus the return statement in the last statement, which is easier to understand and for better readability. Link: https://lkml.kernel.org/r/20221221060058.609003-8-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
46b34584 |
|
20-Dec-2022 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: refine ma_state init from mas_start() If mas->node is an MAS_START, there are three cases, and they all assign different values to mas->node and mas->offset. So there is no need to set them to a default value before updating. Update them directly to make them easier to understand and for better readability. Link: https://lkml.kernel.org/r/20221221060058.609003-7-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
84fd3e1e |
|
20-Dec-2022 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: use macro MA_ROOT_PARENT instead of number When you need to compare whether node->parent is parent of the root node, using macro MA_ROOT_PARENT is easier to understand and for better readability. Link: https://lkml.kernel.org/r/20221221060058.609003-5-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
bd592703 |
|
20-Dec-2022 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: use mt_node_max() instead of direct operations mt_max[] Use mt_node_max() to get the maximum number of slots for a node, rather than direct operations mt_max[], makes it better portability. Link: https://lkml.kernel.org/r/20221221060058.609003-4-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d56c593c |
|
20-Dec-2022 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: remove extra return statement For functions with a return type of void, it is unnecessary to add a reurn statement at the end of the function, so drop it. Link: https://lkml.kernel.org/r/20221221060058.609003-3-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
831978e3 |
|
20-Dec-2022 |
Vernon Yang <vernon2gm@gmail.com> |
maple_tree: remove extra space and blank line Patch series "Clean up and refinement for maple tree", v2. This patchset cleans up and refines some maple tree code. A few small changes make the code easier to understand and for better readability. This patch (of 7): These extra space and blank lines are unnecessary, so drop them. Link: https://lkml.kernel.org/r/20221221060058.609003-1-vernon2gm@gmail.com Link: https://lkml.kernel.org/r/20221221060058.609003-2-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
ab6ef70a |
|
12-Nov-2022 |
Wei Yang <richard.weiyang@gmail.com> |
maple_tree: should get pivots boundary by type We should get pivots boundary by type. Fixes a potential overindexing of mt_pivots[]. Link: https://lkml.kernel.org/r/20221112234308.23823-1-richard.weiyang@gmail.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
7327e811 |
|
11-Jan-2023 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: fix mas_empty_area_rev() lower bound validation mas_empty_area_rev() was not correctly validating the start of a gap against the lower limit. This could lead to the range starting lower than the requested minimum. Fix the issue by better validating a gap once one is found. This commit also adds tests to the maple tree test suite for this issue and tests the mas_empty_area() function for similar bound checking. Link: https://lkml.kernel.org/r/20230111200136.1851322-1-Liam.Howlett@oracle.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216911 Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: <amanieu@gmail.com> Link: https://lore.kernel.org/linux-mm/0b9f5425-08d4-8013-aa4c-e620c3b10bb2@leemhuis.info/ Tested-by: Holger Hoffsttte <holger@applied-asynchrony.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0abb964a |
|
19-Dec-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: fix mas_spanning_rebalance() on insufficient data Mike Rapoport contacted me off-list with a regression in running criu. Periodic tests fail with an RCU stall during execution. Although rare, it is possible to hit this with other uses so this patch should be backported to fix the regression. This patchset adds the fix and a test case to the maple tree test suite. This patch (of 2): An insufficient node was causing an out-of-bounds access on the node in mas_leaf_max_gap(). The cause was the faulty detection of the new node being a root node when overwriting many entries at the end of the tree. Fix the detection of a new root and ensure there is sufficient data prior to entering the spanning rebalance loop. Link: https://lkml.kernel.org/r/20221219161922.2708732-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20221219161922.2708732-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Mike Rapoport <rppt@kernel.org> Tested-by: Mike Rapoport <rppt@kernel.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d98c86b9 |
|
25-Oct-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: fix mas_find_rev() comment mas_find_rev() uses mas_prev_entry(), not mas_next_entry(), correct comment. Link: https://lkml.kernel.org/r/20221025173756.2719616-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
6e7ba8b5 |
|
28-Oct-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: mte_set_full() and mte_clear_full() clang-analyzer clean up mte_set_full() and mte_clear_full() were incorrectly setting a pointer to a value without returning a result. Fix this by returning the modified pointer to be use as necessary. Also add a third function to return if the bit is set or not. Link: https://lore.kernel.org/lkml/20221026120029.12555-1-lukas.bulwahn@gmail.com/ Link: https://lkml.kernel.org/r/20221028144520.2776767-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
7dc5ba62 |
|
07-Nov-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: don't set a new maximum on the node when not reusing nodes In RCU mode, the node limits were being updated to the last pivot which may not be correct and would cause the metadata to be set when it shouldn't. Fix this by not setting a new limit in this case. Link: https://lkml.kernel.org/r/20221107163857.867377-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
9bbba563 |
|
07-Nov-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: fix depth tracking in maple_state It is possible to confuse the depth tracking in the maple state by searching the same node for values. Fix the depth tracking by moving where the depth is incremented closer to where the node changes level. Also change the initial depth setting when using the root node. Link: https://lkml.kernel.org/r/20221107163814.866612-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
120b1162 |
|
28-Oct-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: reorganize testing to restore module testing Along the development cycle, the testing code support for module/in-kernel compiles was removed. Restore this functionality by moving any internal API tests to the userspace side, as well as threading tests. Fix the lockdep issues and add a way to reduce memory usage so the tests can complete with KASAN + memleak detection. Make the tests work on 32 bit hosts where possible and detect 32 bit hosts in the radix test suite. [akpm@linux-foundation.org: fix module export] [akpm@linux-foundation.org: fix it some more] [liam.howlett@oracle.com: fix compile warnings on 32bit build in check_find()] Link: https://lkml.kernel.org/r/20221107203816.1260327-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20221028180415.3074673-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
9a887877 |
|
26-Oct-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: mas_anode_descend() clang-analyzer cleanup clang-analyzer reported some Dead Stores in mas_anode_descend(). Upon inspection, there were a few clean ups that would make the code cleaner: The count variable was set from the mt_slots array and then updated but never used again. Just use the array reference directly. Also stop updating the type since it isn't used after the update. Stop setting the gaps pointer to NULL at the start since it is always set before the loop begins. Link: https://lkml.kernel.org/r/20221026151413.4032730-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c61b3a2b |
|
26-Oct-2022 |
Liam Howlett <liam.howlett@oracle.com> |
maple_tree: remove pointer to pointer use in mas_alloc_nodes() There is a more direct and cleaner way of implementing the same functional code. Remove the confusing and unnecessary use of pointers here. Link: https://lkml.kernel.org/r/20221026151241.4031117-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
1b9c9183 |
|
26-Oct-2022 |
Lukas Bulwahn <lukas.bulwahn@gmail.com> |
lib: maple_tree: remove unneeded initialization in mtree_range_walk() Before the do-while loop in mtree_range_walk(), the variables next, min, max need to be initialized. The variables last, prev_min and prev_max are set within the loop body before they are eventually used after exiting the loop body. As it is a do-while loop, the loop body is executed at least once, so the variables last, prev_min and prev_max do not need to be initialized before the loop body. Remove unneeded initialization of last and prev_min. The needless initialization was reported by clang-analyzer as Dead Stores. As the compiler already identifies these assignments as unneeded, it optimizes the assignments away. Hence: No functional change. No change in object code. Link: https://lkml.kernel.org/r/20221026120029.12555-2-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
54a611b6 |
|
06-Sep-2022 |
Liam R. Howlett <Liam.Howlett@Oracle.com> |
Maple Tree: add new data structure Patch series "Introducing the Maple Tree" The maple tree is an RCU-safe range based B-tree designed to use modern processor cache efficiently. There are a number of places in the kernel that a non-overlapping range-based tree would be beneficial, especially one with a simple interface. If you use an rbtree with other data structures to improve performance or an interval tree to track non-overlapping ranges, then this is for you. The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf nodes. With the increased branching factor, it is significantly shorter than the rbtree so it has fewer cache misses. The removal of the linked list between subsequent entries also reduces the cache misses and the need to pull in the previous and next VMA during many tree alterations. The first user that is covered in this patch set is the vm_area_struct, where three data structures are replaced by the maple tree: the augmented rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The long term goal is to reduce or remove the mmap_lock contention. The plan is to get to the point where we use the maple tree in RCU mode. Readers will not block for writers. A single write operation will be allowed at a time. A reader re-walks if stale data is encountered. VMAs would be RCU enabled and this mode would be entered once multiple tasks are using the mm_struct. Davidlor said : Yes I like the maple tree, and at this stage I don't think we can ask for : more from this series wrt the MM - albeit there seems to still be some : folks reporting breakage. Fundamentally I see Liam's work to (re)move : complexity out of the MM (not to say that the actual maple tree is not : complex) by consolidating the three complimentary data structures very : much worth it considering performance does not take a hit. This was very : much a turn off with the range locking approach, which worst case scenario : incurred in prohibitive overhead. Also as Liam and Matthew have : mentioned, RCU opens up a lot of nice performance opportunities, and in : addition academia[1] has shown outstanding scalability of address spaces : with the foundation of replacing the locked rbtree with RCU aware trees. A similar work has been discovered in the academic press https://pdos.csail.mit.edu/papers/rcuvm:asplos12.pdf Sheer coincidence. We designed our tree with the intention of solving the hardest problem first. Upon settling on a b-tree variant and a rough outline, we researched ranged based b-trees and RCU b-trees and did find that article. So it was nice to find reassurances that we were on the right path, but our design choice of using ranges made that paper unusable for us. This patch (of 70): The maple tree is an RCU-safe range based B-tree designed to use modern processor cache efficiently. There are a number of places in the kernel that a non-overlapping range-based tree would be beneficial, especially one with a simple interface. If you use an rbtree with other data structures to improve performance or an interval tree to track non-overlapping ranges, then this is for you. The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf nodes. With the increased branching factor, it is significantly shorter than the rbtree so it has fewer cache misses. The removal of the linked list between subsequent entries also reduces the cache misses and the need to pull in the previous and next VMA during many tree alterations. The first user that is covered in this patch set is the vm_area_struct, where three data structures are replaced by the maple tree: the augmented rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The long term goal is to reduce or remove the mmap_lock contention. The plan is to get to the point where we use the maple tree in RCU mode. Readers will not block for writers. A single write operation will be allowed at a time. A reader re-walks if stale data is encountered. VMAs would be RCU enabled and this mode would be entered once multiple tasks are using the mm_struct. There is additional BUG_ON() calls added within the tree, most of which are in debug code. These will be replaced with a WARN_ON() call in the future. There is also additional BUG_ON() calls within the code which will also be reduced in number at a later date. These exist to catch things such as out-of-range accesses which would crash anyways. Link: https://lkml.kernel.org/r/20220906194824.2110408-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20220906194824.2110408-2-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: David Howells <dhowells@redhat.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|