#
9cbacb83 |
|
16-Feb-2024 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc: Don't ignore errors from set_memory_{n}p() in __kernel_map_pages() set_memory_p() and set_memory_np() can fail. As mentioned in linux/mm.h: /* * To support DEBUG_PAGEALLOC architecture must ensure that * __kernel_map_pages() never fails */ So panic in case set_memory_p() or set_memory_np() fail in __kernel_map_pages(). Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20ef75884aa6a636e8298736f3d1056b0793d3d9.1708078640.git.christophe.leroy@csgroup.eu
|
#
773b93f1 |
|
04-Dec-2023 |
Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org> |
powerpc/book3s/hash: Drop _PAGE_PRIVILEGED from PAGE_NONE There used to be a dependency on _PAGE_PRIVILEGED with pte_savedwrite. But that got dropped by commit 6a56ccbcf6c6 ("mm/autonuma: use can_change_(pte|pmd)_writable() to replace savedwrite") With the change in this patch numa fault pte (pte_protnone()) gets mapped as regular user pte with RWX cleared (no-access) whereas earlier it used to be mapped _PAGE_PRIVILEGED. Hash fault handling code gets some WARN_ON added in this patch because those functions are not expected to get called with _PAGE_READ cleared. commit 18061c17c8ec ("powerpc/mm: Update PROTFAULT handling in the page fault path") explains the details. Signed-off-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231204093638.71503-1-aneesh.kumar@kernel.org
|
#
9fee28ba |
|
02-Aug-2023 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
powerpc: implement the new page table range API Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio. [willy@infradead.org: re-export flush_dcache_icache_folio()] Link: https://lkml.kernel.org/r/ZMx1daYwvD9EM7Cv@casper.infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-22-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
dea18da4 |
|
16-Dec-2022 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: Fix stress_hpt memblock alloc alignment The stress_hpt memblock allocation did not pass in an alignment, which causes a stack dump in early boot (that I missed, oops). Fixes: 6b34a099faa1 ("powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216115930.2667772-2-npiggin@gmail.com
|
#
f12cd061 |
|
28-Dec-2022 |
Yang Yingliang <yangyingliang@huawei.com> |
powerpc/64s/hash: Make stress_hpt_timer_fn() static stress_hpt_timer_fn() is only used in hash_utils.c, make it static. Fixes: 6b34a099faa1 ("powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221228093603.3166599-1-yangyingliang@huawei.com
|
#
6b34a099 |
|
23-Oct-2022 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults This option increases the number of hash misses by limiting the number of kernel HPT entries, by keeping a per-CPU record of the last kernel HPTEs installed, and removing that from the hash table on the next hash insertion. A timer round-robins CPUs removing remaining kernel HPTEs and clearing the TLB (in the case of bare metal) to increase and slightly randomise kernel fault activity. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add comment about NR_CPUS usage, fixup whitespace] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221024030150.852517-1-npiggin@gmail.com
|
#
b12eb279 |
|
13-Oct-2022 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: make linear_map_hash_lock a raw spinlock This lock is taken while the raw kfence_freelist_lock is held, so it must also be a raw spinlock, as reported by lockdep when raw lock nesting checking is enabled. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013230710.1987253-3-npiggin@gmail.com
|
#
a5edf981 |
|
26-Sep-2022 |
Nicholas Miehlbradt <nicholas@linux.ibm.com> |
powerpc/64s: Enable KFENCE on book3s64 KFENCE support was added for ppc32 in commit 90cbac0e995d ("powerpc: Enable KFENCE for PPC32"). Enable KFENCE on ppc64 architecture with hash and radix MMUs. It uses the same mechanism as debug pagealloc to protect/unprotect pages. All KFENCE kunit tests pass on both MMUs. KFENCE memory is initially allocated using memblock but is later marked as SLAB allocated. This necessitates the change to __pud_free to ensure that the KFENCE pages are freed appropriately. Based on previous work by Christophe Leroy and Jordan Niethe. Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-4-nicholas@linux.ibm.com
|
#
d7902d31 |
|
26-Sep-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc/64s: Allow double call of kernel_[un]map_linear_page() If the page is already mapped resp. already unmapped, bail out. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-3-nicholas@linux.ibm.com
|
#
3e791d0f |
|
26-Sep-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc/64s: Remove unneeded #ifdef CONFIG_DEBUG_PAGEALLOC in hash_utils debug_pagealloc_enabled() is always defined and constant folds to 'false' when CONFIG_DEBUG_PAGEALLOC is not enabled. Remove the #ifdefs, the code and associated static variables will be optimised out by the compiler when CONFIG_DEBUG_PAGEALLOC is not defined. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-2-nicholas@linux.ibm.com
|
#
73ea68ad |
|
05-Sep-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc/book3s: Inline first level of update_mmu_cache() update_mmu_cache() voids when hash page tables are not used. On PPC32 that means when MMU_FTR_HPTE_TABLE is not defined. On PPC64 that means when RADIX is enabled. Rename core part of update_mmu_cache() as __update_mmu_cache() and include the initial verification in an inlined caller. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bea5ad0de7f83eff256116816d46c84fa0a444de.1662370698.git.christophe.leroy@csgroup.eu
|
#
2b461880 |
|
18-Jul-2022 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc: Fix all occurences of duplicate words Since commit 87c78b612f4f ("powerpc: Fix all occurences of "the the"") fixed "the the", there's now a steady stream of patches fixing other duplicate words. Just fix them all at once, to save the overhead of dealing with individual patches for each case. This leaves a few cases of "that that", which in some contexts is correct. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220718095158.326606-1-mpe@ellerman.id.au
|
#
e6f6390a |
|
08-Mar-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc: Add missing headers Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h, asm/pci.h etc... Include the needed headers, and remove asm/prom.h when it was needed exclusively for pulling necessary headers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
|
#
1fd02f66 |
|
30-Apr-2022 |
Julia Lawall <Julia.Lawall@inria.fr> |
powerpc: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr
|
#
3ba4289a |
|
09-Apr-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc: Simplify and move arch_randomize_brk() arch_randomize_brk() is only needed for hash on book3s/64, for other platforms the one provided by the default mmap layout is good enough. Move it to hash_utils.c and use randomize_page() like the generic one. And properly opt out the radix case instead of making an assumption on mmu_highuser_ssize. Also change to a 32M range like most other architectures instead of 8M. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eafa4d18ec8ac7b98dd02b40181e61643707cc7c.1649523076.git.christophe.leroy@csgroup.eu
|
#
f693d38d |
|
09-Apr-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc/mm: Remove CONFIG_PPC_MM_SLICES CONFIG_PPC_MM_SLICES is always selected by hash book3s/64. CONFIG_PPC_MM_SLICES is never selected by other platforms. Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/dc2cdc204de8978574bf7c02329b6cfc4db0bce7.1649523076.git.christophe.leroy@csgroup.eu
|
#
8b91cee5 |
|
03-Feb-2022 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s/hash: Make hash faults work in NMI context Hash faults are not resoved in NMI context, instead causing the access to fail. This is done because perf interrupts can get backtraces including walking the user stack, and taking a hash fault on those could deadlock on the HPTE lock if the perf interrupt hits while the same HPTE lock is being held by the hash fault code. The user-access for the stack walking will notice the access failed and deal with that in the perf code. The reason to allow perf interrupts in is to better profile hash faults. The problem with this is any hash fault on a kernel access that happens in NMI context will crash, because kernel accesses must not fail. Hard lockups, system reset, machine checks that access vmalloc space including modules and including stack backtracing and symbol lookup in modules, per-cpu data, etc could all run into this problem. Fix this by disallowing perf interrupts in the hash fault code (the direct hash fault is covered by MSR[EE]=0 so the PMI disable just needs to extend to the preload case). This simplifies the tricky logic in hash faults and perf, at the cost of reduced profiling of hash faults. perf can still latch addresses when interrupts are disabled, it just won't get the stack trace at that point, so it would still find hot spots, just sometimes with confusing stack chains. An alternative could be to allow perf interrupts here but always do the slowpath stack walk if we are in nmi context, but that slows down all perf interrupt stack walking on hash though and it does not remove as much tricky code. Reported-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220204035348.545435-1-npiggin@gmail.com
|
#
c13f2b2b |
|
16-Dec-2021 |
Nick Child <nick.child@ibm.com> |
powerpc/mm: Add __init attribute to eligible functions Some functions defined in 'arch/powerpc/mm' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-4-nick.child@ibm.com
|
#
bdad5d57 |
|
01-Dec-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: move page size definitions from hash specific file The radix code uses some of the psize variables. Move the common ones from hash_utils.c to pgtable.c. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-10-npiggin@gmail.com
|
#
a4135cbe |
|
01-Dec-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/pseries: Stop selecting PPC_HASH_MMU_NATIVE The pseries platform does not use the native hash code but the PAPR virtualised hash interfaces, so remove PPC_HASH_MMU_NATIVE. This requires moving tlbiel code from hash_native.c to hash_utils.c. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-4-npiggin@gmail.com
|
#
7ebc4903 |
|
01-Dec-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc: Rename PPC_NATIVE to PPC_HASH_MMU_NATIVE PPC_NATIVE now only controls the native HPT code, so rename it to be more descriptive. Restrict it to Book3S only. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-3-npiggin@gmail.com
|
#
4f703e7f |
|
13-Oct-2021 |
Joel Stanley <joel@jms.id.au> |
powerpc/s64: Clarify that radix lacks DEBUG_PAGEALLOC The page_alloc.c code will call into __kernel_map_pages() when DEBUG_PAGEALLOC is configured and enabled. As the implementation assumes hash, this should crash spectacularly if not for a bit of luck in __kernel_map_pages(). In this function linear_map_hash_count is always zero, the for loop exits without doing any damage. There are no other platforms that determine if they support debug_pagealloc at runtime. Instead of adding code to mm/page_alloc.c to do that, this change turns the map/unmap into a noop when in radix mode and prints a warning once. Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Reformat if per Christophe's suggestion] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211013213438.675095-1-joel@jms.id.au
|
#
dbf77fed |
|
12-Aug-2021 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc: rename powerpc_debugfs_root to arch_debugfs_dir No functional change in this patch. arch_debugfs_dir is the generic kernel name declared in linux/debugfs.h for arch-specific debugfs directory. Architectures like x86/s390 already use the name. Rename powerpc specific powerpc_debugfs_root to arch_debugfs_dir. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210812132831.233794-2-aneesh.kumar@linux.ibm.com
|
#
5567b1ee |
|
30-Jun-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: fix hash page fault interrupt handler The early bad fault or key fault test in do_hash_fault() ends up calling into ___do_page_fault without having gone through an interrupt handler wrapper (except the initial _RAW one). This can end up calling local irq functions while the interrupt has not been reconciled, which will likely cause crashes and it trips up on a later patch that adds more assertions. pkey_exec_prot from selftests causes this path to be executed. There is no real reason to run the in_nmi() test should be performed before the key fault check. In fact if a perf interrupt in the hash fault code did a stack walk that was made to take a key fault somehow then running ___do_page_fault could possibly cause another hash fault causing problems. Move the in_nmi() test first, and then do everything else inside the regular interrupt handler function. Fixes: 3a96570ffceb ("powerpc: convert interrupt handlers to use wrappers") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210630074621.2109197-2-npiggin@gmail.com
|
#
7153d4bf |
|
14-Apr-2021 |
Xiongwei Song <sxwjean@gmail.com> |
powerpc/traps: Enhance readability for trap types Define macros to list ppc interrupt types in interttupt.h, replace the reference of the trap hex values with these macros. Referred the hex numbers in arch/powerpc/kernel/exceptions-64e.S, arch/powerpc/kernel/exceptions-64s.S, arch/powerpc/kernel/head_*.S, arch/powerpc/kernel/head_booke.h and arch/powerpc/include/asm/kvm_asm.h. Signed-off-by: Xiongwei Song <sxwjean@gmail.com> [mpe: Resolve conflicts in nmi_disables_ftrace(), fix 40x build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1618398033-13025-1-git-send-email-sxwjean@me.com
|
#
c45ba4f4 |
|
16-Mar-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc: clean up do_page_fault search_exception_tables + __bad_page_fault can be substituted with bad_page_fault, do_page_fault no longer needs to return a value to asm for any sub-architecture, and __bad_page_fault can be static. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316104206.407354-10-npiggin@gmail.com
|
#
a5d6a3e7 |
|
04-Apr-2021 |
Vaibhav Jain <vaibhav@linux.ibm.com> |
powerpc/mm: Add cond_resched() while removing hpte mappings While removing large number of mappings from hash page tables for large memory systems as soft-lockup is reported because of the time spent inside htap_remove_mapping() like one below: watchdog: BUG: soft lockup - CPU#8 stuck for 23s! <snip> NIP plpar_hcall+0x38/0x58 LR pSeries_lpar_hpte_invalidate+0x68/0xb0 Call Trace: 0x1fffffffffff000 (unreliable) pSeries_lpar_hpte_removebolted+0x9c/0x230 hash__remove_section_mapping+0xec/0x1c0 remove_section_mapping+0x28/0x3c arch_remove_memory+0xfc/0x150 devm_memremap_pages_release+0x180/0x2f0 devm_action_release+0x30/0x50 release_nodes+0x28c/0x300 device_release_driver_internal+0x16c/0x280 unbind_store+0x124/0x170 drv_attr_store+0x44/0x60 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 __vfs_write+0x3c/0x70 vfs_write+0xd4/0x270 ksys_write+0xdc/0x130 system_call+0x5c/0x70 Fix this by adding a cond_resched() to the loop in htap_remove_mapping() that issues hcall to remove hpte mapping. The call to cond_resched() is issued every HZ jiffies which should prevent the soft-lockup from being reported. Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210404163148.321346-1-vaibhav@linux.ibm.com
|
#
1479e3d3 |
|
16-Mar-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: Fix hash fault to use TRAP accessor Hash faults use the trap vector to decide whether this is an instruction or data fault. This should use the TRAP accessor rather than open access regs->trap. This won't cause a problem at the moment because 64s only uses trap flags for system call interrupts (the norestart flag), but that could change if any other trap flags get used in future. Fixes: a4922f5442e7e ("powerpc/64s: move the hash fault handling logic to C") Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316105205.407767-1-npiggin@gmail.com
|
#
ec94b9b2 |
|
02-Feb-2021 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/mm: Add PG_dcache_clean to indicate dcache clean state This just add a better name for PG_arch_1. No functional change in this patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210203045812.234439-2-aneesh.kumar@linux.ibm.com
|
#
540d4d34 |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64: context tracking move to interrupt wrappers This moves exception_enter/exit calls to wrapper functions for synchronous interrupts. More interrupt handlers are covered by this than previously. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-33-npiggin@gmail.com
|
#
a008f8f9 |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s/hash: improve context tracking of hash faults This moves the 64s/hash context tracking from hash_page_mm() to __do_hash_fault(), so it's no longer called by OCXL / SPU accelerators, which was certainly the wrong thing to be doing, because those callers are not low level interrupt handlers, so should have entered a kernel context tracking already. Then remain in kernel context for the duration of the fault, rather than enter/exit for the hash fault then enter/exit for the page fault, which is pointless. Even still, calling exception_enter/exit in __do_hash_fault seems questionable because that's touching per-cpu variables, tracing, etc., which might have been interrupted by this hash fault or themselves cause hash faults. But maybe I miss something because hash_page_mm very deliberately calls trace_hash_fault too, for example. So for now go with it, it's no worse than before, in this regard. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-32-npiggin@gmail.com
|
#
3a96570f |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc: convert interrupt handlers to use wrappers Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-29-npiggin@gmail.com
|
#
bf0e2374 |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: split do_hash_fault This is required for subsequent interrupt wrapper implementation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-16-npiggin@gmail.com
|
#
8458c628 |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc: bad_page_fault get registers from regs Similar to the previous patch this makes interrupt handler function types more regular so they can be wrapped with the next patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-12-npiggin@gmail.com
|
#
a01a3f2d |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc: remove arguments from fault handler functions Make mm fault handlers all just take the pt_regs * argument and load DAR/DSISR from that. Make those that return a value return long. This is done to make the function signatures match other handlers, which will help with a future patch to add wrappers. Explicit arguments could be added for performance but that would require more wrapper macro variants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-7-npiggin@gmail.com
|
#
a4922f54 |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: move the hash fault handling logic to C The fault handling still has some complex logic particularly around hash table handling, in asm. Implement most of this in C. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-6-npiggin@gmail.com
|
#
3ba150fb |
|
30-Nov-2020 |
Ganesh Goudar <ganeshgr@linux.ibm.com> |
lkdtm/powerpc: Add SLB multihit test To check machine check handling, add support to inject slb multihit errors. Co-developed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> [mpe: Use CONFIG_PPC_BOOK3S_64 to fix compile errors reported by lkp@intel.com] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201130083057.135610-1-ganeshgr@linux.ibm.com
|
#
d94b827e |
|
26-Nov-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation This patch updates kernel hash page table entries to use storage key 3 for its mapping. This implies all kernel access will now use key 3 to control READ/WRITE. The patch also prevents the allocation of key 3 from userspace and UAMOR value is updated such that userspace cannot modify key 3. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201127044424.40686-9-aneesh.kumar@linux.ibm.com
|
#
d8bd9a12 |
|
11-Nov-2020 |
David Hildenbrand <david@redhat.com> |
powerpc/book3s64/hash: Drop WARN_ON in hash__remove_section_mapping() The single caller (arch_remove_linear_mapping()) prints a proper warning when this function fails. No need to eventually crash the kernel - let's drop this WARN_ON. Suggested-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201111145322.15793-7-david@redhat.com
|
#
b10d6bca |
|
13-Oct-2020 |
Mike Rapoport <rppt@kernel.org> |
arch, drivers: replace for_each_membock() with for_each_mem_range() There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start = __pfn_to_phys(memblock_region_memory_base_pfn(reg); end = __pfn_to_phys(memblock_region_memory_end_pfn(reg)); /* do something with start and end */ } Using for_each_mem_range() iterator is more appropriate in such cases and allows simpler and cleaner code. [akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build] [rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring] Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
79b123cd |
|
06-Sep-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerepc/book3s64/hash: Align start/end address correctly with bolt mapping This ensures we don't do a partial mapping of memory. With nvdimm, when creating namespaces with size not aligned to 16MB, the kernel ends up partially mapping the pages. This can result in kernel adding multiple hash page table entries for the same range. A new namespace will result in create_section_mapping() with start and end overlapping an already existing bolted hash page table entry. commit: 6acd7d5ef264 ("libnvdimm/namespace: Enforce memremap_compat_align()") made sure that we always create namespaces aligned to 16MB. But we can do better by avoiding mapping pages that are not aligned. This helps to catch access to these partially mapped pages early. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200907072539.67310-1-aneesh.kumar@linux.ibm.com
|
#
12564485 |
|
21-Aug-2020 |
Shawn Anastasio <shawn@anastas.io> |
Revert "powerpc/64s: Remove PROT_SAO support" This reverts commit 5c9fa16e8abd342ce04dc830c1ebb2a03abf6c05. Since PROT_SAO can still be useful for certain classes of software, reintroduce it. Concerns about guest migration for LPARs using SAO will be addressed next. Signed-off-by: Shawn Anastasio <shawn@anastas.io> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200821185558.35561-2-shawn@anastas.io
|
#
1e4e4bca |
|
17-Aug-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled IS_ENABLED() instead of #ifdef still requires variable declaration. In this specific case, default_uamor is declared in asm/pkeys.h which is only included if PPC_MEM_KEYS is enabled. arch/powerpc/mm/book3s64/hash_utils.c: In function ‘hash__early_init_mmu_secondary’: arch/powerpc/mm/book3s64/hash_utils.c:1119:21: error: ‘default_uamor’ undeclared (first use in this function) 1119 | mtspr(SPRN_UAMOR, default_uamor); | ^~~~~~~~~~~~~ Fixes: 6553fb799f60 ("powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000)") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200817103301.158836-1-aneesh.kumar@linux.ibm.com
|
#
6553fb79 |
|
10-Aug-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000) On p6 and before we should avoid updating UAMOR SPRN. This resulted in boot failure on Nemo board. Fixes: 269e829f48a0 ("powerpc/book3s64/pkey: Disable pkey on POWER6 and before") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200810102623.685083-1-aneesh.kumar@linux.ibm.com
|
#
55548a86 |
|
27-Jul-2020 |
Bharata B Rao <bharata@linux.ibm.com> |
powerpc/mm: Limit resize_hpt_for_hotplug() call to hash guests only During memory hotplug and unplug, resize_hpt_for_hotplug() gets called for both hash and radix guests but it should be called only for hash guests. Though the call does nothing in the radix guest case, it is cleaner to push this call into hash specific memory hotplug routines. Reported-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200727095704.1432916-1-bharata@linux.ibm.com
|
#
909adfc6 |
|
27-Jul-2020 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s/hash: Fix hash_preload running with interrupts enabled Commit 2f92447f9f96 ("powerpc/book3s64/hash: Use the pte_t address from the caller") removed the local_irq_disable from hash_preload, but it was required for more than just the page table walk: the hash pte busy bit is effectively a lock which may be taken in interrupt context, and the local update flag test must not be preempted before it's used. This solves apparent lockups with perf interrupting __hash_page_64K. If get_perf_callchain then also takes a hash fault on the same page while it is already locked, it will loop forever taking hash faults, which looks like this: cpu 0x49e: Vector: 100 (System Reset) at [c00000001a4f7d70] pc: c000000000072dc8: hash_page_mm+0x8/0x800 lr: c00000000000c5a4: do_hash_page+0x24/0x38 sp: c0002ac1cc69ac70 msr: 8000000000081033 current = 0xc0002ac1cc602e00 paca = 0xc00000001de1f280 irqmask: 0x03 irq_happened: 0x01 pid = 20118, comm = pread2_processe Linux version 5.8.0-rc6-00345-g1fad14f18bc6 49e:mon> t [c0002ac1cc69ac70] c00000000000c5a4 do_hash_page+0x24/0x38 (unreliable) --- Exception: 300 (Data Access) at c00000000008fa60 __copy_tofrom_user_power7+0x20c/0x7ac [link register ] c000000000335d10 copy_from_user_nofault+0xf0/0x150 [c0002ac1cc69af70] c00032bf9fa3c880 (unreliable) [c0002ac1cc69afa0] c000000000109df0 read_user_stack_64+0x70/0xf0 [c0002ac1cc69afd0] c000000000109fcc perf_callchain_user_64+0x15c/0x410 [c0002ac1cc69b060] c000000000109c00 perf_callchain_user+0x20/0x40 [c0002ac1cc69b080] c00000000031c6cc get_perf_callchain+0x25c/0x360 [c0002ac1cc69b120] c000000000316b50 perf_callchain+0x70/0xa0 [c0002ac1cc69b140] c000000000316ddc perf_prepare_sample+0x25c/0x790 [c0002ac1cc69b1a0] c000000000317350 perf_event_output_forward+0x40/0xb0 [c0002ac1cc69b220] c000000000306138 __perf_event_overflow+0x88/0x1a0 [c0002ac1cc69b270] c00000000010cf70 record_and_restart+0x230/0x750 [c0002ac1cc69b620] c00000000010d69c perf_event_interrupt+0x20c/0x510 [c0002ac1cc69b730] c000000000027d9c performance_monitor_exception+0x4c/0x60 [c0002ac1cc69b750] c00000000000b2f8 performance_monitor_common_virt+0x1b8/0x1c0 --- Exception: f00 (Performance Monitor) at c0000000000cb5b0 pSeries_lpar_hpte_insert+0x0/0x160 [link register ] c0000000000846f0 __hash_page_64K+0x210/0x540 [c0002ac1cc69ba50] 0000000000000000 (unreliable) [c0002ac1cc69bb00] c000000000073ae0 update_mmu_cache+0x390/0x3a0 [c0002ac1cc69bb70] c00000000037f024 wp_page_copy+0x364/0xce0 [c0002ac1cc69bc20] c00000000038272c do_wp_page+0xdc/0xa60 [c0002ac1cc69bc70] c0000000003857bc handle_mm_fault+0xb9c/0x1b60 [c0002ac1cc69bd50] c00000000006c434 __do_page_fault+0x314/0xc90 [c0002ac1cc69be20] c00000000000c5c8 handle_page_fault+0x10/0x2c --- Exception: 300 (Data Access) at 00007fff8c861fe8 SP (7ffff6b19660) is in userspace Fixes: 2f92447f9f96 ("powerpc/book3s64/hash: Use the pte_t address from the caller") Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reported-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200727060947.10060-1-npiggin@gmail.com
|
#
69507b98 |
|
21-Jul-2020 |
Santosh Sivaraj <santosh@fossix.org> |
powerpc/mm/hash64: Remove comment that is no longer valid hash_low_64.S was removed in commit a43c0eb8364c ("powerpc/mm: Convert 4k insert from asm to C") and flush_hash_page() is no longer called from any assembly routine. Signed-off-by: Santosh Sivaraj <santosh@fossix.org> [mpe: Tweak comment wording] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721091915.205006-1-santosh@fossix.org
|
#
5c9fa16e |
|
02-Jul-2020 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: Remove PROT_SAO support ISA v3.1 does not support the SAO storage control attribute required to implement PROT_SAO. PROT_SAO was used by specialised system software (Lx86) that has been discontinued for about 7 years, and is not thought to be used elsewhere, so removal should not cause problems. We rather remove it than keep support for older processors, because live migrating guest partitions to newer processors may not be possible if SAO is in use (or worse allowed with silent races). - PROT_SAO stays in the uapi header so code using it would still build. - arch_validate_prot() is removed, the generic version rejects PROT_SAO so applications would get a failure at mmap() time. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Drop KVM change for the time being] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200703011958.1166620-3-npiggin@gmail.com
|
#
e0d8e991 |
|
08-Jul-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/book3s64/kuap: Move UAMOR setup to key init function UAMOR values are not application-specific. The kernel initializes its value based on different reserved keys. Remove the thread-specific UAMOR value and don't switch the UAMOR on context switch. Move UAMOR initialization to key initialization code and remove thread_struct.uamor because it is not used anymore. Before commit: 4a4a5e5d2aad ("powerpc/pkeys: key allocation/deallocation must not change pkey registers") we used to update uamor based on key allocation and free. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200709032946.881753-20-aneesh.kumar@linux.ibm.com
|
#
86590e52 |
|
21-Jun-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/mm/book3s64: Skip 16G page reservation with radix With hash translation, the hypervisor can hint the LPAR about 16GB contiguous range via ibm,expected#pages. The kernel marks the range specified in the device tree as reserved. Avoid doing this when using radix translation. Radix translation only supports 1G gigantic hugepage and kernel can do the 1G gigantic hugepage allocation via early memblock reservation. This can be done because with radix translation pages are not required to be contiguous on the host. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200622064019.16682-1-aneesh.kumar@linux.ibm.com
|
#
55bd9ac4 |
|
05-Jun-2020 |
Joe Perches <joe@perches.com> |
powerpc/mm: Fix typo in IS_ENABLED() IS_ENABLED() matches names exactly, so the missing "CONFIG_" prefix means this code would never be built. Also fixes a missing newline in pr_warn(). Fixes: 970d54f99cea ("powerpc/book3s64/hash: Disable 16M linear mapping size if not aligned") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/202006050717.A2F9809E@keescook
|
#
65fddcfc |
|
08-Jun-2020 |
Mike Rapoport <rppt@kernel.org> |
mm: reorder includes after introduction of linux/pgtable.h The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
ca5999fd |
|
08-Jun-2020 |
Mike Rapoport <rppt@kernel.org> |
mm: introduce include/linux/pgtable.h The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
18594f9b |
|
04-May-2020 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s/radix: Don't prefetch DAR in update_mmu_cache The idea behind this prefetch was to kick off a page table walk before returning from the fault, getting some pipelining advantage. But this never showed up any noticable performance advantage, and in fact with KUAP the prefetches are actually blocked and cause some kind of micro-architectural fault. Removing this improves page fault microbenchmark performance by about 9%. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Keep the early return in update_mmu_cache()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200504122907.49304-1-npiggin@gmail.com
|
#
82a1b8ed |
|
11-May-2020 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s/hash: Add stress_slb kernel boot option to increase SLB faults This option increases the number of SLB misses by limiting the number of kernel SLB entries, and increased flushing of cached lookaside information. This helps stress test difficult to hit paths in the kernel. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Relocate the code into arch/powerpc/mm, s/torture/stress/] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200511125825.3081305-1-mpe@ellerman.id.au
|
#
2f92447f |
|
04-May-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/book3s64/hash: Use the pte_t address from the caller Don't fetch the pte value using lockless page table walk. Instead use the value from the caller. hash_preload is called with ptl lock held. So it is safe to use the pte_t address directly. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-6-aneesh.kumar@linux.ibm.com
|
#
ec4abf1e |
|
04-May-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/mm/hash64: use _PAGE_PTE when checking for pte_present This makes the pte_present check stricter by checking for additional _PAGE_PTE bit. A level 1 pte pointer (THP pte) can be switched to a pointer to level 0 pte page table page by following two operations. 1) THP split. 2) madvise(MADV_DONTNEED) in parallel to page fault. A lockless page table walk need to make sure we can handle such changes gracefully. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-4-aneesh.kumar@linux.ibm.com
|
#
fe4a6856 |
|
04-May-2020 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/pkeys: Avoid using lockless page table walk Fetch pkey from vma instead of linux page table. Also document the fact that in some cases the pkey returned in siginfo won't be the same as the one we took keyfault on. Even with linux page table walk, we can end up in a similar scenario. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-2-aneesh.kumar@linux.ibm.com
|
#
4e00c5af |
|
10-Apr-2020 |
Logan Gunthorpe <logang@deltatee.com> |
powerpc/mm: thread pgprot_t through create_section_mapping() In prepartion to support a pgprot_t argument for arch_add_memory(). Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-6-logang@deltatee.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
08f6a797 |
|
09-Feb-2020 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
powerpc/mm: book3s64: hash_utils: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-3-gregkh@linuxfoundation.org
|
#
970d54f9 |
|
23-Dec-2019 |
Russell Currey <ruscur@russell.cc> |
powerpc/book3s64/hash: Disable 16M linear mapping size if not aligned With STRICT_KERNEL_RWX on in a relocatable kernel under the hash MMU, if the position the kernel is loaded at is not 16M aligned things go horribly wrong. Specifically hash__mark_initmem_nx() will call hash__change_memory_range() which then aligns down the start address, and due to the text not being 16M aligned causes some of the kernel text to be marked non-executable. We can avoid this when selecting the linear mapping size, so do so and print a warning. I tested this for various alignments and as long as the position is 64K aligned it's fine (the base requirement for powerpc). Signed-off-by: Russell Currey <ruscur@russell.cc> [mpe: Add details of the failure mode] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191224064126.183670-1-ruscur@russell.cc
|
#
16f6b67c |
|
01-Oct-2019 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/book3s64/hash: Add cond_resched to avoid soft lockup warning With large memory (8TB and more) hotplug, we can get soft lockup warnings as below. These were caused by a long loop without any explicit cond_resched which is a problem for !PREEMPT kernels. Avoid this using cond_resched() while inserting hash page table entries. We already do similar cond_resched() in __add_pages(), see commit f64ac5e6e306 ("mm, memory_hotplug: add scheduling point to __add_pages"). rcu: 3-....: (24002 ticks this GP) idle=13e/1/0x4000000000000002 softirq=722/722 fqs=12001 (t=24003 jiffies g=4285 q=2002) NMI backtrace for cpu 3 CPU: 3 PID: 3870 Comm: ndctl Not tainted 5.3.0-197.18-default+ #2 Call Trace: dump_stack+0xb0/0xf4 (unreliable) nmi_cpu_backtrace+0x124/0x130 nmi_trigger_cpumask_backtrace+0x1ac/0x1f0 arch_trigger_cpumask_backtrace+0x28/0x3c rcu_dump_cpu_stacks+0xf8/0x154 rcu_sched_clock_irq+0x878/0xb40 update_process_times+0x48/0x90 tick_sched_handle.isra.16+0x4c/0x80 tick_sched_timer+0x68/0xe0 __hrtimer_run_queues+0x180/0x430 hrtimer_interrupt+0x110/0x300 timer_interrupt+0x108/0x2f0 decrementer_common+0x114/0x120 --- interrupt: 901 at arch_add_memory+0xc0/0x130 LR = arch_add_memory+0x74/0x130 memremap_pages+0x494/0x650 devm_memremap_pages+0x3c/0xa0 pmem_attach_disk+0x188/0x750 nvdimm_bus_probe+0xac/0x2c0 really_probe+0x148/0x570 driver_probe_device+0x19c/0x1d0 device_driver_attach+0xcc/0x100 bind_store+0x134/0x1c0 drv_attr_store+0x44/0x60 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1a0/0x270 __vfs_write+0x3c/0x70 vfs_write+0xd0/0x260 ksys_write+0xdc/0x130 system_call+0x5c/0x68 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191001084656.31277-1-aneesh.kumar@linux.ibm.com
|
#
d78d5dac |
|
24-Oct-2019 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/book3s64/hash: Use secondary hash for bolted mapping if the primary is full With bolted hash page table entry, kernel currently only use primary hash group when inserting the hash page table entry. In the rare case where kernel find all the 8 primary hash slot occupied by bolted entries, this can result in hash page table insert failure for bolted entries. Avoid this by using the secondary hash group. This is different from what kernel does for the non-bolted mapping. With non-bolted entries kernel will try secondary before removing an existing entry from hash page table group. With bolted prefer primary hash group and hence try to insert the page table entry by removing a slot from primary before trying the secondary hash group. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191024093542.29777-3-aneesh.kumar@linux.ibm.com
|
#
75838a32 |
|
24-Oct-2019 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/pseries: Don't fail hash page table insert for bolted mapping If the hypervisor returned H_PTEG_FULL for H_ENTER hcall, retry a hash page table insert by removing a random entry from the group. After some runtime, it is very well possible to find all the 8 hash page table entry slot in the hpte group used for mapping. Don't fail a bolted entry insert in that case. With Storage class memory a user can find this error easily since a namespace enable/disable is equivalent to memory add/remove. This results in failures as reported below: $ ndctl create-namespace -r region1 -t pmem -m devdax -a 65536 -s 100M libndctl: ndctl_dax_enable: dax1.3: failed to enable Error: namespace1.2: failed to enable failed to create namespace: No such device or address In kernel log we find the details as below: Unable to create mapping for hot added memory 0xc000042006000000..0xc00004200d000000: -1 dax_pmem: probe of dax1.3 failed with error -14 This indicates that we failed to create a bolted hash table entry for direct-map address backing the namespace. We also observe failures such that not all namespaces will be enabled with ndctl enable-namespace all command. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191024093542.29777-2-aneesh.kumar@linux.ibm.com
|
#
9ef258ba |
|
23-Sep-2019 |
Kefeng Wang <wangkefeng.wang@huawei.com> |
thp: update split_huge_page_pmd() comment According to 78ddc5347341 ("thp: rename split_huge_page_pmd() to split_huge_pmd()"), update related comment. Link: http://lkml.kernel.org/r/20190731033406.185285-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
dac39f78 |
|
11-Sep-2019 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/64s: Remove overlaps_kvm_tmp() kvm_tmp is now in .text and so doesn't need a special overlap check. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190911115746.12433-2-mpe@ellerman.id.au
|
#
7d805acc |
|
02-Sep-2019 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: remove unnecessary translation cache flushes at boot The various translation structure invalidations performed in early boot when the MMU is off are not required, because everything is invalidated immediately before a CPU first enables its MMU (see early_init_mmu and early_init_mmu_secondary). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190902152931.17840-6-npiggin@gmail.com
|
#
fd13daea |
|
02-Sep-2019 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: make mmu_partition_table_set_entry TLB flush optional No functional change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190902152931.17840-4-npiggin@gmail.com
|
#
ed6546bd |
|
02-Sep-2019 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: remove register_process_table callback This callback is only required because the partition table init comes before process table allocation on powernv (aka bare metal aka native). Change the order to allocate the process table first, and remove the callback. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190902152931.17840-2-npiggin@gmail.com
|
#
9b123d1e |
|
02-Aug-2019 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s/exception: reduce page fault unnecessary loads This avoids 3 loads in the radix page fault case, 1 load in the hash fault case, and 2 loads in the hash miss page fault case. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802105709.27696-37-npiggin@gmail.com
|
#
52231340 |
|
21-Aug-2019 |
Claudio Carvalho <cclaudio@linux.ibm.com> |
powerpc/mm: Write to PTCR only if ultravisor disabled In ultravisor enabled systems, PTCR becomes ultravisor privileged only for writing and an attempt to write to it will cause a Hypervisor Emulation Assitance interrupt. This patch uses the set_ptcr_when_no_uv() function to restrict PTCR writing to only when ultravisor is disabled. Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190822034838.27876-6-cclaudio@linux.ibm.com
|
#
e5a1edb9 |
|
15-Aug-2019 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc/mm: move update_mmu_cache() into book3s hash utils. update_mmu_cache() is only for BOOK3S, and can be simplified for BOOK3S32. Move it out of mem.c into respective BOOK3S32 and BOOK3S64 files containing hash utils. BOOK3S64 version of hash_preload() is only used locally, declare it static. Remove the radix_enabled() stuff in BOOK3S32 version. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/107aaf43583a5f5d09e0d4e84c4c4390ecfcd512.1565933217.git.christophe.leroy@c-s.fr
|
#
c784be43 |
|
15-May-2019 |
Gautham R. Shenoy <ego@linux.vnet.ibm.com> |
powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt() The calls to arch_add_memory()/arch_remove_memory() are always made with the read-side cpu_hotplug_lock acquired via memory_hotplug_begin(). On pSeries, arch_add_memory()/arch_remove_memory() eventually call resize_hpt() which in turn calls stop_machine() which acquires the read-side cpu_hotplug_lock again, thereby resulting in the recursive acquisition of this lock. In the absence of CONFIG_PROVE_LOCKING, we hadn't observed a system lockup during a memory hotplug operation because cpus_read_lock() is a per-cpu rwsem read, which, in the fast-path (in the absence of the writer, which in our case is a CPU-hotplug operation) simply increments the read_count on the semaphore. Thus a recursive read in the fast-path doesn't cause any problems. However, we can hit this problem in practice if there is a concurrent CPU-Hotplug operation in progress which is waiting to acquire the write-side of the lock. This will cause the second recursive read to block until the writer finishes. While the writer is blocked since the first read holds the lock. Thus both the reader as well as the writers fail to make any progress thereby blocking both CPU-Hotplug as well as Memory Hotplug operations. Memory-Hotplug CPU-Hotplug CPU 0 CPU 1 ------ ------ 1. down_read(cpu_hotplug_lock.rw_sem) [memory_hotplug_begin] 2. down_write(cpu_hotplug_lock.rw_sem) [cpu_up/cpu_down] 3. down_read(cpu_hotplug_lock.rw_sem) [stop_machine()] Lockdep complains as follows in these code-paths. swapper/0/1 is trying to acquire lock: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: stop_machine+0x2c/0x60 but task is already holding lock: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(cpu_hotplug_lock.rw_sem); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by swapper/0/1: #0: (____ptrval____) (&dev->mutex){....}, at: __driver_attach+0x12c/0x1b0 #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50 #2: (____ptrval____) (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x54/0x1a0 stack backtrace: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc5-58373-gbc99402235f3-dirty #166 Call Trace: dump_stack+0xe8/0x164 (unreliable) __lock_acquire+0x1110/0x1c70 lock_acquire+0x240/0x290 cpus_read_lock+0x64/0xf0 stop_machine+0x2c/0x60 pseries_lpar_resize_hpt+0x19c/0x2c0 resize_hpt_for_hotplug+0x70/0xd0 arch_add_memory+0x58/0xfc devm_memremap_pages+0x5e8/0x8f0 pmem_attach_disk+0x764/0x830 nvdimm_bus_probe+0x118/0x240 really_probe+0x230/0x4b0 driver_probe_device+0x16c/0x1e0 __driver_attach+0x148/0x1b0 bus_for_each_dev+0x90/0x130 driver_attach+0x34/0x50 bus_add_driver+0x1a8/0x360 driver_register+0x108/0x170 __nd_driver_register+0xd0/0xf0 nd_pmem_driver_init+0x34/0x48 do_one_initcall+0x1e0/0x45c kernel_init_freeable+0x540/0x64c kernel_init+0x2c/0x160 ret_from_kernel_thread+0x5c/0x68 Fix this issue by 1) Requiring all the calls to pseries_lpar_resize_hpt() be made with cpu_hotplug_lock held. 2) In pseries_lpar_resize_hpt() invoke stop_machine_cpuslocked() as a consequence of 1) 3) To satisfy 1), in hpt_order_set(), call mmu_hash_ops.resize_hpt() with cpu_hotplug_lock held. Fixes: dbcf929c0062 ("powerpc/pseries: Add support for hash table resizing") Cc: stable@vger.kernel.org # v4.11+ Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1557906352-29048-1-git-send-email-ego@linux.vnet.ibm.com
|
#
2b87a255 |
|
15-May-2019 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: Make boot look nice(r) Radix boot looks like this: ----------------------------------------------------- phys_mem_size = 0x200000000 dcache_bsize = 0x80 icache_bsize = 0x80 cpu_features = 0x0000c06f8f5fb1a7 possible = 0x0000fbffcf5fb1a7 always = 0x00000003800081a1 cpu_user_features = 0xdc0065c2 0xaee00000 mmu_features = 0xbc006041 firmware_features = 0x0000000010000000 hash-mmu: ppc64_pft_size = 0x0 hash-mmu: kernel vmalloc start = 0xc008000000000000 hash-mmu: kernel IO start = 0xc00a000000000000 hash-mmu: kernel vmemmap start = 0xc00c000000000000 ----------------------------------------------------- Fix: ----------------------------------------------------- phys_mem_size = 0x200000000 dcache_bsize = 0x80 icache_bsize = 0x80 cpu_features = 0x0000c06f8f5fb1a7 possible = 0x0000fbffcf5fb1a7 always = 0x00000003800081a1 cpu_user_features = 0xdc0065c2 0xaee00000 mmu_features = 0xbc006041 firmware_features = 0x0000000010000000 vmalloc start = 0xc008000000000000 IO start = 0xc00a000000000000 vmemmap start = 0xc00c000000000000 ----------------------------------------------------- Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190516020437.11783-1-npiggin@gmail.com
|
#
da0ef933 |
|
09-Jul-2019 |
Suraj Jitindar Singh <sjitindarsingh@gmail.com> |
powerpc/mm: Limit rma_size to 1TB when running without HV mode The virtual real mode addressing (VRMA) mechanism is used when a partition is using HPT (Hash Page Table) translation and performs real mode accesses (MSR[IR|DR] = 0) in non-hypervisor mode. In this mode effective address bits 0:23 are treated as zero (i.e. the access is aliased to 0) and the access is performed using an implicit 1TB SLB entry. The size of the RMA (Real Memory Area) is communicated to the guest as the size of the first memory region in the device tree. And because of the mechanism described above can be expected to not exceed 1TB. In the event that the host erroneously represents the RMA as being larger than 1TB, guest accesses in real mode to memory addresses above 1TB will be aliased down to below 1TB. This means that a memory access performed in real mode may differ to one performed in virtual mode for the same memory address, which would likely have unintended consequences. To avoid this outcome have the guest explicitly limit the size of the RMA to the current maximum, which is 1TB. This means that even if the first memory block is larger than 1TB, only the first 1TB should be accessed in real mode. Fixes: c610d65c0ad0 ("powerpc/pseries: lift RTAS limit for hash") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190710052018.14628-1-sjitindarsingh@gmail.com
|
#
78c94988 |
|
01-Jul-2019 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc/mm/hash/4k: Don't use 64K page size for vmemmap with 4K pagesize With hash translation and 4K PAGE_SIZE config, we need to make sure we don't use 64K page size for vmemmap. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
2874c5fd |
|
27-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d667edc0 |
|
04-May-2019 |
YueHaibing <yuehaibing@huawei.com> |
powerpc/mm: Make some symbols static that can be Noticed by sparse. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
e4dccf90 |
|
26-Apr-2019 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc/mm: print hash info in a helper Reduce #ifdef mess by defining a helper to print hash info at startup. In the meantime, remove the display of hash table address to reduce leak of non necessary information. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
45d0ba52 |
|
25-Apr-2019 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc/mm: move hugetlb_disabled into asm/hugetlb.h No need to have this in asm/page.h, move it into asm/hugetlb.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
5953fb4f |
|
25-Apr-2019 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc/mm: define subarch SLB_ADDR_LIMIT_DEFAULT This patch defines a subarch specific SLB_ADDR_LIMIT_DEFAULT to remove the #ifdefs around the setup of mm->context.slb_addr_limit It also generalises the use of mm_ctx_set_slb_addr_limit() helper. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
47d99948 |
|
29-Mar-2019 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc/mm: Move book3s64 specifics in subdirectory mm/book3s64 Many files in arch/powerpc/mm are only for book3S64. This patch creates a subdirectory for them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Update the selftest sym links, shorten new filenames, cleanup some whitespace and formatting in the new files.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|