Cross Reference: /linux-master/arch/arm64/include/asm/pgtable-prot.h

History log of /linux-master/arch/arm64/include/asm/pgtable-prot.h
Revision	Date	Author	Comments
# db95ea78	14-Feb-2024	Ard Biesheuvel <ardb@kernel.org>	arm64: mm: Wire up TCR.DS bit to PTE shareability fields When LPA2 is enabled, bits 8 and 9 of page and block descriptors become part of the output address instead of carrying shareability attributes for the region in question. So avoid setting these bits if TCR.DS == 1, which means LPA2 is enabled. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240214122845.2033971-74-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# 60d043c1	14-Feb-2024	Ard Biesheuvel <ardb@kernel.org>	arm64: Avoid #define'ing PTE_MAYBE_NG to 0x0 for asm use The PROT_* macros resolve to expressions that are only valid in C and not in assembler, and so they are only usable from C code. Currently, we make an exception for the permission indirection init code in proc.S, which doesn't care about the bits that are conditionally set, and so we just #define PTE_MAYBE_NG to 0x0 for any assembler file that includes these definitions. This is dodgy because this means that PROT_NORMAL and friends is generally available in asm code, but defined in a way that deviates from the definition that C code will observe, which might lead to hard to diagnose issues down the road. So instead, #define PTE_MAYBE_NG only in the place where the PIE constants are evaluated, and #undef it again right after. This allows us to drop the #define from pgtable-prot.h, and avoid the risk of deviating definitions between asm and C. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240214122845.2033971-72-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# 936a4ec2	27-Nov-2023	Ryan Roberts <ryan.roberts@arm.com>	arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs Add stub functions which is initially always return false. These provide the hooks that we need to update the range-based TLBI routines, whose operands are encoded differently depending on whether lpa2 is enabled or not. The kernel and kvm will enable the use of lpa2 asynchronously in future, and part of that enablement will involve fleshing out their respective hook to advertise when it is using lpa2. Since the kernel's decision to use lpa2 relies on more than just whether the HW supports the feature, it can't just use the same static key as kvm. This is another reason to use separate functions. lpa2_is_enabled() is already implemented as part of Ard's kernel lpa2 series. Since kvm will make its decision solely based on HW support, kvm_lpa2_is_enabled() will be defined as system_supports_lpa2() once kvm starts using lpa2. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231127111737.1897081-3-ryan.roberts@arm.com
# bbbb6577	16-Oct-2023	Mark Rutland <mark.rutland@arm.com>	arm64: Avoid cpus_have_const_cap() for ARM64_HAS_BTI In system_supports_bti() we use cpus_have_const_cap() to check for ARM64_HAS_BTI, but this is not necessary and alternative_has_cap_() or cpus_have_final_cap() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_BTI_KERNEL=y, the ARM64_HAS_BTI cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu. All uses guarded by CONFIG_ARM64_BTI_KERNEL happen after the boot CPU has detected ARM64_HAS_BTI and patched boot alternatives, and hence can safely use alternative_has_cap_() or cpus_have_final_boot_cap(). Regardless of CONFIG_ARM64_BTI_KERNEL, all other uses of ARM64_HAS_BTI happen after system capabilities have been finalized and alternatives have been patched. Hence these can safely use alternative_has_cap_*) or cpus_have_final_cap(). This patch splits system_supports_bti() into system_supports_bti() and system_supports_bti_kernel(), with the former handling where the cpucap affects userspace functionality, and ther latter handling where the cpucap affects kernel functionality. The use of cpus_have_const_cap() is replaced by cpus_have_final_cap() in cpus_have_const_cap, and cpus_have_final_boot_cap() in system_supports_bti_kernel(). This will avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. The use of cpus_have_final_cap() and cpus_have_final_boot_cap() will make it easier to spot if code is chaanged such that these run before the ARM64_HAS_BTI cpucap is guaranteed to have been finalized. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# eeda243d	06-Jun-2023	Joey Gouly <joey.gouly@arm.com>	arm64: add encodings of PIRx_ELx registers The encodings used in the permission indirection registers means that the values that Linux puts in the PTEs do not need to be changed. The E0 values are replicated in E1, with the execute permissions removed. This is needed as the futex operations access user mappings with privileged loads/stores. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-16-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# fa4cdcca	06-Jun-2023	Joey Gouly <joey.gouly@arm.com>	arm64: reorganise PAGE_/PROT_ macros Make these macros available to assembly code, so they can be re-used by the PIE initialisation code. This involves adding some extra macros, prepended with _ that are the raw values not `pgprot` values. A dummy value for PTE_MAYBE_NG is also provided, for use in assembly. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-14-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# 7c302cfb	06-Jun-2023	Joey Gouly <joey.gouly@arm.com>	arm64: add PTE_WRITE to PROT_SECT_NORMAL With PIE enabled, PROT_SECT_NORMAL would map onto PAGE_KERNEL_RO. Add PTE_WRITE so that this maps onto PAGE_KERNEL, so that it is writable. Without PIE, this should enable DBM for PROT_SECT_NORMAL. However PTE_RDONLY is already cleared, so the DBM mechanism is not used, and it is always writable, so this is functionally equivalent. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-13-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# 42251045	10-Jul-2022	Anshuman Khandual <anshuman.khandual@arm.com>	arm64/mm: move protection_map[] inside the platform This moves protection_map[] inside the platform and makes it a static. Link: https://lkml.kernel.org/r/20220711070600.2378316-6-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
# 570ef363	09-May-2022	David Hildenbrand <david@redhat.com>	arm64/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's use one of the type bits: core-mm only supports 5, so there is no need to consume 6. Note that we might be able to reuse bit 1, but reusing bit 1 turned out problematic in the past for PROT_NONE handling; so let's play safe and use another bit. Link: https://lkml.kernel.org/r/20220329164329.208407-5-david@redhat.com Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
# 6e2edd63	03-Mar-2022	Catalin Marinas <catalin.marinas@arm.com>	arm64: Ensure execute-only permissions are not allowed without EPAN Commit 18107f8a2df6 ("arm64: Support execute-only permissions with Enhanced PAN") re-introduced execute-only permissions when EPAN is available. When EPAN is not available, arch_filter_pgprot() is supposed to change a PAGE_EXECONLY permission into PAGE_READONLY_EXEC. However, if BTI or MTE are present, such check does not detect the execute-only pgprot in the presence of PTE_GP (BTI) or MT_NORMAL_TAGGED (MTE), allowing the user to request PROT_EXEC with PROT_BTI or PROT_MTE. Remove the arch_filter_pgprot() function, change the default VM_EXEC permissions to PAGE_READONLY_EXEC and update the protection_map[] array at core_initcall() if EPAN is detected. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 18107f8a2df6 ("arm64: Support execute-only permissions with Enhanced PAN") Cc: <stable@vger.kernel.org> # 5.13.x Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
# 21cfe6ed	26-May-2021	Will Deacon <will@kernel.org>	arm64: mm: Remove unused support for Normal-WT memory type The Normal-WT memory type is unused, so remove it and reclaim a MAIR. Cc: Christoph Hellwig <hch@lst.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210527110319.22157-4-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
# bc224df1	19-Mar-2021	Quentin Perret <qperret@google.com>	KVM: arm64: Introduce KVM_PGTABLE_S2_NOFWB stage 2 flag In order to further configure stage 2 page-tables, pass flags to the init function using a new enum. The first of these flags allows to disable FWB even if the hardware supports it as we will need to do so for the host stage 2. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-33-qperret@google.com
# 18107f8a	12-Mar-2021	Vladimir Murzin <vladimir.murzin@arm.com>	arm64: Support execute-only permissions with Enhanced PAN Enhanced Privileged Access Never (EPAN) allows Privileged Access Never to be used with Execute-only mappings. Absence of such support was a reason for 24cecc377463 ("arm64: Revert support for execute-only user mappings"). Thus now it can be revisited and re-enabled. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210312173811.58284-2-vladimir.murzin@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# d15dfd31	08-Mar-2021	Catalin Marinas <catalin.marinas@arm.com>	arm64: mte: Map hotplugged memory as Normal Tagged In a system supporting MTE, the linear map must allow reading/writing allocation tags by setting the memory type as Normal Tagged. Currently, this is only handled for memory present at boot. Hotplugged memory uses Normal non-Tagged memory. Introduce pgprot_mhp() for hotplugged memory and use it in add_memory_resource(). The arm64 code maps pgprot_mhp() to pgprot_tagged(). Note that ZONE_DEVICE memory should not be mapped as Tagged and therefore setting the memory type in arch_add_memory() is not feasible. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 0178dc761368 ("arm64: mte: Use Normal Tagged attributes for the linear map") Reported-by: Patrick Daly <pdaly@codeaurora.org> Tested-by: Patrick Daly <pdaly@codeaurora.org> Link: https://lore.kernel.org/r/1614745263-27827-1-git-send-email-pdaly@codeaurora.org Cc: <stable@vger.kernel.org> # 5.10.x Cc: Will Deacon <will@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20210309122601.5543-1-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>
# 3f26ab58	11-Sep-2020	Will Deacon <will@kernel.org>	KVM: arm64: Remove unused page-table code Now that KVM is using the generic page-table code to manage the guest stage-2 page-tables, we can remove a bunch of unused macros, #defines and static inline functions from the old implementation. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-20-will@kernel.org
# 0f9d09b8	11-Sep-2020	Will Deacon <will@kernel.org>	KVM: arm64: Use generic allocator for hyp stage-1 page-tables Now that we have a shiny new page-table allocator, replace the hyp page-table code with calls into the new API. This also allows us to remove the extended idmap code, as we can now simply ensure that the VA size is large enough to map everything we need. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-5-will@kernel.org
# b65399f6	08-Sep-2020	Anshuman Khandual <anshuman.khandual@arm.com>	arm64/mm: Change THP helpers to comply with generic MM semantics pmd_present() and pmd_trans_huge() are expected to behave in the following manner during various phases of a given PMD. It is derived from a previous detailed discussion on this topic [1] and present THP documentation [2]. pmd_present(pmd): - Returns true if pmd refers to system RAM with a valid pmd_page(pmd) - Returns false if pmd refers to a migration or swap entry pmd_trans_huge(pmd): - Returns true if pmd refers to system RAM and is a trans huge mapping ------------------------------------------------------------------------- \| PMD states \| pmd_present \| pmd_trans_huge \| ------------------------------------------------------------------------- \| Mapped \| Yes \| Yes \| ------------------------------------------------------------------------- \| Splitting \| Yes \| Yes \| ------------------------------------------------------------------------- \| Migration/Swap \| No \| No \| ------------------------------------------------------------------------- The problem: PMD is first invalidated with pmdp_invalidate() before it's splitting. This invalidation clears PMD_SECT_VALID as below. PMD Split -> pmdp_invalidate() -> pmd_mkinvalid -> Clears PMD_SECT_VALID Once PMD_SECT_VALID gets cleared, it results in pmd_present() return false on the PMD entry. It will need another bit apart from PMD_SECT_VALID to re- affirm pmd_present() as true during the THP split process. To comply with above mentioned semantics, pmd_trans_huge() should also check pmd_present() first before testing presence of an actual transparent huge mapping. The solution: Ideally PMD_TYPE_SECT should have been used here instead. But it shares the bit position with PMD_SECT_VALID which is used for THP invalidation. Hence it will not be there for pmd_present() check after pmdp_invalidate(). A new software defined PMD_PRESENT_INVALID (bit 59) can be set on the PMD entry during invalidation which can help pmd_present() return true and in recognizing the fact that it still points to memory. This bit is transient. During the split process it will be overridden by a page table page representing normal pages in place of erstwhile huge page. Other pmdp_invalidate() callers always write a fresh PMD value on the entry overriding this transient PMD_PRESENT_INVALID bit, which makes it safe. [1]: https://lkml.org/lkml/2018/10/17/231 [2]: https://www.kernel.org/doc/Documentation/vm/transhuge.txt Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1599627183-14453-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
# 0178dc76	27-Nov-2019	Catalin Marinas <catalin.marinas@arm.com>	arm64: mte: Use Normal Tagged attributes for the linear map Once user space is given access to tagged memory, the kernel must be able to clear/save/restore tags visible to the user. This is done via the linear mapping, therefore map it as such. The new MT_NORMAL_TAGGED index for MAIR_EL1 is initially mapped as Normal memory and later changed to Normal Tagged via the cpufeature infrastructure. From a mismatched attribute aliases perspective, the Tagged memory is considered a permission and it won't lead to undefined behaviour. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
# 68cf6173	08-Jul-2020	Will Deacon <will@kernel.org>	KVM: arm64: Fix definition of PAGE_HYP_DEVICE PAGE_HYP_DEVICE is intended to encode attribute bits for an EL2 stage-1 pte mapping a device. Unfortunately, it includes PROT_DEVICE_nGnRE which encodes attributes for EL1 stage-1 mappings such as UXN and nG, which are RES0 for EL2, and DBM which is meaningless as TCR_EL2.HD is not set. Fix the definition of PAGE_HYP_DEVICE so that it doesn't set RES0 bits at EL2. Acked-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200708162546.26176-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
# e4e9f6df	11-May-2020	Mark Brown <broonie@kernel.org>	arm64: bti: Fix support for userspace only BTI When setting PTE_MAYBE_GP we check system_supports_bti() but this is true for systems where only CONFIG_BTI is set causing us to enable BTI on some kernel text. Add an extra check for the kernel mode option, using an ifdef due to line length. Fixes: c8027285e366 ("arm64: Set GP bit in kernel page tables to enable BTI for the kernel") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20200512113950.29996-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
# c8027285	06-May-2020	Mark Brown <broonie@kernel.org>	arm64: Set GP bit in kernel page tables to enable BTI for the kernel Now that the kernel is built with BTI annotations enable the feature by setting the GP bit in the stage 1 translation tables. This is done based on the features supported by the boot CPU so that we do not need to rewrite the translation tables. In order to avoid potential issues on big.LITTLE systems when there are a mix of BTI and non-BTI capable CPUs in the system when we have enabled kernel mode BTI we change BTI to be a _STRICT_BOOT_CPU_FEATURE when we have kernel BTI. This will prevent any CPUs that don't support BTI being started if the boot CPU supports BTI rather than simply not using BTI as we do when supporting BTI only in userspace. The main concern is the possibility of BTYPE being preserved by a CPU that does not implement BTI when a thread is migrated to it resulting in an incorrect state which could generate an exception when the thread migrates back to a CPU that does support BTI. If we encounter practical systems which mix BTI and non-BTI CPUs we will need to revisit this implementation. Since we currently do not generate landing pads in the BPF JIT we only map the base kernel text in this way. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200506195138.22086-5-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
# c8355785	18-Mar-2020	Will Deacon <will@kernel.org>	arm64: kpti: Fix "kpti=off" when KASLR is enabled Enabling KASLR forces the use of non-global page-table entries for kernel mappings, as this is a decision that we have to make very early on before mapping the kernel proper. When used in conjunction with the "kpti=off" command-line option, it is possible to use non-global kernel mappings but with the kpti trampoline disabled. Since commit 09e3c22a86f6 ("arm64: Use a variable to store non-global mappings decision"), arm64_kernel_unmapped_at_el0() reflects only the use of non-global mappings and does not take into account whether the kpti trampoline is enabled. This breaks context switching of the TPIDRRO_EL0 register for 64-bit tasks, where the clearing of the register is deferred to the ret-to-user code, but it also breaks the ARM SPE PMU driver which helpfully recommends passing "kpti=off" on the command line! Report whether or not KPTI is actually enabled in arm64_kernel_unmapped_at_el0() and check the 'arm64_use_ng_mappings' global variable directly when determining the protection flags for kernel mappings. Cc: Mark Brown <broonie@kernel.org> Reported-by: Hongbo Yao <yaohongbo@huawei.com> Tested-by: Hongbo Yao <yaohongbo@huawei.com> Fixes: 09e3c22a86f6 ("arm64: Use a variable to store non-global mappings decision") Signed-off-by: Will Deacon <will@kernel.org>
# 09e3c22a	09-Dec-2019	Mark Brown <broonie@kernel.org>	arm64: Use a variable to store non-global mappings decision Refactor the code which checks to see if we need to use non-global mappings to use a variable instead of checking with the CPU capabilities each time, doing the initial check for KPTI early in boot before we start allocating memory so we still avoid transitioning to non-global mappings in common cases. Since this variable always matches our decision about non-global mappings this means we can also combine arm64_kernel_use_ng_mappings() and arm64_unmap_kernel_at_el0() into a single function, the variable simply stores the result and the decision code is elsewhere. We could just have the users check the variable directly but having a function makes it clear that these uses are read-only. The result is that we simplify the code a bit and reduces the amount of code executed at runtime. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 24cecc37	06-Jan-2020	Catalin Marinas <catalin.marinas@arm.com>	arm64: Revert support for execute-only user mappings The ARMv8 64-bit architecture supports execute-only user permissions by clearing the PTE_USER and PTE_UXN bits, practically making it a mostly privileged mapping but from which user running at EL0 can still execute. The downside, however, is that the kernel at EL1 inadvertently reading such mapping would not trip over the PAN (privileged access never) protection. Revert the relevant bits from commit cab15ce604e5 ("arm64: Introduce execute-only page access permissions") so that PROT_EXEC implies PROT_READ (and therefore PTE_USER) until the architecture gains proper support for execute-only user mappings. Fixes: cab15ce604e5 ("arm64: Introduce execute-only page access permissions") Cc: <stable@vger.kernel.org> # 4.9.x- Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# aa57157b	29-Oct-2019	Catalin Marinas <catalin.marinas@arm.com>	arm64: Ensure VM_WRITE\|VM_SHARED ptes are clean by default Shared and writable mappings (__S.1.) should be clean (!dirty) initially and made dirty on a subsequent write either through the hardware DBM (dirty bit management) mechanism or through a write page fault. A clean pte for the arm64 kernel is one that has PTE_RDONLY set and PTE_DIRTY clear. The PAGE_SHARED{,_EXEC} attributes have PTE_WRITE set (PTE_DBM) and PTE_DIRTY clear. Prior to commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"), it was the responsibility of set_pte_at() to set the PTE_RDONLY bit and mark the pte clean if the software PTE_DIRTY bit was not set. However, the above commit removed the pte_sw_dirty() check and the subsequent setting of PTE_RDONLY in set_pte_at() while leaving the PAGE_SHARED{,_EXEC} definitions unchanged. The result is that shared+writable mappings are now dirty by default Fix the above by explicitly setting PTE_RDONLY in PAGE_SHARED{,_EXEC}. In addition, remove the superfluous PTE_DIRTY bit from the kernel PROT_* attributes. Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Cc: <stable@vger.kernel.org> # 4.14.x- Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# e8688ba3	27-Aug-2019	James Morse <james.morse@arm.com>	arm64: KVM: Device mappings should be execute-never Since commit 2f6ea23f63cca ("arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set"), KVM has stopped marking normal memory as execute-never at stage2 when the system supports D->I Coherency at the PoU. This avoids KVM taking a trap when the page is first executed, in order to clean it to PoU. The patch that added this change also wrapped PAGE_S2_DEVICE mappings up in this too. The upshot is, if your CPU caches support DIC ... you can execute devices. Revert the PAGE_S2_DEVICE change so PTE_S2_XN is always used directly. Fixes: 2f6ea23f63cca ("arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
# 73b20c84	16-Jul-2019	Robin Murphy <robin.murphy@arm.com>	arm64: mm: implement pte_devmap support In order for things like get_user_pages() to work on ZONE_DEVICE memory, we need a software PTE bit to identify device-backed PFNs. Hook this up along with the relevant helpers to join in with ARCH_HAS_PTE_DEVMAP. [robin.murphy@arm.com: build fixes] Link: http://lkml.kernel.org/r/13026c4e64abc17133bbfa07d7731ec6691c0bcd.1559050949.git.robin.murphy@arm.com Link: http://lkml.kernel.org/r/817d92886fc3b33bcbf6e105ee83a74babb3a5aa.1558547956.git.robin.murphy@arm.com Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# caab277b	02-Jun-2019	Thomas Gleixner <tglx@linutronix.de>	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 201d355c	20-May-2019	Anshuman Khandual <anshuman.khandual@arm.com>	arm64/mm: Move PTE_VALID from SW defined to HW page table entry definitions PTE_VALID signifies that the last level page table entry is valid and it is MMU recognized while walking the page table. This is not a software defined PTE bit and should not be listed like one. Just move it to appropriate header file. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# b89d82ef	08-Jan-2019	Will Deacon <will@kernel.org>	arm64: kpti: Avoid rewriting early page tables when KASLR is enabled A side effect of commit c55191e96caa ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well") is that the linear map is created with page granularity, which means that transitioning the early page table from global to non-global mappings when enabling kpti can take a significant amount of time during boot. Given that most CPU implementations do not require kpti, this mainly impacts KASLR builds where kpti is forcefully enabled. However, in these situations we know early on that non-global mappings are required and can avoid the use of global mappings from the beginning. The only gotcha is Cavium erratum #27456, which we must detect based on the MIDR value of the boot CPU. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reported-by: John Garry <john.garry@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
# 2f6ea23f	25-Apr-2018	Marc Zyngier <maz@kernel.org>	arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set On systems where CTR_EL0.DIC is set, we don't need to perform icache invalidation to guarantee that we'll fetch the right instruction stream. This also means that taking a permission fault to invalidate the icache is an unnecessary overhead. On such systems, we can safely leave the page as being executable. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
# e48d53a9	05-Apr-2018	Marc Zyngier <maz@kernel.org>	arm64: KVM: Add support for Stage-2 control of memory types and cacheability Up to ARMv8.3, the combinaison of Stage-1 and Stage-2 attributes results in the strongest attribute of the two stages. This means that the hypervisor has to perform quite a lot of cache maintenance just in case the guest has some non-cacheable mappings around. ARMv8.4 solves this problem by offering a different mode (FWB) where Stage-2 has total control over the memory attribute (this is limited to systems where both I/O and instruction fetches are coherent with the dcache). This is achieved by having a different set of memory attributes in the page tables, and a new bit set in HCR_EL2. On such a system, we can then safely sidestep any form of dcache management. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
# 41acec62	29-Jan-2018	Will Deacon <will@kernel.org>	arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0() To allow systems which do not require kpti to continue running with global kernel mappings (which appears to be a requirement for Cavium ThunderX due to a CPU erratum), make the use of nG in the kernel page tables dependent on arm64_kernel_unmapped_at_el0(), which is resolved at runtime. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# d0e22b4a	23-Oct-2017	Marc Zyngier <maz@kernel.org>	KVM: arm/arm64: Limit icache invalidation to prefetch aborts We've so far eagerly invalidated the icache, no matter how the page was faulted in (data or prefetch abort). But we can easily track execution by setting the XN bits in the S2 page tables, get the prefetch abort at HYP and perform the icache invalidation at that time only. As for most VMs, the instruction working set is pretty small compared to the data set, this is likely to save some traffic (specially as the invalidation is broadcast). Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
# e046eb0c	09-Aug-2017	Will Deacon <will@kernel.org>	arm64: mm: Use non-global mappings for kernel space In preparation for unmapping the kernel whilst running in userspace, make the kernel mappings non-global so we can avoid expensive TLB invalidation on kernel exit to userspace. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
# 73e86cb0	04-Jul-2017	Catalin Marinas <catalin.marinas@arm.com>	arm64: Move PTE_RDONLY bit handling out of set_pte_at() Currently PTE_RDONLY is treated as a hardware only bit and not handled by the pte_mkwrite(), pte_wrprotect() or the user PAGE_* definitions. The set_pte_at() function is responsible for setting this bit based on the write permission or dirty state. This patch moves the PTE_RDONLY handling out of set_pte_at into the pte_mkwrite()/pte_wrprotect() functions. The PAGE_* definitions to need to be updated to explicitly include PTE_RDONLY when !PTE_WRITE. The patch also removes the redundant PAGE_COPY(_EXEC) definitions as they are identical to the corresponding PAGE_READONLY(_EXEC). Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# cab15ce6	11-Aug-2016	Catalin Marinas <catalin.marinas@arm.com>	arm64: Introduce execute-only page access permissions The ARMv8 architecture allows execute-only user permissions by clearing the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU implementation without User Access Override (ARMv8.2 onwards) can still access such page, so execute-only page permission does not protect against read(2)/write(2) etc. accesses. Systems requiring such protection must enable features like SECCOMP. This patch changes the arm64 __P100 and __S100 protection_map[] macros to the new __PAGE_EXECONLY attributes. A side effect is that pte_user() no longer triggers for __PAGE_EXECONLY since PTE_USER isn't set. To work around this, the check is done on the PTE_NG bit via the pte_ng() macro. VM_READ is also checked now for page faults. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
# 0996353f	13-Jun-2016	Marc Zyngier <maz@kernel.org>	arm/arm64: KVM: Make default HYP mappings non-excutable Structures that can be generally written to don't have any requirement to be executable (quite the opposite). This includes the kvm and vcpu structures, as well as the stacks. Let's change the default to incorporate the XN flag. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
# 59002705	13-Jun-2016	Marc Zyngier <maz@kernel.org>	arm/arm64: KVM: Map the HYP text as read-only There should be no reason for mapping the HYP text read/write. As such, let's have a new set of flags (PAGE_HYP_EXEC) that allows execution, but makes the page as read-only, and update the two call sites that deal with mapping code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
# 74a6b888	13-Jun-2016	Marc Zyngier <maz@kernel.org>	arm/arm64: KVM: Enforce HYP read-only mapping of the kernel's rodata section In order to be able to use C code in HYP, we're now mapping the kernel's rodata in HYP. It works absolutely fine, except that we're mapping it RWX, which is not what it should be. Add a new HYP_PAGE_RO protection, and pass it as the protection flags when mapping the rodata section. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>