History log of /linux-master/arch/arm64/mm/trans_pgd.c
Revision Date Author Comments
# 5a00bfd6 15-Feb-2024 Ryan Roberts <ryan.roberts@arm.com>

arm64/mm: new ptep layer to manage contig bit

Create a new layer for the in-table PTE manipulation APIs. For now, The
existing API is prefixed with double underscore to become the arch-private
API and the public API is just a simple wrapper that calls the private
API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users use the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.

The following APIs are treated this way:

- ptep_get
- set_pte
- set_ptes
- pte_clear
- ptep_get_and_clear
- ptep_test_and_clear_young
- ptep_clear_flush_young
- ptep_set_wrprotect
- ptep_set_access_flags

Link: https://lkml.kernel.org/r/20240215103205.2607016-11-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Barry Song <21cnbao@gmail.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 53273655 15-Feb-2024 Ryan Roberts <ryan.roberts@arm.com>

arm64/mm: convert READ_ONCE(*ptep) to ptep_get(ptep)

There are a number of places in the arch code that read a pte by using the
READ_ONCE() macro. Refactor these call sites to instead use the
ptep_get() helper, which itself is a READ_ONCE(). Generated code should
be the same.

This will benefit us when we shortly introduce the transparent contpte
support. In this case, ptep_get() will become more complex so we now have
all the code abstracted through it.

Link: https://lkml.kernel.org/r/20240215103205.2607016-8-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Barry Song <21cnbao@gmail.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 6ecc21bb 12-Jun-2023 Rick Edgecombe <rick.p.edgecombe@intel.com>

mm: Move pte/pmd_mkwrite() callers with no VMA to _novma()

The x86 Shadow stack feature includes a new type of memory called shadow
stack. This shadow stack memory has some unusual properties, which requires
some core mm changes to function properly.

One of these unusual properties is that shadow stack memory is writable,
but only in limited ways. These limits are applied via a specific PTE
bit combination. Nevertheless, the memory is writable, and core mm code
will need to apply the writable permissions in the typical paths that
call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
that the x86 implementation of it can know whether to create regular
writable or shadow stack mappings.

But there are a couple of challenges to this. Modifying the signatures of
each arch pte_mkwrite() implementation would be error prone because some
are generated with macros and would need to be re-implemented. Also, some
pte_mkwrite() callers operate on kernel memory without a VMA.

So this can be done in a three step process. First pte_mkwrite() can be
renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
added that just calls pte_mkwrite_novma(). Next callers without a VMA can
be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
can be changed to take/pass a VMA.

Earlier work did the first step, so next move the callers that don't have
a VMA to pte_mkwrite_novma(). Also do the same for pmd_mkwrite(). This
will be ok for the shadow stack feature, as these callers are on kernel
memory which will not need to be made shadow stack, and the other
architectures only currently support one type of memory in pte_mkwrite()

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/all/20230613001108.3040476-3-rick.p.edgecombe%40intel.com


# a8bd38db 12-Jul-2023 Nikhil V <quic_nprakash@quicinc.com>

arm64: mm: Make hibernation aware of KFENCE

In the restore path, swsusp_arch_suspend_exit uses copy_page() to
over-write memory. However, with features like KFENCE enabled, there could
be situations where it may have marked some pages as not valid, due to
which it could be reported as invalid accesses.

Consider a situation where page 'P' was part of the hibernation image.
Now, when the resume kernel tries to restore the pages, the same page 'P'
is already in use in the resume kernel and is kfence protected, due to
which its mapping is removed from linear map. Since restoring pages happens
with the resume kernel page tables, we would end up accessing 'P' during
copy and results in kernel pagefault.

The proposed fix tries to solve this issue by marking PTE as valid for such
kfence protected pages.

Co-developed-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>
Link: https://lore.kernel.org/r/20230713070757.4093-1-quic_nprakash@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>


# 01142791 29-Apr-2022 Ard Biesheuvel <ardb@kernel.org>

arm64: mm: avoid writable executable mappings in kexec/hibernate code

The temporary mappings of the low-level kexec and hibernate helpers are
created with both writable and executable attributes, which is not
necessary here, and generally best avoided. So use read-only, executable
attributes instead.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220429131347.3621090-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 6091dd9e 30-Sep-2021 Pasha Tatashin <pasha.tatashin@soleen.com>

arm64: trans_pgd: remove trans_pgd_map_page()

The intend of trans_pgd_map_page() was to map contiguous range of VA
memory to the memory that is getting relocated during kexec. However,
since we are now using linear map instead of contiguous range this
function is not needed

Suggested-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-16-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>


# 788bfdd9 30-Sep-2021 Pasha Tatashin <pasha.tatashin@soleen.com>

arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors

Users of trans_pgd may also need a copy of vector table because it is
also may be overwritten if a linear map can be overwritten.

Move setup of EL2 vectors from hibernate to trans_pgd, so it can be
later shared with kexec as well.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-3-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>


# 7018d467 25-Jan-2021 James Morse <james.morse@arm.com>

arm64: trans_pgd: hibernate: idmap the single page that holds the copy page routines

To resume from hibernate, the contents of memory are restored from
the swap image. This may overwrite any page, including the running
kernel and its page tables.

Hibernate copies the code it uses to do the restore into a single
page that it knows won't be overwritten, and maps it with page tables
built from pages that won't be overwritten.

Today the address it uses for this mapping is arbitrary, but to allow
kexec to reuse this code, it needs to be idmapped. To idmap the page
we must avoid the kernel helpers that have VA_BITS baked in.

Convert create_single_mapping() to take a single PA, and idmap it.
The page tables are built in the reverse order to normal using
pfn_pte() to stir in any bits between 52:48. T0SZ is always increased
to cover 48bits, or 52 if the copy code has bits 52:48 in its PA.

Signed-off-by: James Morse <james.morse@arm.com>

[Adopted the original patch from James to trans_pgd interface, so it can be
commonly used by both Kexec and Hibernate. Some minor clean-ups.]

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/linux-arm-kernel/20200115143322.214247-4-james.morse@arm.com/
Link: https://lore.kernel.org/r/20210125191923.1060122-9-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>


# 5de59884 25-Jan-2021 Pavel Tatashin <pasha.tatashin@soleen.com>

arm64: trans_pgd: pass NULL instead of init_mm to *_populate functions

trans_pgd_* should be independent from mm context because the tables that
are created by this code are used when there are no mm context around, as
it is between kernels. Simply replace mm_init's with NULL.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20210125191923.1060122-7-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>


# 89d1410f 25-Jan-2021 Pavel Tatashin <pasha.tatashin@soleen.com>

arm64: trans_pgd: pass allocator trans_pgd_create_copy

Make trans_pgd_create_copy and its subroutines to use allocator that is
passed as an argument

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20210125191923.1060122-6-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>


# 50f53fb7 25-Jan-2021 Pavel Tatashin <pasha.tatashin@soleen.com>

arm64: trans_pgd: make trans_pgd_map_page generic

kexec is going to use a different allocator, so make
trans_pgd_map_page to accept allocator as an argument, and also
kexec is going to use a different map protection, so also pass
it via argument.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Link: https://lore.kernel.org/r/20210125191923.1060122-5-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>


# 072e3d96 25-Jan-2021 Pavel Tatashin <pasha.tatashin@soleen.com>

arm64: hibernate: move page handling function to new trans_pgd.c

Now, that we abstracted the required functions move them to a new home.
Later, we will generalize these function in order to be useful outside
of hibernation.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20210125191923.1060122-4-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>