History log of /linux-master/arch/arm64/include/asm/cpucaps.h
Revision Date Author Comments
# 7187bb7d 08-May-2024 Mark Rutland <mark.rutland@arm.com>

arm64: errata: Add workaround for Arm errata 3194386 and 3312417

Cortex-X4 and Neoverse-V3 suffer from errata whereby an MSR to the SSBS
special-purpose register does not affect subsequent speculative
instructions, permitting speculative store bypassing for a window of
time. This is described in their Software Developer Errata Notice (SDEN)
documents:

* Cortex-X4 SDEN v8.0, erratum 3194386:
https://developer.arm.com/documentation/SDEN-2432808/0800/

* Neoverse-V3 SDEN v6.0, erratum 3312417:
https://developer.arm.com/documentation/SDEN-2891958/0600/

To work around these errata, it is necessary to place a speculation
barrier (SB) after an MSR to the SSBS special-purpose register. This patch
adds the requisite SB after writes to SSBS within the kernel, and hides
the presence of SSBS from EL0 such that userspace software which cares
about SSBS will manipulate this via prctl(PR_GET_SPECULATION_CTRL, ...).
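
As an illustrative sketch only (not the hunk applied by this patch), the
required pattern is roughly the following; the named ssbs/sb mnemonics
assume an assembler that understands FEAT_SSBS and FEAT_SB, and the
wrapper name is made up for the example:

| /* Sketch: update PSTATE.SSBS, then prevent younger instructions from
|  * executing speculatively until the write has taken effect.
|  */
| static inline void write_ssbs_with_sb(unsigned long val)
| {
|         asm volatile(
|         "       msr     ssbs, %0\n"     /* MSR to the SSBS special-purpose register */
|         "       sb\n"                   /* erratum 3194386/3312417: speculation barrier */
|         : : "r" (val) : "memory");
| }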

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240508081400.235362-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>


# 47759eca 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_REPEAT_TLBI

In arch_tlbbatch_should_defer() we use cpus_have_const_cap() to check
for ARM64_WORKAROUND_REPEAT_TLBI, but this is not necessary and
alternative_has_cap_*() would be preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

The cpus_have_const_cap() check in arch_tlbbatch_should_defer() is an
optimization to avoid some redundant work when the
ARM64_WORKAROUND_REPEAT_TLBI cpucap is detected and forces the immediate
use of TLBI + DSB ISH. In the window between detecting the
ARM64_WORKAROUND_REPEAT_TLBI cpucap and patching alternatives this is
not a big concern and there's no need to optimize this window at the
expense of subsequent usage at runtime.

This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The ARM64_WORKAROUND_REPEAT_TLBI cpucap is added to
cpucap_is_possible() so that code can be elided entirely when this is
not possible without requiring ifdeffery or IS_ENABLED() checks at each
usage.
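
For illustration, the resulting check has roughly this shape (a sketch of
the touched function rather than the exact hunk; the surrounding batching
logic is elided):

| static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
| {
|         /*
|          * If TLBIs must be repeated, deferring and batching buys us
|          * nothing, so issue TLBI + DSB ISH immediately instead. The
|          * patched branch avoids any system_cpucaps bitmap load here.
|          */
|         if (alternative_has_cap_unlikely(ARM64_WORKAROUND_REPEAT_TLBI))
|                 return false;
|
|         return true;
| }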

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 0d48058e 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_NVIDIA_CARMEL_CNP

In has_useable_cnp() we use cpus_have_const_cap() to check for
ARM64_WORKAROUND_NVIDIA_CARMEL_CNP, but this is not necessary and
cpus_have_cap() would be preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

We use has_useable_cnp() to determine whether we have the system-wide
ARM64_HAS_CNP cpucap. Due to the structure of the cpufeature code, we
call has_useable_cnp() in two distinct cases:

1) When finalizing system capabilities, setup_system_capabilities() will
call has_useable_cnp() with SCOPE_SYSTEM to determine whether all
CPUs have the feature. This is called after we've detected any local
cpucaps including ARM64_WORKAROUND_NVIDIA_CARMEL_CNP, but prior to
patching alternatives.

If the ARM64_WORKAROUND_NVIDIA_CARMEL_CNP was detected, we will not
detect ARM64_HAS_CNP.

2) After finalizing system capabilities, verify_local_cpu_capabilities()
will call has_useable_cnp() with SCOPE_LOCAL_CPU to verify that CPUs
have CNP if we previously detected it.

Note that if ARM64_WORKAROUND_NVIDIA_CARMEL_CNP was detected, we will
not have detected ARM64_HAS_CNP.

For case 1 we must check the system_cpucaps bitmap as this occurs prior
to patching the alternatives. For case 2 we'll only call
has_useable_cnp() once per subsequent onlining of a CPU, and as this
isn't a fast path it's not necessary to optimize for this case.

This patch replaces the use of cpus_have_const_cap() with
cpus_have_cap(), which will only generate the bitmap test and avoid
generating an alternative sequence, resulting in slightly simpler and
smaller code being generated. The ARM64_WORKAROUND_NVIDIA_CARMEL_CNP
cpucap is added to cpucap_is_possible() so that code can be elided
entirely when this is not possible.
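
A sketch of the resulting shape of has_useable_cnp(); the entry/scope
plumbing is the pre-existing cpufeature boilerplate and is shown only for
context:

| static bool has_useable_cnp(const struct arm64_cpu_capabilities *entry,
|                             int scope)
| {
|         /*
|          * CNP must not be used on NVIDIA Carmel. A plain bitmap test is
|          * sufficient: case 1 runs before alternatives are patched, and
|          * case 2 is not a fast path.
|          */
|         if (cpus_have_cap(ARM64_WORKAROUND_NVIDIA_CARMEL_CNP))
|                 return false;
|
|         return has_cpuid_feature(entry, scope);
| }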

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a98a5eac 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_CAVIUM_23154

In gic_read_iar() we use cpus_have_const_cap() to check for
ARM64_WORKAROUND_CAVIUM_23154 but this is not necessary and
alternative_has_cap_*() would be preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

The ARM64_WORKAROUND_CAVIUM_23154 cpucap is detected and patched early
on the boot CPU before the GICv3 driver is initialized and hence before
gic_read_iar() is ever called. Thus it is not necessary to use
cpus_have_const_cap(), and alternative_has_cap() is equivalent.

In addition, arm64's gic_read_iar() lives in irq-gic-v3.c purely for
historical reasons. It was originally added prior to 32-bit arm support
in commit:

6d4e11c5e2e8cd54 ("irqchip/gicv3: Workaround for Cavium ThunderX erratum 23154")

When support for 32-bit arm was added, 32-bit arm's gic_read_iar()
implementation was placed in <asm/arch_gicv3.h>, but the arm64 version
was kept within irq-gic-v3.c as it depended on a static key local to
irq-gic-v3.c and it was easier to add ifdeffery, which is what we did in
commit:

7936e914f7b0827c ("irqchip/gic-v3: Refactor the arm64 specific parts")

Subsequently the static key was replaced with a cpucap in commit:

a4023f682739439b ("arm64: Add hypervisor safe helper for checking constant capabilities")

Since that commit there has been no need to keep arm64's gic_read_iar()
in irq-gic-v3.c.

This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. For consistency, move the arm64-specific gic_read_iar()
implementation over to arm64's <asm/arch_gicv3.h>. The
ARM64_WORKAROUND_CAVIUM_23154 cpucap is added to cpucap_is_possible() so
that code can be elided entirely when this is not possible.
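
The arm64 gic_read_iar() then ends up with roughly this shape in
<asm/arch_gicv3.h> (a sketch; the common and Cavium readers are the
pre-existing helpers):

| static __always_inline u64 gic_read_iar(void)
| {
|         if (alternative_has_cap_unlikely(ARM64_WORKAROUND_CAVIUM_23154))
|                 return gic_read_iar_cavium_thunderx();
|         else
|                 return gic_read_iar_common();
| }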

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 412cb380 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_2645198

We use cpus_have_const_cap() to check for ARM64_WORKAROUND_2645198 but
this is not necessary and alternative_has_cap() would be preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

The ARM64_WORKAROUND_2645198 cpucap is detected and patched before any
userspace translation tables exist, and the workaround is only necessary
when manipulating userspace translation tables which are in use. Thus it
is not necessary to use cpus_have_const_cap(), and alternative_has_cap()
is equivalent.

This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The ARM64_WORKAROUND_2645198 cpucap is added to
cpucap_is_possible() so that code can be elided entirely when this is
not possible, and redundant IS_ENABLED() checks are removed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 48b57d91 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_1742098

In elf_hwcap_fixup() we use cpus_have_const_cap() to check for
ARM64_WORKAROUND_1742098, but this is not necessary and cpus_have_cap()
would be preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

The ARM64_WORKAROUND_1742098 cpucap is detected and patched before
elf_hwcap_fixup() can run, and hence it is not necessary to use
cpus_have_const_cap(). We run elf_hwcap_fixup() at most twice: once
after finalizing system cpucaps, and potentially once more after
detecting mismatched CPUs which support AArch32 at EL0. Due to this,
it's not necessary to optimize for many calls to elf_hwcap_fixup(), and
it's fine to use cpus_have_cap().

This patch replaces the use of cpus_have_const_cap() with
cpus_have_cap(), which will only generate the bitmap test and avoid
generating an alternative sequence, resulting in slightly simpler and
smaller code being generated. For consistency with other cpucaps, the
ARM64_WORKAROUND_1742098 cpucap is added to cpucap_is_possible() so that
code can be elided when this is not possible. However, as we only define
compat_elf_hwcap2 when CONFIG_COMPAT=y, some ifdeffery is still required
within user_feature_fixup() to avoid build errors when CONFIG_COMPAT=n.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 0a285dfe 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_843419

In count_plts() and is_forbidden_offset_for_adrp() we use
cpus_have_const_cap() to check for ARM64_WORKAROUND_843419, but this is
not necessary and cpus_have_final_cap() would be preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

It's not possible to load a module in the window between detecting the
ARM64_WORKAROUND_843419 cpucap and patching alternatives. The module VA
range limits are initialized much later in module_init_limits() which is
a subsys_initcall, and module loading cannot happen before this. Hence
it's not necessary for count_plts() or is_forbidden_offset_for_adrp() to
use cpus_have_const_cap().

This patch replaces the use of cpus_have_const_cap() with
cpus_have_final_cap() which will avoid generating code to test the
system_cpucaps bitmap and should be better for all subsequent calls at
runtime. Using cpus_have_final_cap() clearly documents that we do not
expect this code to run before cpucaps are finalized, and will make it
easier to spot issues if code is changed in future to allow modules to
be loaded earlier. The ARM64_WORKAROUND_843419 cpucap is added to
cpucap_is_possible() so that code can be elided entirely when this is not
possible, and redundant IS_ENABLED() checks are removed.
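
For example, the ADRP offset check ends up roughly as follows (a sketch;
the page-offset test is the pre-existing erratum condition):

| static bool is_forbidden_offset_for_adrp(void *place)
| {
|         /*
|          * Modules cannot be loaded before alternatives are patched, so
|          * cpus_have_final_cap() both avoids the bitmap test and documents
|          * that cpucaps are expected to be finalized by this point.
|          */
|         return cpus_have_final_cap(ARM64_WORKAROUND_843419) &&
|                ((u64)place & 0xfff) >= 0xff8;
| }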

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# c2ef5f1e 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_UNMAP_KERNEL_AT_EL0

In arm64_kernel_unmapped_at_el0() we use cpus_have_const_cap() to check
for ARM64_UNMAP_KERNEL_AT_EL0, but this is only necessary so that
arm64_get_bp_hardening_vector() and this_cpu_set_vectors() can run prior
to alternatives being patched. Otherwise this is not necessary and
alternative_has_cap_*() would be preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

The ARM64_UNMAP_KERNEL_AT_EL0 cpucap is a system-wide feature that is
detected and patched before any translation tables are created for
userspace. In the window between detecting the ARM64_UNMAP_KERNEL_AT_EL0
cpucap and patching alternatives, most users of
arm64_kernel_unmapped_at_el0() do not need to know that the cpucap has
been detected:

* As KVM is initialized after cpucaps are finalized, no use of
arm64_kernel_unmapped_at_el0() in the KVM code is reachable during
this window.

* The arm64_mm_context_get() function in arch/arm64/mm/context.c is only
called after the SMMU driver is brought up after alternatives have
been patched. Thus this can safely use cpus_have_final_cap() or
alternative_has_cap_*().

Similarly the asids_update_limit() function is called after
alternatives have been patched as an arch_initcall, and this can
safely use cpus_have_final_cap() or alternative_has_cap_*().

Similarly we do not expect an ASID rollover to occur between cpucaps
being detected and patching alternatives. Thus
set_reserved_asid_bits() can safely use cpus_have_final_cap() or
alternative_has_cap_*().

* The __tlbi_user() and __tlbi_user_level() macros are not used during
this window, and only need to invalidate additional entries once
userspace translation tables have been active on a CPU. Thus these can
safely use alternative_has_cap_*().

* The xen_kernel_unmapped_at_usr() function is not used during this
window as it is only used in a late_initcall. Thus this can safely use
cpus_have_final_cap() or alternative_has_cap_*().

* The arm64_get_meltdown_state() function is not used during this
window. It is only used by cpu_show_meltdown() and KVM code, both
of which are only used after cpucaps have been finalized. Thus this
can safely use cpus_have_final_cap() or alternative_has_cap_*().

* The tls_thread_switch() function uses arm64_kernel_unmapped_at_el0() as an
optimization to avoid zeroing tpidrro_el0 when KPTI is enabled
and this will be trampled by the KPTI trampoline. It doesn't matter if
this continues to zero the register during the window between
detecting the cpucap and patching alternatives, so this can safely use
alternative_has_cap_*().

* The sdei_arch_get_entry_point() and do_sdei_event() functions aren't
reachable at this time as the SDEI driver is registered later by
acpi_init() -> acpi_ghes_init() -> sdei_init(), where acpi_init is a
subsys_initcall. Thus these can safely use cpus_have_final_cap() or
alternative_has_cap_*().

* The uses under drivers/ aren't reachable at this time as the drivers
are registered later:

- TRBE is registered via module_init()
- SMMUv3 is registered via module_driver()
- SPE is registered via module_init()

* The arm64_get_bp_hardening_vector() and this_cpu_set_vectors()
functions need to run on boot CPUs prior to patching alternatives.
As these are only called during the onlining of a CPU, it's fine to
perform a system_cpucaps bitmap test using cpus_have_cap().

This patch modifies this_cpu_set_vectors() to use cpus_have_cap(), and
replaces all other uses of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The ARM64_UNMAP_KERNEL_AT_EL0 cpucap is added to
cpucap_is_possible() so that code can be elided entirely when this is
not possible.
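
The core helper therefore reduces to a single patched branch, roughly:

| static __always_inline bool arm64_kernel_unmapped_at_el0(void)
| {
|         return alternative_has_cap_unlikely(ARM64_UNMAP_KERNEL_AT_EL0);
| }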

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 4e00f1d9 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Avoid cpus_have_const_cap() for ARM64_HAS_EPAN

We use cpus_have_const_cap() to check for ARM64_HAS_EPAN but this is not
necessary and alternative_has_cap() or cpus_have_cap() would be
preferable.

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

The ARM64_HAS_EPAN cpucap is used to affect two things:

1) The permission bits used for userspace executable mappings, which are
chosen by adjust_protection_map(), an arch_initcall. This is
called after the ARM64_HAS_EPAN cpucap has been detected and
alternatives have been patched, and before any userspace translation
tables exist.

2) The handling of faults taken from (user or kernel) accesses to
userspace executable mappings in do_page_fault(). Userspace
translation tables are created after adjust_protection_map() is
called, and hence after the ARM64_HAS_EPAN cpucap has been detected
and alternatives have been patched.

Neither of these runs until after the ARM64_HAS_EPAN cpucap has been
detected and alternatives have been patched, and hence there's no need
to use cpus_have_const_cap(). Since adjust_protection_map() is only
executed once at boot time it would be best for it to use
cpus_have_cap(), and since do_page_fault() is executed frequently it
would be best for it to use alternative_has_cap_unlikely().

This patch replaces the uses of cpus_have_const_cap() with
cpus_have_cap() and alternative_has_cap_unlikely(), which will avoid
generating redundant code, and should be better for all subsequent calls
at runtime. The ARM64_HAS_EPAN cpucap is added to cpucap_is_possible()
so that code can be elided entirely when this is not possible.
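
In sketch form, the two kinds of caller use the two different helpers
(the function names below are illustrative, not the exact call sites):

| /* One-off boot-time use (arch_initcall): a bitmap test is sufficient. */
| static bool __init epan_enabled_at_boot(void)
| {
|         return cpus_have_cap(ARM64_HAS_EPAN);
| }
|
| /* Fault-handling fast path: use the patched alternative branch. */
| static inline bool epan_enabled_fastpath(void)
| {
|         return alternative_has_cap_unlikely(ARM64_HAS_EPAN);
| }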

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 7f632d331 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Fixup user features at boot time

For ARM64_WORKAROUND_2658417, we use a cpu_enable() callback to hide the
ID_AA64ISAR1_EL1.BF16 ID register field. This is a little awkward as
CPUs may attempt to apply the workaround concurrently, requiring that we
protect the bulk of the callback with a raw_spinlock, and requiring some
pointless work every time a CPU is subsequently hotplugged in.

This patch makes this a little simpler by handling the masking once at
boot time. A new user_feature_fixup() function is called at the start of
setup_user_features() to mask the feature, matching the style of
elf_hwcap_fixup(). The ARM64_WORKAROUND_2658417 cpucap is added to
cpucap_is_possible() so that code can be elided entirely when this is
not possible.
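
The boot-time fixup is roughly the following (a sketch; the ftr-reg
helper and the BF16 mask name follow the existing cpufeature/sysreg
definitions):

| static void __init user_feature_fixup(void)
| {
|         if (cpus_have_cap(ARM64_WORKAROUND_2658417)) {
|                 struct arm64_ftr_reg *regp;
|
|                 /* Hide ID_AA64ISAR1_EL1.BF16 from EL0. */
|                 regp = get_arm64_ftr_reg(SYS_ID_AA64ISAR1_EL1);
|                 if (regp)
|                         regp->user_mask &= ~ID_AA64ISAR1_EL1_BF16_MASK;
|         }
| }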

Note that the ARM64_WORKAROUND_2658417 capability is matched with
ERRATA_MIDR_RANGE(), which implicitly gives the capability a
ARM64_CPUCAP_LOCAL_CPU_ERRATUM type, which forbids the late onlining of
a CPU with the erratum if the erratum was not present at boot time.
Therefore this patch doesn't change the behaviour for late onlining.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# de66cb37 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Add cpucap_is_possible()

Many cpucaps can only be set when certain CONFIG_* options are selected,
and we need to check the CONFIG_* option before the cap in order to
avoid generating redundant code. Due to this, we have a growing number
of helpers in <asm/cpufeature.h> of the form:

| static __always_inline bool system_supports_foo(void)
| {
|         return IS_ENABLED(CONFIG_ARM64_FOO) &&
|                cpus_have_const_cap(ARM64_HAS_FOO);
| }

This is unfortunate as it forces us to use cpus_have_const_cap()
unnecessarily, resulting in redundant code being generated by the
compiler. In the vast majority of cases, we only require that feature
checks indicate the presence of a feature after cpucaps have been
finalized, and so it would be sufficient to use alternative_has_cap_*().
However some code needs to handle a feature before alternatives have
been patched, and must test the system_cpucaps bitmap via
cpus_have_const_cap(). In other cases we'd like to check for
unintentional usage of a cpucap before alternatives are patched, and so
it would be preferable to use cpus_have_final_cap().

Placing the IS_ENABLED() checks in each callsite is tedious and
error-prone, and the same applies for writing wrappers for each
combination of cpucap and alternative_has_cap_*() / cpus_have_cap() /
cpus_have_final_cap(). It would be nicer if we could centralize the
knowledge of which cpucaps are possible, and have
alternative_has_cap_*(), cpus_have_cap(), and cpus_have_final_cap()
handle this automatically.

This patch adds a new cpucap_is_possible() function which will be
responsible for checking the CONFIG_* option, and updates the low-level
cpucap checks to use this. The existing CONFIG_* checks in
<asm/cpufeature.h> are moved over to cpucap_is_possible(), but the (now
trivial) wrapper functions are retained for now.

There should be no functional change as a result of this patch alone.
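
Condensed, cpucap_is_possible() has the following shape (only a couple of
representative cases are shown):

| static __always_inline bool cpucap_is_possible(const unsigned int cap)
| {
|         compiletime_assert(__builtin_constant_p(cap),
|                            "cap must be a constant");
|
|         switch (cap) {
|         case ARM64_HAS_PAN:
|                 return IS_ENABLED(CONFIG_ARM64_PAN);
|         case ARM64_SVE:
|                 return IS_ENABLED(CONFIG_ARM64_SVE);
|         /* ... one case per CONFIG_-gated cpucap ... */
|         }
|
|         return true;
| }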

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 484de085 16-Oct-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Factor out cpucap definitions

For clarity it would be nice to factor cpucap manipulation out of
<asm/cpufeature.h>, and the obvious place would be <asm/cpucap.h>, but
this will clash somewhat with <generated/asm/cpucaps.h>.

Rename <generated/asm/cpucaps.h> to <generated/asm/cpucap-defs.h>,
matching what we do for <generated/asm/sysreg-defs.h>, and introduce a
new <asm/cpucaps.h> which includes the generated header.

Subsequent patches will fill out <asm/cpucaps.h>.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 18107f8a 12-Mar-2021 Vladimir Murzin <vladimir.murzin@arm.com>

arm64: Support execute-only permissions with Enhanced PAN

Enhanced Privileged Access Never (EPAN) allows Privileged Access Never
to be used with Execute-only mappings.

The absence of such support was a reason for 24cecc377463 ("arm64: Revert
support for execute-only user mappings"). With EPAN, execute-only user
mappings can now be revisited and re-enabled.

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210312173811.58284-2-vladimir.murzin@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 20109a85 23-Mar-2021 Rich Wiley <rwiley@nvidia.com>

arm64: kernel: disable CNP on Carmel

On NVIDIA Carmel cores, CNP behaves differently than it does on standard
ARM cores. On Carmel, if two cores have CNP enabled and share an L2 TLB
entry created by core0 for a specific ASID, a non-shareable TLBI from
core1 may still see the shared entry. On standard ARM cores, that TLBI
will invalidate the shared entry as well.

This causes issues with patchsets that attempt to do local TLBIs based
on cpumasks instead of broadcast TLBIs. Avoid these issues by disabling
CNP support for NVIDIA Carmel cores.

Signed-off-by: Rich Wiley <rwiley@nvidia.com>
Link: https://lore.kernel.org/r/20210324002809.30271-1-rwiley@nvidia.com
[will: Fix pre-existing whitespace issue]
Signed-off-by: Will Deacon <will@kernel.org>


# 3eb681fb 02-Dec-2020 David Brazdil <dbrazdil@google.com>

KVM: arm64: Add ARM64_KVM_PROTECTED_MODE CPU capability

Expose a boolean value indicating whether the system is running with KVM
in protected mode (nVHE + kernel param). A CPU capability was selected
over a global variable to allow its use in alternatives.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-3-dbrazdil@google.com


# 1517c4fa 02-Dec-2020 Mark Rutland <mark.rutland@arm.com>

arm64: uaccess: remove vestigial UAO support

Now that arm64 no longer uses UAO, remove the vestigial feature detection
code and Kconfig text.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-13-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 7cf283c7 02-Dec-2020 Mark Rutland <mark.rutland@arm.com>

arm64: uaccess: remove redundant PAN toggling

Some code (e.g. futex) needs to make privileged accesses to userspace
memory, and uses uaccess_{enable,disable}_privileged() in order to
permit this. All other uaccess primitives use LDTR/STTR, and never need
to toggle PAN.

Remove the redundant PAN toggling.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-12-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# c4792b6d 13-Nov-2020 Will Deacon <will@kernel.org>

arm64: spectre: Rename ARM64_HARDEN_EL2_VECTORS to ARM64_SPECTRE_V3A

Since ARM64_HARDEN_EL2_VECTORS is really a mitigation for Spectre-v3a,
rename it accordingly for consistency with the v2 and v4 mitigation.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20201113113847.21619-9-will@kernel.org


# 364a5a8a 30-Jun-2020 Will Deacon <will@kernel.org>

arm64: cpufeatures: Add capability for LDAPR instruction

Armv8.3 introduced the LDAPR instruction, which provides weaker memory
ordering semantics than LDAR (RCpc vs RCsc). Generally, we provide an
RCsc implementation when implementing the Linux memory model, but LDAPR
can be used as a useful alternative to dependency ordering, particularly
when the compiler is capable of breaking the dependencies.

Since LDAPR is not available on all CPUs, add a cpufeature to detect it at
runtime and allow the instruction to be used with alternative code
patching.
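
A minimal sketch of how such patching is typically expressed with the
ALTERNATIVE() macro; the wrapper below is illustrative only, and the
.arch_extension directive is there so older assemblers accept LDAPR:

| static inline u32 load_acquire_u32(const u32 *ptr)
| {
|         u32 val;
|
|         asm volatile(ALTERNATIVE("ldar  %w0, %1",
|                                  ".arch_extension rcpc\n"
|                                  "ldapr %w0, %1",
|                                  ARM64_HAS_LDAPR)
|                      : "=r" (val) : "Q" (*ptr) : "memory");
|         return val;
| }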

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>


# 96d389ca 28-Oct-2020 Rob Herring <robh@kernel.org>

arm64: Add workaround for Arm Cortex-A77 erratum 1508412

On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
and a store exclusive or PAR_EL1 read can cause a deadlock.

The workaround requires a DMB SY before and after a PAR_EL1 register
read. In addition, it's possible an interrupt (doing a device read) or
KVM guest exit could be taken between the DMB and PAR read, so we
also need a DMB before returning from interrupt and before returning to
a guest.

A deadlock is still possible with the workaround as KVM guests must also
have the workaround. IOW, a malicious guest can deadlock an affected
system.

This workaround also depends on a firmware counterpart to enable the h/w
to insert DMB SY after load and store exclusive instructions. See the
errata document SDEN-1152370 v10 [1] for more information.

[1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>


# 9b0955ba 15-Sep-2020 Will Deacon <will@kernel.org>

arm64: Rename ARM64_SSBD to ARM64_SPECTRE_V4

In a similar manner to the renaming of ARM64_HARDEN_BRANCH_PREDICTOR
to ARM64_SPECTRE_V2, rename ARM64_SSBD to ARM64_SPECTRE_V4. This isn't
_entirely_ accurate, as we also need to take into account the interaction
with SSBS, but that will be taken care of in subsequent patches.

Signed-off-by: Will Deacon <will@kernel.org>


# 688f1e4b 15-Sep-2020 Will Deacon <will@kernel.org>

arm64: Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2

For better or worse, the world knows about "Spectre" and not about
"Branch predictor hardening". Rename ARM64_HARDEN_BRANCH_PREDICTOR to
ARM64_SPECTRE_V2 as part of moving all of the Spectre mitigations into
their own little corner.

Signed-off-by: Will Deacon <will@kernel.org>


# 3b714d24 06-Sep-2019 Vincenzo Frascino <vincenzo.frascino@arm.com>

arm64: mte: CPU feature detection and initial sysreg configuration

Add the cpufeature and hwcap entries to detect the presence of MTE. Any
secondary CPU not supporting the feature, if detected on the boot CPU,
will be parked.

Add the minimum SCTLR_EL1 and HCR_EL2 bits for enabling MTE. The Normal
Tagged memory type is configured in MAIR_EL1 before the MMU is enabled
in order to avoid disrupting other CPUs in the CnP domain.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>


# b620ba54 15-Jul-2020 Zhenyu Ye <yezhenyu2@huawei.com>

arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature

ARMv8.4-TLBI provides TLBI invalidation instructions that apply to a
range of input addresses. This patch detects this feature.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200715071945.897-2-yezhenyu2@huawei.com
[catalin.marinas@arm.com: some renaming for consistency]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 552ae76f 21-Dec-2018 Marc Zyngier <maz@kernel.org>

arm64: Detect the ARMv8.4 TTL feature

In order to reduce the cost of TLB invalidation, the ARMv8.4 TTL
feature allows TLB invalidation instructions to be issued with a level
hint, allowing for quicker invalidation.

Let's detect the feature for now. Further patches will implement
its actual usage.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>


# 02ab1f50 04-May-2020 Andrew Scull <ascull@google.com>

arm64: Unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE}

Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
allocated as a result of a speculative AT instruction. In order to
avoid mandating VHE on certain affected CPUs, apply the workaround to
both the nVHE and the VHE case for all affected CPUs.

Signed-off-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
CC: Marc Zyngier <maz@kernel.org>
CC: James Morse <james.morse@arm.com>
CC: Suzuki K Poulose <suzuki.poulose@arm.com>
CC: Will Deacon <will@kernel.org>
CC: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20200504094858.108917-1-ascull@google.com
Signed-off-by: Will Deacon <will@kernel.org>


# 540f76d1 21-Apr-2020 Will Deacon <will@kernel.org>

arm64: cpufeature: Add CPU capability for AArch32 EL1 support

Although we emit a "SANITY CHECK" warning and taint the kernel if we
detect a CPU mismatch for AArch32 support at EL1, we still online the
CPU with disastrous consequences for any running 32-bit VMs.

Introduce a capability for AArch32 support at EL1 so that late onlining
of incompatible CPUs is forbidden.

Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200421142922.18950-4-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>


# cfef06bd 13-Mar-2020 Kristina Martsenko <kristina.martsenko@arm.com>

arm64: cpufeature: add pointer auth meta-capabilities

To enable pointer auth for the kernel, we're going to need to check for
the presence of address auth and generic auth using alternative_if. We
currently have two cpucaps for each, but alternative_if needs to check a
single cpucap. So define meta-capabilities that are present when either
of the current two capabilities is present.

Leave the existing four cpucaps in place, as they are still needed to
check for mismatched systems where one CPU has the architected algorithm
but another has the IMP DEF algorithm.

Note, the meta-capabilities were present before but were removed in
commit a56005d32105 ("arm64: cpufeature: Reduce number of pointer auth
CPU caps from 6 to 4") and commit 1e013d06120c ("arm64: cpufeature: Rework
ptr auth hwcaps using multi_entry_cap_matches"), as they were not needed
then. Note, unlike before, the current patch checks the cpucap values
directly, instead of reading the CPU ID register value.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[Amit: commit message and macro rebase, use __system_matches_cap]
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 8ef8f360 16-Mar-2020 Dave Martin <Dave.Martin@arm.com>

arm64: Basic Branch Target Identification support

This patch adds the bare minimum required to expose the ARMv8.5
Branch Target Identification feature to userspace.

By itself, this does _not_ automatically enable BTI for any initial
executable pages mapped by execve(). This will come later, but for
now it should be possible to enable BTI manually on those pages by
using mprotect() from within the target process.

Other arches already using the generic mman.h are already using
0x10 for arch-specific prot flags, so we use that for PROT_BTI
here.

For consistency, signal handler entry points in BTI guarded pages
are required to be annotated as such, just like any other function.
This blocks a relatively minor attack vector, but conforming
userspace will have the annotations anyway, so we may as well
enforce them.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 2c9d45b4 05-Mar-2020 Ionela Voinescu <ionela.voinescu@arm.com>

arm64: add support for the AMU extension v1

The activity monitors extension is an optional extension introduced
by the ARMv8.4 CPU architecture. This implements basic support for
version 1 of the activity monitors architecture, AMUv1.

This support includes:
- Extension detection on each CPU (boot, secondary, hotplugged)
- Register interface for AMU aarch64 registers

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 1a50ec0b 20-Jan-2020 Richard Henderson <richard.henderson@linaro.org>

arm64: Implement archrandom.h for ARMv8.5-RNG

Expose the ID_AA64ISAR0.RNDR field to userspace, as the RNG system
registers are always available at EL0.

Implement arch_get_random_seed_long using RNDR. Given that the
TRNG is likely to be a shared resource between cores and VMs,
do not explicitly force re-seeding with RNDRRS. In order to avoid
code complexity and potential issues with heterogeneous systems, only
provide values after cpufeature has finalized the system capabilities.
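
For reference, reading RNDR from C looks roughly like this (a sketch; the
s3_3_c2_c4_0 encoding is RNDR, which signals failure via the PSTATE flags):

| static inline bool read_rndr(u64 *val)
| {
|         bool ok;
|
|         /* RNDR (op0=3, op1=3, CRn=2, CRm=4, op2=0); Z is set on failure. */
|         asm volatile("mrs       %0, s3_3_c2_c4_0\n"
|                      "cset      %w1, ne\n"
|                      : "=r" (*val), "=r" (ok)
|                      :
|                      : "cc");
|         return ok;
| }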

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[Modified to only function after cpufeature has finalized the system
capabilities and move all the code into the header -- broonie]
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
[will: Advertise HWCAP via /proc/cpuinfo]
Signed-off-by: Will Deacon <will@kernel.org>


# db0d46a5 16-Dec-2019 Steven Price <steven.price@arm.com>

arm64: Rename WORKAROUND_1319367 to SPECULATIVE_AT_NVHE

To match SPECULATIVE_AT_VHE let's also have a generic name for the NVHE
variant.

Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>


# e85d68fa 16-Dec-2019 Steven Price <steven.price@arm.com>

arm64: Rename WORKAROUND_1165522 to SPECULATIVE_AT_VHE

Cortex-A55 is affected by a similar erratum, so rename the existing
workaround for erratum 1165522 so it can be used for both errata.

Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>


# 3e6c69a0 09-Dec-2019 Mark Brown <broonie@kernel.org>

arm64: Add initial support for E0PD

Kernel Page Table Isolation (KPTI) is used to mitigate some speculation
based security issues by ensuring that the kernel is not mapped when
userspace is running but this approach is expensive and is incompatible
with SPE. E0PD, introduced in the ARMv8.5 extensions, provides an
alternative to this which ensures that accesses from userspace to the
kernel's half of the memory map always fault in constant time,
preventing timing attacks without requiring constant unmapping and
remapping or preventing legitimate accesses.

Currently this feature will only be enabled if all CPUs in the system
support E0PD. If some CPUs do not support the feature at boot time then
the feature will not be enabled, and in the unlikely event that a late
CPU is the first CPU to lack the feature then we will reject that CPU.

This initial patch does not yet integrate with KPTI, this will be dealt
with in followup patches. Ideally we could ensure that by default we
don't use KPTI on CPUs where E0PD is present.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[will: Fixed typo in Kconfig text]
Signed-off-by: Will Deacon <will@kernel.org>


# 05460849 17-Oct-2019 James Morse <james.morse@arm.com>

arm64: errata: Hide CTR_EL0.DIC on systems affected by Neoverse-N1 #1542419

Cores affected by Neoverse-N1 #1542419 could execute a stale instruction
when a branch is updated to point to freshly generated instructions.

To work around this issue we need user-space to issue unnecessary
icache maintenance that we can trap. Start by hiding CTR_EL0.DIC.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# f75e2294 23-Nov-2018 Marc Zyngier <maz@kernel.org>

arm64: Add ARM64_WORKAROUND_1319367 for all A57 and A72 versions

Rework the EL2 vector hardening that is only selected for A57 and A72
so that the table can also be used for ARM64_WORKAROUND_1319367.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>


# 9405447e 09-Apr-2019 Marc Zyngier <maz@kernel.org>

arm64: Avoid Cavium TX2 erratum 219 when switching TTBR

As a PRFM instruction racing against a TTBR update can have undesirable
effects on TX2, NOP out such PRFM instructions on cores that are affected by
the TX2-219 erratum.

Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>


# d3ec3a08 07-Feb-2019 Marc Zyngier <maz@kernel.org>

arm64: KVM: Trap VM ops when ARM64_WORKAROUND_CAVIUM_TX2_219_TVM is set

In order to work around the TX2-219 erratum, it is necessary to trap
TTBRx_EL1 accesses to EL2. This is done by setting HCR_EL2.TVM on
guest entry, which has the side effect of trapping all the other
VM-related sysregs as well.

To minimize the overhead, a fast path is used so that we don't
have to go all the way back to the main sysreg handling code,
unless the rest of the hypervisor expects to see these accesses.

Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>


# caab277b 02-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# a5325089 23-May-2019 Marc Zyngier <maz@kernel.org>

arm64: Handle erratum 1418040 as a superset of erratum 1188873

We already mitigate erratum 1188873 affecting Cortex-A76 and
Neoverse-N1 r0p0 to r2p0. It turns out that revisions r0p0 to
r3p1 of the same cores are affected by erratum 1418040, which
has the same workaround as 1188873.

Let's expand the range of affected revisions to match 1418040,
and repaint all occurrences of 1188873 to 1418040. Whilst we're
there, do a bit of reformatting in silicon-errata.txt and drop
a now unnecessary dependency on ARM_ARCH_TIMER_OOL_WORKAROUND.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 969f5ea6 29-Apr-2019 Will Deacon <will@kernel.org>

arm64: errata: Add workaround for Cortex-A76 erratum #1463225

Revisions of the Cortex-A76 CPU prior to r4p0 are affected by an erratum
that can prevent interrupts from being taken when single-stepping.

This patch implements a software workaround to prevent userspace from
effectively being able to disable interrupts.

Cc: <stable@vger.kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# b9585f53 09-Apr-2019 Andrew Murray <amurray@thegoodpenguin.co.uk>

arm64: Advertise ARM64_HAS_DCPODP cpu feature

Advertise ARM64_HAS_DCPODP when both DC CVAP and DC CVADP are supported.

Even though we don't use this feature now, we provide it for consistency
with DCPOP and anticipate it being used in the future.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# b90d2b22 31-Jan-2019 Julien Thierry <julien.thierry.kdev@gmail.com>

arm64: cpufeature: Add cpufeature for IRQ priority masking

Add a cpufeature indicating whether a cpu supports masking interrupts
by priority.

The feature will be properly enabled in a later patch.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a56005d3 12-Dec-2018 Will Deacon <will@kernel.org>

arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4

We can easily avoid defining the two meta-capabilities for the address
and generic keys, so remove them and instead just check both of the
architected and impdef capabilities when determining the level of system
support.

Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 6984eb47 07-Dec-2018 Mark Rutland <mark.rutland@arm.com>

arm64/cpufeature: detect pointer authentication

So that we can dynamically handle the presence of pointer authentication
functionality, wire up probing code in cpufeature.c.

From ARMv8.3 onwards, ID_AA64ISAR1 is no longer entirely RES0, and now
has four fields describing the presence of pointer authentication
functionality:

* APA - address authentication present, using an architected algorithm
* API - address authentication present, using an IMP DEF algorithm
* GPA - generic authentication present, using an architected algorithm
* GPI - generic authentication present, using an IMP DEF algorithm

This patch checks for both address and generic authentication,
separately. It is assumed that if all CPUs support an IMP DEF algorithm,
the same algorithm is used across all CPUs.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 8b2cca9a 06-Dec-2018 Marc Zyngier <maz@kernel.org>

arm64: KVM: Force VHE for systems affected by erratum 1165522

In order to easily mitigate ARM erratum 1165522, we need to force
affected CPUs to run in VHE mode if using KVM.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# bd4fb6d2 14-Jun-2018 Will Deacon <will@kernel.org>

arm64: Add support for SB barrier and patch in over DSB; ISB sequences

We currently use a DSB; ISB sequence to inhibit speculation in set_fs().
Whilst this works for current CPUs, future CPUs may implement a new SB
barrier instruction which acts as an architected speculation barrier.

On CPUs that support it, patch in an SB; NOP sequence over the DSB; ISB
sequence and advertise the presence of the new instruction to userspace.
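
A sketch of the patched sequence; the wrapper name is illustrative, and
in practice the SB opcode may need to be emitted as a raw .inst encoding
for assemblers that don't know the mnemonic:

| /* Both sequences are two instructions, so the sizes match for patching. */
| #define speculation_barrier()                                           \
|         asm volatile(ALTERNATIVE("dsb   nsh\n"                          \
|                                  "isb",                                 \
|                                  "sb\n"                                 \
|                                  "nop",                                 \
|                                  ARM64_HAS_SB) : : : "memory")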

Signed-off-by: Will Deacon <will.deacon@arm.com>


# 95b861a4 27-Sep-2018 Marc Zyngier <maz@kernel.org>

arm64: arch_timer: Add workaround for ARM erratum 1188873

When running on Cortex-A76, a timer access from an AArch32 EL0
task may end up with a corrupted value or register. The workaround for
this is to trap these accesses at EL1/EL2 and execute them there.

This only affects versions r0p0, r1p0 and r2p0 of the CPU.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 880f7cc4 19-Sep-2018 Will Deacon <will@kernel.org>

arm64: cpu_errata: Remove ARM64_MISMATCHED_CACHE_LINE_SIZE

There's no need to treat mismatched cache-line sizes reported by CTR_EL0
differently to any other mismatched fields that we treat as "STRICT" in
the cpufeature code. In both cases we need to trap and emulate EL0
accesses to the register, so drop ARM64_MISMATCHED_CACHE_LINE_SIZE and
rely on ARM64_MISMATCHED_CACHE_TYPE instead.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[catalin.marinas@arm.com: move ARM64_HAS_CNP in the empty cpucaps.h slot]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 5ffdfaed 31-Jul-2018 Vladimir Murzin <vladimir.murzin@arm.com>

arm64: mm: Support Common Not Private translations

Common Not Private (CNP) is a feature of the ARMv8.2 extension which
allows translation table entries to be shared between different PEs in
the same inner shareable domain, so the hardware can use this fact to
optimise the caching of such entries in the TLB.

CNP occupies one bit in TTBRx_ELy and VTTBR_EL2, which advertises to
the hardware that the translation table entries pointed to by this
TTBR are the same as on every PE in the same inner shareable domain
for which the equivalent TTBR also has the CNP bit set. If the CNP bit
is set but the TTBR does not point at the same translation table entries
for a given ASID and VMID, the system is misconfigured and the results
of translations are UNPREDICTABLE.

For the kernel we postpone setting CNP until all CPUs are up and rely on
the cpufeature framework to 1) patch the code which is sensitive to CNP
and 2) update TTBR1_EL1 with the CNP bit set. TTBR1_EL1 can be
reprogrammed as a result of hibernation or cpuidle (via __enable_mmu),
so for these two cases we restore the CnP bit via __cpu_suspend_exit().

There are a few cases where we need to take care of changes to TTBR0_EL1:
- a switch to the idmap
- software-emulated PAN

We rule out the latter via Kconfig options, and for the former we make
sure that CNP is set for non-zero ASIDs only.
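
A minimal sketch of what "TTBR with the CNP bit set" amounts to, assuming
CnP is bit 0 of the TTBR as described above; the helper name and constants
are illustrative, not the kernel's code.

#include <stdint.h>

#define TTBR_CNP_BIT    (1ULL << 0)     /* CnP lives in bit 0 of TTBRx_ELy/VTTBR_EL2 */
#define TTBR_ASID_SHIFT 48              /* ASID field of TTBRx_EL1 */

/* Build a TTBR value that opts this PE into shared translation caching. */
static inline uint64_t ttbr_value_with_cnp(uint64_t table_phys, uint64_t asid)
{
        uint64_t ttbr = table_phys | (asid << TTBR_ASID_SHIFT);

        /* Only set CnP for non-zero ASIDs, as discussed for TTBR0_EL1. */
        return asid ? (ttbr | TTBR_CNP_BIT) : ttbr;
}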

Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
[catalin.marinas@arm.com: default y for CONFIG_ARM64_CNP]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# d71be2b6 15-Jun-2018 Will Deacon <will@kernel.org>

arm64: cpufeature: Detect SSBS and advertise to userspace

Armv8.5 introduces a new PSTATE bit known as Speculative Store Bypass
Safe (SSBS) which can be used as a mitigation against Spectre variant 4.

Additionally, a CPU may provide instructions to manipulate PSTATE.SSBS
directly, so that userspace can toggle the SSBS control without trapping
to the kernel.

This patch probes for the existence of SSBS and advertises the new
instructions to userspace if they exist.
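
For illustration, a sketch of the direct PSTATE.SSBS manipulation mentioned
above, using the MSR (immediate) form; it assumes a toolchain that accepts
the ssbs operand (e.g. an armv8-a+ssbs target) and is not the kernel's or
libc's code.

static inline void ssbs_set_safe(void)
{
        /* PSTATE.SSBS == 0: speculative store bypass is not permitted. */
        asm volatile("msr ssbs, #0" ::: "memory");
}

static inline void ssbs_set_fast(void)
{
        /* PSTATE.SSBS == 1: speculative store bypass may occur. */
        asm volatile("msr ssbs, #1" ::: "memory");
}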

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 86d0dd34 27-Aug-2018 Ard Biesheuvel <ardb@kernel.org>

arm64: cpufeature: add feature for CRC32 instructions

Add a CRC32 feature bit and wire it up to the CPU id register so we
will be able to use alternatives patching for CRC32 operations.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# e48d53a9 05-Apr-2018 Marc Zyngier <maz@kernel.org>

arm64: KVM: Add support for Stage-2 control of memory types and cacheability

Up to ARMv8.3, the combination of Stage-1 and Stage-2 attributes
results in the strongest attribute of the two stages. This means
that the hypervisor has to perform quite a lot of cache maintenance
just in case the guest has some non-cacheable mappings around.

ARMv8.4 solves this problem by offering a different mode (FWB) where
Stage-2 has total control over the memory attribute (this is limited
to systems where both I/O and instruction fetches are coherent with
the dcache). This is achieved by having a different set of memory
attributes in the page tables, and a new bit set in HCR_EL2.

On such a system, we can then safely sidestep any form of dcache
management.
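
A hedged sketch of the hypervisor-side enablement: per the Arm ARM, FWB is
bit 46 of HCR_EL2, so turning the mode on is a read-modify-write of that
register. This is only meaningful at EL2 and the helper below is
illustrative, not KVM's code.

#include <stdint.h>

#define HCR_EL2_FWB     (1ULL << 46)    /* Stage-2 controls the memory attributes */

static inline void stage2_enable_fwb(void)
{
        uint64_t hcr;

        asm volatile("mrs %0, hcr_el2" : "=r" (hcr));
        hcr |= HCR_EL2_FWB;
        asm volatile("msr hcr_el2, %0\n\tisb" :: "r" (hcr));
}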

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


# 314d53d2 04-Jul-2018 Suzuki K Poulose <suzuki.poulose@arm.com>

arm64: Handle mismatched cache type

Track mismatches in the cache type register (CTR_EL0), other
than the D/I min line sizes, and trap user accesses if there are any.

Fixes: be68a8aaf925 ("arm64: cpufeature: Fix CTR_EL0 field definitions")
Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# a725e3dd 29-May-2018 Marc Zyngier <maz@kernel.org>

arm64: Add ARCH_WORKAROUND_2 probing

As for Spectre variant-2, we rely on SMCCC 1.1 to provide the
discovery mechanism for detecting the SSBD mitigation.

A new capability and a config option are also allocated for that
purpose.

Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 4bc352ff 10-Apr-2018 Shanker Donthineni <shankerd@codeaurora.org>

arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening

The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of the SMC
v1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses
the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead
of the silicon provider service ID 0xC2001700.

Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
[maz: reworked errata framework integration]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# adc91ab7 28-Mar-2018 Marc Zyngier <maz@kernel.org>

Revert "arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening"

Creates far too many conflicts with arm64/for-next/core, to be
resent post -rc1.

This reverts commit f9f5dc19509bbef6f5e675346f1a7d7b846bdb12.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


# 05abb595 26-Mar-2018 Suzuki K Poulose <suzuki.poulose@arm.com>

arm64: Delay enabling hardware DBM feature

We enable the hardware DBM bit on capable CPUs very early in the
boot, via __cpu_setup. This doesn't give us the flexibility to
optionally disable the feature, as clearing the bit is costly
because the TLB can cache the settings. Instead, we delay enabling
the feature until the CPU is brought up into the kernel, and use
the feature capability mechanism to handle it.

Hardware DBM is a non-conflicting feature, i.e. the kernel can
safely run with a mix of CPUs, some using the feature and others
not. So it is safe for a late CPU to have this capability and
enable it, even if the active CPUs don't.

To get this handled properly by the infrastructure, we
unconditionally set the capability and only enable it
on CPUs which really have the feature. Also, we print the
feature detection from the "matches" callback to make sure
we don't mislead the user when none of the CPUs can use the
feature.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# f9f5dc19 05-Mar-2018 Shanker Donthineni <shankerd@codeaurora.org>

arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening

The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of the SMC
v1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses
the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead
of the silicon provider service ID 0xC2001700.

Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


# 71dcb8be 27-Feb-2018 Marc Zyngier <maz@kernel.org>

arm64: KVM: Allow far branches from vector slots to the main vectors

So far, the branch from the vector slots to the main vectors can at
most cover 4GB (the reach of ADRP), and this distance is known at
compile time. If we were to remap the slots to an unrelated VA, things
would break badly.

A way to achieve VA independence would be to load the absolute
address of the vectors (__kvm_hyp_vector), either using a constant
pool or a series of movs, followed by an indirect branch.

This patch implements the latter solution, using another instance
of a patching callback. Note that since we have to save a register
pair on the stack, we branch to the *second* instruction in the
vectors in order to compensate for it. This also results in having
to adjust this balance in the invalid vector entry point.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


# a1efdff4 03-Dec-2017 Marc Zyngier <maz@kernel.org>

arm64: cpufeatures: Drop the ARM64_HYP_OFFSET_LOW feature flag

Now that we can dynamically compute the kernel/hyp VA mask, there
is no need for a feature flag to trigger the alternative patching.
Let's drop the flag and everything that depends on it.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


# 6ae4b6e0 07-Mar-2018 Shanker Donthineni <shankerd@codeaurora.org>

arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC

The D-cache clean and I-cache invalidation requirements for instructions
to be data coherent are discoverable through new fields in CTR_EL0.
The two control bits DIC and IDC were defined for this purpose. There is
no need to perform point-of-unification cache maintenance operations
from software on systems where the CPU caches are transparent.

This patch optimizes the three functions __flush_cache_user_range(),
clean_dcache_area_pou() and invalidate_icache_range() when the hardware
reports CTR_EL0.IDC and/or CTR_EL0.DIC. Basically it skips the two
instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic,
in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for
instruction to data coherence. This is bit[29]:
0: Instruction cache invalidation to the point of unification
is required for instruction to data coherence.
1: Instruction cache invalidation to the point of unification is
not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data
coherence. This is bit[28]:
0: Data cache clean to the point of unification is required for
instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
1: Data cache clean to the point of unification is not required
for instruction to data coherence.
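
A sketch of how these bits can be consulted (illustrative only; the kernel
does this via cpucaps and alternatives rather than a runtime read, and
reading CTR_EL0 from EL0 assumes the kernel leaves that access untrapped).

#include <stdbool.h>
#include <stdint.h>

#define CTR_IDC_BIT     (1ULL << 28)    /* D-cache clean to PoU not needed */
#define CTR_DIC_BIT     (1ULL << 29)    /* I-cache invalidation to PoU not needed */

static inline uint64_t read_ctr_el0(void)
{
        uint64_t ctr;

        asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
        return ctr;
}

/* The DC CVAU loop can be skipped when IDC is set. */
static inline bool need_dc_cvau(void)
{
        return !(read_ctr_el0() & CTR_IDC_BIT);
}

/* The IC IVAU loop can be skipped when DIC is set. */
static inline bool need_ic_ivau(void)
{
        return !(read_ctr_el0() & CTR_DIC_BIT);
}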

Co-authored-by: Philip Elcan <pelcan@codeaurora.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# ca79acca 06-Mar-2018 Ard Biesheuvel <ardb@kernel.org>

arm64/kernel: enable A53 erratum #843419 handling at runtime

Omit patching of ADRP instructions at module load time if the current
CPUs are not susceptible to the erratum.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: Drop duplicate initialisation of .def_scope field]
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 64c02720 15-Jan-2018 Xie XiuQi <xiexiuqi@huawei.com>

arm64: cpufeature: Detect CPU RAS Extensions

ARM's v8.2 Extensions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions system software
can use additional barriers to isolate errors and determine if faults
are pending. Add cpufeature detection.

Platform level RAS support may require additional firmware support.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
[Rebased added config option, reworded commit message]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# ec82b567 05-Jan-2018 Shanker Donthineni <shankerd@codeaurora.org>

arm64: Implement branch predictor hardening for Falkor

Falkor is susceptible to branch predictor aliasing and can
theoretically be attacked by malicious code. This patch
implements a mitigation for these attacks, preventing any
malicious entries from affecting other victim contexts.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
[will: fix label name when !CONFIG_KVM and remove references to MIDR_FALKOR]
Signed-off-by: Will Deacon <will.deacon@arm.com>

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 0f15adbb 03-Jan-2018 Will Deacon <will@kernel.org>

arm64: Add skeleton to harden the branch predictor against aliasing attacks

Aliasing attacks against CPU branch predictors can allow an attacker to
redirect speculative control flow on some CPUs and potentially divulge
information from one context to another.

This patch adds initial skeleton code behind a new Kconfig option to
enable implementation-specific mitigations against these attacks for
CPUs that are affected.

Co-developed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# ea1e3de8 14-Nov-2017 Will Deacon <will@kernel.org>

arm64: entry: Add fake CPU feature for unmapping the kernel at EL0

Allow explicit disabling of the entry trampoline on the kernel command
line (kpti=off) by adding a fake CPU feature (ARM64_UNMAP_KERNEL_AT_EL0)
that can be used to toggle the alternative sequences in our entry code and
avoid use of the trampoline altogether if desired. This also allows us to
make use of a static key in arm64_kernel_unmapped_at_el0().

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 43994d82 31-Oct-2017 Dave Martin <Dave.Martin@arm.com>

arm64/sve: Detect SVE and activate runtime support

This patch enables detection of hardware SVE support via the
cpufeatures framework, and reports its presence to the kernel and
userspace via the new ARM64_SVE cpucap and HWCAP_SVE hwcap
respectively.

Userspace can also detect SVE using ID_AA64PFR0_EL1, using the
cpufeatures MRS emulation.

When running on hardware that supports SVE, this enables runtime
kernel support for SVE, and allows user tasks to execute SVE
instructions and make use of the SVE-specific user/kernel
interface extensions implemented by this series.
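
From the userspace side, the new hwcap can be queried via the auxiliary
vector; a small example follows. The fallback HWCAP_SVE value is an
assumption for the case where the system headers predate SVE.

#include <stdio.h>
#include <sys/auxv.h>

#ifndef HWCAP_SVE
#define HWCAP_SVE       (1 << 22)       /* assumed arm64 uapi value if the header lacks it */
#endif

int main(void)
{
        unsigned long hwcaps = getauxval(AT_HWCAP);

        printf("SVE %s\n", (hwcaps & HWCAP_SVE) ? "supported" : "not supported");
        return 0;
}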

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# d50e071f 25-Jul-2017 Robin Murphy <robin.murphy@arm.com>

arm64: Implement pmem API support

Add a clean-to-point-of-persistence cache maintenance helper, and wire
up the basic architectural support for the pmem driver based on it.
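
As a rough illustration of what clean-to-point-of-persistence means at the
instruction level, the sketch below issues DC CVAP over a buffer. It
assumes an ARMv8.2 DCPoP-capable CPU and assembler and a 64-byte cache
line; it is not the helper added by this patch, which derives the line
size from CTR_EL0.

#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE      64      /* assumed; real code reads CTR_EL0.DminLine */

static inline void clean_to_pop(const void *start, size_t len)
{
        uintptr_t addr = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
        uintptr_t end = (uintptr_t)start + len;

        for (; addr < end; addr += CACHE_LINE)
                asm volatile("dc cvap, %0" :: "r" (addr) : "memory");
        asm volatile("dsb sy" ::: "memory");
}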

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c]
[catalin.marinas@arm.com: change dmb(sy) to dmb(osh)]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 690a3415 08-Jun-2017 David Daney <david.daney@cavium.com>

arm64: Add workaround for Cavium Thunder erratum 30115

Some Cavium Thunder CPUs suffer a problem where a KVM guest may
inadvertently cause the host kernel to quit receiving interrupts.

Use Group-0/1 trapping to deal with it.

[maz]: Adapted patch to the Group-0/1 trapping, reworked commit log

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>


# eeb1efbc 20-Mar-2017 Marc Zyngier <maz@kernel.org>

arm64: cpu_errata: Add capability to advertise Cortex-A73 erratum 858921

In order to work around Cortex-A73 erratum 858921 in a subsequent
patch, add the required capability that advertises the erratum.

As the configuration option it depends on is not present yet,
this has no immediate effect.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


# 38fd94b0 08-Feb-2017 Christopher Covington <cov@codeaurora.org>

arm64: Work around Falkor erratum 1003

The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries
using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum
is triggered, page table entries using the new translation table base
address (BADDR) will be allocated into the TLB using the old ASID. All
circumstances leading to the incorrect ASID being cached in the TLB arise
when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory
operation is in the process of performing a translation using the specific
TTBRx_EL1 being written, and the memory operation uses a translation table
descriptor designated as non-global. EL2 and EL3 code changing the EL1&0
ASID is not subject to this erratum because hardware is prohibited from
performing translations from an out-of-context translation regime.

Consider the following pseudo code.

write new BADDR and ASID values to TTBRx_EL1

Replacing the above sequence with the one below will ensure that no TLB
entries with an incorrect ASID are used by software.

write reserved value to TTBRx_EL1[ASID]
ISB
write new value to TTBRx_EL1[BADDR]
ISB
write new value to TTBRx_EL1[ASID]
ISB

When the above sequence is used, page table entries using the new BADDR
value may still be incorrectly allocated into the TLB using the reserved
ASID. Yet this will not reduce functionality, since TLB entries incorrectly
tagged with the reserved ASID will never be hit by a later instruction.
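
To make the sequence above concrete, here is a hedged C sketch: it assumes
execution at EL1, a reserved ASID of 0, and a BADDR value with no ASID bits
set; the helper names are invented for this example and the kernel
implements the workaround in assembly.

#include <stdint.h>

#define TTBR_ASID_SHIFT 48
#define TTBR_ASID_MASK  (0xffffULL << TTBR_ASID_SHIFT)

static inline uint64_t read_ttbr0(void)
{
        uint64_t val;

        asm volatile("mrs %0, ttbr0_el1" : "=r" (val));
        return val;
}

static inline void write_ttbr0(uint64_t val)
{
        asm volatile("msr ttbr0_el1, %0\n\tisb" :: "r" (val) : "memory");
}

static void falkor_e1003_switch_ttbr0(uint64_t new_baddr, uint64_t new_asid)
{
        /* 1. Move to the reserved ASID (0) while keeping the old BADDR. */
        write_ttbr0(read_ttbr0() & ~TTBR_ASID_MASK);
        /* 2. Install the new BADDR under the reserved ASID. */
        write_ttbr0(new_baddr);
        /* 3. Finally install the new ASID together with the new BADDR. */
        write_ttbr0(new_baddr | (new_asid << TTBR_ASID_SHIFT));
}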

Based on work by Shanker Donthineni <shankerd@codeaurora.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# d9ff80f8 30-Jan-2017 Christopher Covington <cov@codeaurora.org>

arm64: Work around Falkor erratum 1009

During a TLB invalidate sequence targeting the inner shareable domain,
Falkor may prematurely complete the DSB before all loads and stores using
the old translation are observed. Instruction fetches are not subject to
the conditions of this erratum. If the original code sequence includes
multiple TLB invalidate instructions followed by a single DSB, only one of
the TLB instructions needs to be repeated to work around this erratum.
While the erratum only applies to cases in which the TLBI specifies the
inner-shareable domain (*IS form of TLBI) and the DSB is ISH form or
stronger (OSH, SYS), this change applies the workaround more broadly --
to local TLBI; DSB NSH sequences as well -- for simplicity.
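
A hedged sketch of the repeated-TLBI pattern for a full inner-shareable
invalidate, with one extra TLBI + DSB inserted as the workaround describes;
illustrative only, since the kernel applies this through its TLB flushing
macros and alternatives.

static inline void flush_tlb_all_repeat(void)
{
        asm volatile(
        "       dsb     ishst\n"
        "       tlbi    vmalle1is\n"
        "       dsb     ish\n"
        "       tlbi    vmalle1is\n"    /* repeated once for the erratum */
        "       dsb     ish\n"
        "       isb"
        ::: "memory");
}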

Based on work by Shanker Donthineni <shankerd@codeaurora.org>

Signed-off-by: Christopher Covington <cov@codeaurora.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 272d01bd 03-Nov-2016 Catalin Marinas <catalin.marinas@arm.com>

arm64: Fix circular include of asm/lse.h through linux/jump_label.h

Commit efd9e03facd0 ("arm64: Use static keys for CPU features")
introduced support for static keys in asm/cpufeature.h, including
linux/jump_label.h. When CC_HAVE_ASM_GOTO is not defined, this causes a
circular dependency via linux/atomic.h, asm/lse.h and asm/cpufeature.h.

This patch moves the capability macros out of asm/cpufeature.h into
a separate asm/cpucaps.h and modifies some of the #includes accordingly.

Fixes: efd9e03facd0 ("arm64: Use static keys for CPU features")
Reported-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>