History log of /linux-master/arch/arm64/kernel/cacheinfo.c
Revision Date Author Comments
# c931680c 12-Apr-2023 Radu Rendec <rrendec@redhat.com>

cacheinfo: Add arm64 early level initializer implementation

This patch adds an architecture specific early cache level detection
handler for arm64. This is basically the CLIDR_EL1 based detection that
was previously done (only) in init_cache_level().

This is part of a patch series that attempts to further the work in
commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
Previously, in the absence of any DT/ACPI cache info, architecture
specific cache detection and info allocation for secondary CPUs would
happen in non-preemptible context during early CPU initialization and
trigger a "BUG: sleeping function called from invalid context" splat on
an RT kernel.

This patch does not solve the problem completely for RT kernels. It
relies on the assumption that on most systems, the CPUs are symmetrical
and therefore have the same number of cache leaves. The cacheinfo memory
is allocated early (on the primary CPU), relying on the new handler. If
later (when CLIDR_EL1 based detection runs again on the secondary CPU)
the initial assumption proves to be wrong and the CPU has in fact more
leaves, the cacheinfo memory is reallocated, and that still triggers a
splat on an RT kernel.

In other words, asymmetrical CPU systems *must* still provide cacheinfo
data in DT/ACPI to avoid the splat on RT kernels (unless secondary CPUs
happen to have less leaves than the primary CPU). But symmetrical CPU
systems (the majority) can now get away without the additional DT/ACPI
data and rely on CLIDR_EL1 based detection.

Signed-off-by: Radu Rendec <rrendec@redhat.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230412185759.755408-3-rrendec@redhat.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>


# 805e6ec1 11-Jan-2023 Akihiko Odaki <akihiko.odaki@daynix.com>

arm64/cache: Move CLIDR macro definitions

The macros are useful for KVM which needs to manage how CLIDR is exposed
to vcpu so move them to include/asm/cache.h, which KVM can refer to.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20230112023852.42012-5-akihiko.odaki@daynix.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>


# 921e672d 24-Jan-2023 Pierre Gondois <pierre.gondois@arm.com>

cacheinfo: Remove unused check in init_cache_level()

commit e75d18cecbb3 ("arm64: cacheinfo: Fix incorrect assignment
of signed error value to unsigned fw_level")
checks the fw_level value in init_cache_level() in case the value is
negative.
Remove this check as the error code is not returned through
fw_level anymore, and reset fw_level if acpi_get_cache_info()
failed. This allows to try fetching the cache information from
clidr_el1.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230124154053.355376-4-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# d931b83e 24-Jan-2023 Pierre Gondois <pierre.gondois@arm.com>

cacheinfo: Make default acpi_get_cache_info() return an error

commit bd500361a937 ("ACPI: PPTT: Update acpi_find_last_cache_level()
to acpi_get_cache_info()")
updates the prototype of acpi_get_cache_info(). The cache 'levels'
is update through a pointer and not the return value of the function.

If CONFIG_ACPI_PPTT is not defined, acpi_get_cache_info() doesn't
update its *levels and *split_levels parameters and returns 0.
This can lead to a faulty behaviour.

Make acpi_get_cache_info() return an error code if CONFIG_ACPI_PPTT
is not defined.
Also,

In init_cache_level(), if no PPTT is present or CONFIG_ACPI_PPTT is
not defined, instead of aborting if acpi_get_cache_info() returns an
error code, just continue. This allows to try fetching the cache
information from clidr_el1.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230124154053.355376-3-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# bd500361 04-Jan-2023 Pierre Gondois <pierre.gondois@arm.com>

ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()

acpi_find_last_cache_level() allows to find the last level of cache
for a given CPU. The function is only called on arm64 ACPI based
platforms to check for cache information that would be missing in
the CLIDR_EL1 register.
To allow populating (struct cpu_cacheinfo).num_leaves by only parsing
a PPTT, update acpi_find_last_cache_level() to get the 'split_levels',
i.e. the number of cache levels being split in data/instruction
caches.

It is assumed that there will not be data/instruction caches above a
unified cache.
If a split level consist of one data cache and no instruction cache
(or opposite), then the missing cache will still be populated
by default with minimal cache information, and maximal cpumask
(all non-existing caches have the same fw_token).

Suggested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-6-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>


# e75d18ce 08-Aug-2022 Sudeep Holla <sudeep.holla@arm.com>

arm64: cacheinfo: Fix incorrect assignment of signed error value to unsigned fw_level

Though acpi_find_last_cache_level() always returned signed value and the
document states it will return any errors caused by lack of a PPTT table,
it never returned negative values before.

Commit 0c80f9e165f8 ("ACPI: PPTT: Leave the table mapped for the runtime usage")
however changed it by returning -ENOENT if no PPTT was found. The value
returned from acpi_find_last_cache_level() is then assigned to unsigned
fw_level.

It will result in the number of cache leaves calculated incorrectly as
a huge value which will then cause the following warning from __alloc_pages
as the order would be great than MAX_ORDER because of incorrect and huge
cache leaves value.

| WARNING: CPU: 0 PID: 1 at mm/page_alloc.c:5407 __alloc_pages+0x74/0x314
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-10393-g7c2a8d3ac4c0 #73
| pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : __alloc_pages+0x74/0x314
| lr : alloc_pages+0xe8/0x318
| Call trace:
| __alloc_pages+0x74/0x314
| alloc_pages+0xe8/0x318
| kmalloc_order_trace+0x68/0x1dc
| __kmalloc+0x240/0x338
| detect_cache_attributes+0xe0/0x56c
| update_siblings_masks+0x38/0x284
| store_cpu_topology+0x78/0x84
| smp_prepare_cpus+0x48/0x134
| kernel_init_freeable+0xc4/0x14c
| kernel_init+0x2c/0x1b4
| ret_from_fork+0x10/0x20

Fix the same by changing fw_level to be signed integer and return the
error from init_cache_level() early in case of error.

Reported-and-Tested-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220808084640.3165368-1-sudeep.holla@arm.com
Signed-off-by: Will Deacon <will@kernel.org>


# 4b92d4ad 31-Aug-2021 Thomas Gleixner <tglx@linutronix.de>

drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()

DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework
to ensure that the cache related functions are called on the upcoming CPU
because the notifier itself could run on any online CPU.

The hotplug state machine guarantees that the callbacks are invoked on the
upcoming CPU. So there is no need to have this SMP function call
obfuscation. That indirection was missed when the hotplug notifiers were
converted.

This also solves the problem of ARM64 init_cache_level() invoking ACPI
functions which take a semaphore in that context. That's invalid as SMP
function calls run with interrupts disabled. Running it just from the
callback in context of the CPU hotplug thread solves this.

Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx


# 8f5c9037 14-Jun-2019 Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>

arm64/mm: Correct the cache line size warning with non coherent device

If the cache line size is greater than ARCH_DMA_MINALIGN (128),
the warning shows and it's tainted as TAINT_CPU_OUT_OF_SPEC.

However, it's not good because as discussed in the thread [1], the cpu
cache line size will be problem only on non-coherent devices.

Since the coherent flag is already introduced to struct device,
show the warning only if the device is non-coherent device and
ARCH_DMA_MINALIGN is smaller than the cpu cache size.

[1] https://lore.kernel.org/linux-arm-kernel/20180514145703.celnlobzn3uh5tc2@localhost/

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Tested-by: Zhang Lei <zhang.lei@jp.fujitsu.com>
[catalin.marinas@arm.com: removed 'if' block for WARN_TAINT]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# c9af7f31 29-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 252

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed as is without any warranty of any kind whether express
or implied without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 2 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141332.617181045@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 7b8c87b2 27-May-2019 Shaokun Zhang <zhangshaokun@hisilicon.com>

arm64: cacheinfo: Update cache_line_size detected from DT or PPTT

cache_line_size is derived from CTR_EL0.CWG field and is called mostly
for I/O device drivers. For some platforms like the HiSilicon Kunpeng920
server SoC, cache line sizes are different between L1/2 cache and L3
cache while L1 cache line size is 64-byte and L3 is 128-byte, but
CTR_EL0.CWG is misreporting using L1 cache line size.

We shall correct the right value which is important for I/O performance.
Let's update the cache line size if it is detected from DT or PPTT
information.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Zhenfa Qiu <qiuzhenfa@hisilicon.com>
Reported-by: Zhenfa Qiu <qiuzhenfa@hisilicon.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 8571890e 11-May-2018 Jeremy Linton <jeremy.linton@arm.com>

arm64: Add support for ACPI based firmware tables

The /sys cache entries should support ACPI/PPTT generated cache
topology information. For arm64, if ACPI is enabled, determine
the max number of cache levels and populate them using the PPTT
table if one is available.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Vijaya Kumar K <vkilari@codeaurora.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a8d4636f 10-Mar-2017 Will Deacon <will@kernel.org>

arm64: cacheinfo: Remove CCSIDR-based cache information probing

The CCSIDR_EL1.{NumSets,Associativity,LineSize} fields are only for use
in conjunction with set/way cache maintenance and are not guaranteed to
represent the actual microarchitectural features of a design.

The architecture explicitly states:

| You cannot make any inference about the actual sizes of caches based
| on these parameters.

Furthermore, CCSIDR_EL1.{WT,WB,RA,WA} have been removed retrospectively
from ARMv8 and are now considered to be UNKNOWN.

Since the kernel doesn't make use of set/way cache maintenance and it is
not possible for userspace to execute these instructions, we have no
need for the CCSIDR information in the kernel.

This patch removes the accessors, along with the related portions of the
cacheinfo support, which should instead be reintroduced when firmware has
a mechanism to provide us with reliable information.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 9a802431 16-Jan-2017 Sudeep Holla <sudeep.holla@arm.com>

arm64: cacheinfo: add support to override cache levels via device tree

The cache hierarchy can be identified through Cache Level ID(CLIDR)
architected system register. However in some cases it will provide
only the number of cache levels that are integrated into the processor
itself. In other words, it can't provide any information about the
caches that are external and/or transparent.

Some platforms require to export the information about all such external
caches to the userspace applications via the sysfs interface.

This patch adds support to override the cache levels using device tree
to take such external non-architected caches into account.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Tested-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# adf75899 08-Sep-2016 Mark Rutland <mark.rutland@arm.com>

arm64: simplify sysreg manipulation

A while back we added {read,write}_sysreg accessors to handle accesses
to system registers, without the usual boilerplate asm volatile,
temporary variable, etc.

This patch makes use of these across arm64 to make code shorter and
clearer. For sequences with a trailing ISB, the existing isb() macro is
also used so that asm blocks can be removed entirely.

A few uses of inline assembly for msr/mrs are left as-is. Those
manipulating sp_el0 for the current thread_info value have special
clobber requiremends.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 5d425c18 08-Jan-2015 Sudeep Holla <sudeep.holla@arm.com>

arm64: kernel: add support for cpu cache information

This patch adds support for cacheinfo on ARM64.

On ARMv8, the cache hierarchy can be identified through Cache Level ID
(CLIDR) register while the cache geometry is provided by Cache Size ID
(CCSIDR) register.

Since the architecture doesn't provide any way of detecting the cpus
sharing particular cache, device tree is used for the same purpose.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>