History log of /freebsd-current/sys/arm64/arm64/cpufunc_asm.S
Revision Date Author Comments
# 1e3f42b6 15-Mar-2024 John Baldwin <jhb@FreeBSD.org>

arm64: Switch the address argument to cpu_*cache* to a pointer

No functional change, but this reduces diffs with CheriBSD downstream.

Reviewed by: andrew
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D44342


# 7eb26be9 11-Nov-2023 Andrew Turner <andrew@FreeBSD.org>

arm64: Use adrp + :lo12: to load globals from asm

When loading a global variable we can use a pseudo-instruction similar
to "ldr, xn, =global" to load the address of the symbol. As this is
unlikely to be supported by a mov instruction a pc-relative load is
used, with the absolute address written at the end of the function so
it will be loaded.

This load can be partially replaced with an adrp instruction. This
generates the address, aligned to a 4k boundary, using a pc-relative
addition. Because the address is 4k-aligned we then update reading the
global variable using a load with the offset of the load the low
12-bits of the global. Arm64 assemblers have :lo12: to support this,
e.g. "ldr xn, [xn, :lo12:global]".

The only remaining users of "ldr, xn, =global" that I can find are
executed from the physical address space the kernel was loaded in and
need an address in the kernels virtual address space. Because of this
they can't use adrp.

Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42565


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 7a060a88 22-Aug-2022 Andrew Turner <andrew@FreeBSD.org>

Add an IDC only arm64 icache sync function

When the IDC flag is set in the cache type register we don't need to
clean the data cache to the point of unification. Previously we
supported this flag being set only when the DIC flags was also set.
Add a new handler for when this is not the case.

Reviewed by: kib
Sponsored by: The FreeBSD Foundation, Ampere (hardware)
Differential Revision: https://reviews.freebsd.org/D36296


# ad020198 19-May-2020 Andrew Turner <andrew@FreeBSD.org>

Stop performing a full icache sync when the DIC and IDC flags are set

The DIC and IDC bits in the CTR_EL0 register signal to the kernel when it
can relax the instruction cache synchronisation operations. The IDC bit
means we can relax cleaning the data cache to the point of unification
while the DIC bit means we don't need to invalidate the instruction cache
for data coherence. In both cases an appropriate barrier is still needed.

For now only implement the case where both bits are set, as is the case
on the Neoverse-N1 as used in the Amazon AWS Graviton 2 CPU. Note that
this behaviour is a optional on the N1 so we may later need to implement
only one or the other bit being set.

There is a tunable to disable each flag on boot.

Testing on a 4 core Graviton 2 instance found a significant improvement
in sys and real time when running "make buildkernel -j4", with no
significant difference in user time.

Reviewed by: markj
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D24853


# fd1f4df2 15-May-2020 Andrew Turner <andrew@FreeBSD.org>

Remove arm64_idcache_wbinv_range as it's unused.

Sponsored by: Innovate UK


# 50e3ab6b 03-Nov-2019 Alan Cox <alc@FreeBSD.org>

Utilize ASIDs to reduce both the direct and indirect costs of context
switching. The indirect costs being unnecessary TLB misses that are
incurred when ASIDs are not used. In fact, currently, when we perform a
context switch on one processor, we issue a broadcast TLB invalidation that
flushes the TLB contents on every processor.

Mark all user-space ("ttbr0") page table entries with the non-global flag so
that they are cached in the TLB under their ASID.

Correct an error in pmap_pinit0(). The pointer to the root of the page
table was being initialized to the root of the kernel-space page table
rather than a user-space page table. However, the root of the page table
that was being cached in process 0's md_l0addr field correctly pointed to a
user-space page table. As long as ASIDs weren't being used, this was
harmless, except that it led to some unnecessary page table switches in
pmap_switch(). Specifically, other kernel processes besides process 0 would
have their md_l0addr field set to the root of the kernel-space page table,
and so pmap_switch() would actually change page tables when switching
between process 0 and other kernel processes.

Implement a workaround for Cavium erratum 27456 affecting ThunderX machines.
(I would like to thank andrew@ for providing the code to detect the affected
machines.)

Address integer overflow in the definition of TCR_ASID_16.

Setup TCR according to the PARange and ASIDBits fields from
ID_AA64MMFR0_EL1. Previously, TCR_ASID_16 was unconditionally set.

Modify build_l1_block_pagetable so that lower attributes, such as ATTR_nG,
can be specified as a parameter.

Eliminate some unused code.

Earlier versions were tested to varying degrees by: andrew, emaste, markj

MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D21922


# d8d0bf06 05-Jul-2019 Alan Cox <alc@FreeBSD.org>

Restructure cache_handle_range to avoid repeated barriers. Specifically,
restructure cache_handle_range so that all of the data cache operations are
performed before any instruction cache operations. Then, we only need one
barrier between the data and instruction cache operations and one barrier
after the instruction cache operations.

On an Amazon EC2 a1.2xlarge instance, this simple change reduces the time
for a "make -j8 buildworld" by 9%.

Reviewed by: andrew
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20848


# 8308d2a2 07-Feb-2019 Andrew Turner <andrew@FreeBSD.org>

Add a missing data barrier to the start of arm64_tlb_flushID.

We need to ensure the page table store has happened before the tlbi.

Reported by: jchandra
Tested by: jchandra
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D19097


# cd0c606f 15-Jan-2019 Andrew Turner <andrew@FreeBSD.org>

Ensure the I-Cache is correctly handled in arm64_icache_sync_range

The cache_handle_range macro to handle the arm64 instruction and data
cache operations would return when it was complete. This causes problems
for arm64_icache_sync_range and arm64_icache_sync_range_checked as they
assume they can execute the i-cache handling instruction after it has been
called.

Fix this by making this assumption correct.

While here add missing instruction barriers and adjust the style to
match the rest of the assembly.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18838


# e8d5909c 13-Jan-2019 Olivier Houchard <cognet@FreeBSD.org>

Don't forget to add the needed #includes.

Pointy hat to: cognet


# 9cd27257 13-Jan-2019 Olivier Houchard <cognet@FreeBSD.org>

Introduce cpu_icache_sync_range_checked(), that does the same thing as
cpu_icache_sync_range(), except that it sets pcb_onfault to catch any page
fault, as doing cache maintenance operations for non-mapped generates a
data abort, and use it in freebsd32_sysarch(), so that a userland program
attempting to sync the icache with unmapped addresses doesn't crash the
kernel.

Spotted out by: andrew


# 89b090f1 28-Jan-2018 Michal Meloun <mmel@FreeBSD.org>

Fix handling of I-cache sync operations

- pmap_enter_object() can be used for mapping of executable pages, so it's
necessary to handle I-cache synchronization within it.

- Fix race in I-cache synchronization in pmap_enter(). The current code firstly
maps given page to target VA and then do I-cache sync on it. This causes
race, because this mapping become visible to other threads, before I-cache
is synced.
Do sync I-cache firstly (by using DMAP VA) and then map it to target VA.

- ARM64 ARM permits implementation of aliased (AIVIVT, VIPT) I-cache, but we
can use different that final VA for flushing it. So we should use full
I-cache flush on affected platforms. For now, and as temporary solution,
use full flush always.


# 2c404506 06-Feb-2017 Andrew Turner <andrew@FreeBSD.org>

Remove arm64_tlb_flushID_SE, it's unused and may be wrong.

Sponsored by: ABT Systems Ltd


# 77c02ecc 07-Sep-2016 Andrew Turner <andrew@FreeBSD.org>

When synchronising the instruction and data caches we only need to clean
the data cache to the point of unification. This is the point where the
two caches are unified to a single unified cache so cleaning past here
is just extra unneeded work.

This was noticed when investigating r305545.

Reported by: bz
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# b8bbefed 17-Jul-2015 Zbigniew Bodek <zbb@FreeBSD.org>

Fix possible coherency issues between PEs related to I-cache

Basing on B.2.3.4:
Synchronization and coherency issues between data and
instruction accesses.

To ensure that modified instructions are visible to all PEs
(Processing Elements) in a shareability domain one need to
perform following sequence:
1. Clean D-cache
2. Ensure the visibility of data cleaned from cache
3. Invalidate I-cache
4. Ensure completion
5. In SMP system PE must issue isb to ensure execution of the
modified instructions

Reviewed by: andrew
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3106


# e5acd89c 13-Apr-2015 Andrew Turner <andrew@FreeBSD.org>

Bring in the start of the arm64 kernel.

This is only the minimum set of files needed to boot in qemu. As such it is
missing a few things.

The bus_dma code is currently only stub functions with a full implementation
from the development tree to follow.

The gic driver has been copied as the interrupt framework is different. It
is expected the two drivers will be merged by the arm intrng project,
however this will need to be imported into the tree and support for arm64
would need to be added.

This includes code developed by myself, SemiHalf, Ed Maste, and Robin
Randhawa from ARM. This has been funded by the FreeBSD Foundation, with
early development by myself in my spare time with assistance from Robin.

Differential Revision: https://reviews.freebsd.org/D2199
Reviewed by: emaste, imp
Relnotes: yes
Sponsored by: The FreeBSD Foundation