History log of /netbsd-current/sys/arch/aarch64/aarch64/cpu.c
Revision Date Author Comments
# 1.76 09-May-2024 pho

kern/58195: arm: Support drvctl -d and -r for cpufeaturebus

This is required for detaching and re-attaching the vmt(4) driver on aarch64.


# 1.75 09-May-2024 pho

port-arm/58194: Resurrect vmt(4) from bitrot

On this architecture vmt(4) used to search for a node "/hypervisor" in the
FDT and probed the VMware hypervisor call only when the node was
found. However, things appear to have changed and VMware no longer provides
the FDT node.

Since vmt(4) doesn't actually need to read anything from FDT, and the
hypervisor call logically resides in virtual CPUs themselves, it would be
better to attach it directly to cpu, just like how it's probed on x86.


# 1.74 07-Feb-2024 msaitoh

Remove ryo@'s mail addresses.


Revision tags: thorpej-ifq-base thorpej-altq-separation-base
# 1.73 03-Feb-2023 skrll

Remove useless/harmful casts in debug messages. MPIDR AFF3 would not
be printed before.


# 1.72 22-Dec-2022 ryo

PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0


# 1.71 22-Dec-2022 ryo

Explicitly disable overflow interrupts before enabling the cycle counter.


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.70 29-May-2022 ryo

branches: 1.70.4;
fix build without options DDB


# 1.69 03-Mar-2022 riastradh

arm: Use device_set_private for cpuN.

For cpu at fdt, nix the fdt softc -- this was leaked and never used
for anything. The device's private storage is the cpu_info.


# 1.68 12-Nov-2021 skrll

Print a big warning about trying to run on early ThunderX parts


# 1.67 31-Oct-2021 skrll

Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3


# 1.66 30-Oct-2021 skrll

G/C MD_CPU_HATCH. It's old evbarm (<= armv7)


# 1.65 30-Oct-2021 skrll

style. NFCI.


# 1.64 17-Oct-2021 skrll

Remove some newlines


# 1.63 10-Oct-2021 skrll

Need to call pmap_tlb_info_attach for each CPU. Missed in previous
commit.


# 1.62 04-Oct-2021 skrll

Add a KASSERT


# 1.61 30-Aug-2021 jmcneill

Identify Apple M1 "Icestorm" and "Firestorm" CPU types.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.60 19-Jun-2021 jmcneill

Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.59 09-Mar-2021 ryo

branches: 1.59.4;
Add support for hardware breakpoints and watchpoints again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().


# 1.58 11-Jan-2021 skrll

Improve a comment


# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


# 1.56 10-Oct-2020 jmcneill

branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
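A minimal sketch of how a system-wide CTR_EL0 value could be derived from two per-CPU values, taking the smaller line-size fields. The helper names (ctr_merge, ctr_dline_bytes) are hypothetical and not taken from cpu.c, and AND-ing the IDC/DIC flags is an assumption about how differing flags would be reconciled; only the field positions come from the architecture.

```c
#include <stdint.h>

/* CTR_EL0 field positions (Arm ARM, "CTR_EL0, Cache Type Register"). */
#define CTR_IMINLINE_MASK	UINT64_C(0x000000000000000f)	/* bits [3:0] */
#define CTR_DMINLINE_MASK	UINT64_C(0x00000000000f0000)	/* bits [19:16] */
#define CTR_IDC			(UINT64_C(1) << 28)
#define CTR_DIC			(UINT64_C(1) << 29)

/* Smallest data cache line in bytes: 4-byte words << DminLine. */
static unsigned
ctr_dline_bytes(uint64_t ctr)
{
	return 4u << ((ctr & CTR_DMINLINE_MASK) >> 16);
}

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: combine two CPUs' CTR_EL0 values into one that is
 * safe to report for both.  Line-size fields take the minimum; IDC/DIC
 * are kept only if both CPUs set them (assumption).  All other fields
 * are copied from 'a'.
 */
static uint64_t
ctr_merge(uint64_t a, uint64_t b)
{
	uint64_t imin = min_u64(a & CTR_IMINLINE_MASK, b & CTR_IMINLINE_MASK);
	uint64_t dmin = min_u64(a & CTR_DMINLINE_MASK, b & CTR_DMINLINE_MASK);
	uint64_t flags = a & b & (CTR_IDC | CTR_DIC);
	uint64_t rest = a & ~(CTR_IMINLINE_MASK | CTR_DMINLINE_MASK |
	    CTR_IDC | CTR_DIC);

	return rest | imin | dmin | flags;
}
```

Folding ctr_merge over every attached CPU would give the value a trap handler for "mrs Xt,ctr_el0" could hand back to userland.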


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are both 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
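The rules above can be sketched as a small decision function. This is an illustration only: the struct and function names are hypothetical and the real code selects among cpufunc routines rather than returning a plan. The bit positions are those given in the Arm ARM for CTR_EL0 and CLIDR_EL1.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTR_EL0_IDC	(UINT64_C(1) << 28)
#define CTR_EL0_DIC	(UINT64_C(1) << 29)

#define CLIDR_LOUIS(x)	(((x) >> 21) & 7)	/* bits [23:21] */
#define CLIDR_LOC(x)	(((x) >> 24) & 7)	/* bits [26:24] */
#define CLIDR_LOUU(x)	(((x) >> 27) & 7)	/* bits [29:27] */

/* Hypothetical result type: which maintenance steps an Icache sync needs. */
struct icache_sync_plan {
	bool dcache_clean;	/* clean Dcache to the point of unification */
	bool icache_inval;	/* invalidate Icache */
};

static struct icache_sync_plan
icache_sync_plan(uint64_t ctr, uint64_t clidr)
{
	struct icache_sync_plan p = { true, true };

	/* DIC=1: Icache invalidation is not required. */
	if (ctr & CTR_EL0_DIC)
		p.icache_inval = false;

	/* IDC=1: Dcache clean before Icache invalidation is not required. */
	if (ctr & CTR_EL0_IDC)
		p.dcache_clean = false;

	/* LoC == 0, or LoUIS == 0 and LoUU == 0: no Dcache clean either. */
	if (CLIDR_LOC(clidr) == 0 ||
	    (CLIDR_LOUIS(clidr) == 0 && CLIDR_LOUU(clidr) == 0))
		p.dcache_clean = false;

	return p;
}
```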


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
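A minimal sketch of the "track the fastest set" idea described above, assuming the "capacity-dmips-mhz" values have already been read from the FDT. The struct and function names are hypothetical; the commit does this incrementally while attaching CPUs and reports the result through cpu_topology_set(), whereas this sketch uses two passes over an array for clarity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; not the kernel's struct cpu_info. */
struct cpu_cap {
	uint32_t capacity_dmips_mhz;	/* value of the FDT property */
	bool is_slow;			/* true for any CPU below the fastest */
};

static void
mark_slow_cpus(struct cpu_cap *cpus, size_t n)
{
	uint32_t best = 0;
	size_t i;

	/* Pass 1: find the highest capacity present. */
	for (i = 0; i < n; i++) {
		if (cpus[i].capacity_dmips_mhz > best)
			best = cpus[i].capacity_dmips_mhz;
	}

	/* Pass 2: every CPU that is not in the fastest set is slow. */
	for (i = 0; i < n; i++)
		cpus[i].is_slow = cpus[i].capacity_dmips_mhz < best;
}
```

If every CPU reports the same capacity, none is marked slow, which matches a non-big.LITTLE system.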


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels
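A sketch of mask-based field extraction in the style of __SHIFTOUT, applied to the MPIDR_EL1 affinity fields. The macros here are local stand-ins written to behave like the NetBSD <sys/cdefs.h> ones so the example builds anywhere; the struct and function names are hypothetical. Note that Aff3 sits in bits [39:32], so the register value has to be kept as a 64-bit quantity.

```c
#include <stdint.h>

/* Local stand-ins modelled on __LOWEST_SET_BIT / __SHIFTOUT. */
#define LOWEST_SET_BIT(mask)	((((mask) - 1) & (mask)) ^ (mask))
#define SHIFTOUT(x, mask)	(((x) & (mask)) / LOWEST_SET_BIT(mask))

/* MPIDR_EL1 affinity fields (Arm ARM). */
#define MPIDR_AFF0	UINT64_C(0x00000000000000ff)	/* bits [7:0] */
#define MPIDR_AFF1	UINT64_C(0x000000000000ff00)	/* bits [15:8] */
#define MPIDR_AFF2	UINT64_C(0x0000000000ff0000)	/* bits [23:16] */
#define MPIDR_AFF3	UINT64_C(0x000000ff00000000)	/* bits [39:32] */

/* Hypothetical container for the decoded affinity levels. */
struct mpidr_aff {
	uint8_t aff0, aff1, aff2, aff3;
};

static struct mpidr_aff
mpidr_decode(uint64_t mpidr)
{
	struct mpidr_aff a;

	/* Dividing by the mask's lowest set bit shifts the field down. */
	a.aff0 = (uint8_t)SHIFTOUT(mpidr, MPIDR_AFF0);
	a.aff1 = (uint8_t)SHIFTOUT(mpidr, MPIDR_AFF1);
	a.aff2 = (uint8_t)SHIFTOUT(mpidr, MPIDR_AFF2);
	a.aff3 = (uint8_t)SHIFTOUT(mpidr, MPIDR_AFF3);
	return a;
}
```

Expressing each field as a single mask avoids keeping a separate shift count in sync with it.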


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.73 03-Feb-2023 skrll

Remove useless/harmful casts in debug messages. MPIDR AFF3 would not
be printed before.


# 1.72 22-Dec-2022 ryo

PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0


# 1.71 22-Dec-2022 ryo

Explicitly disable overflow interrupts before enabling the cycle counter.


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.70 29-May-2022 ryo

branches: 1.70.4;
fix build without options DDB


# 1.69 03-Mar-2022 riastradh

arm: Use device_set_private for cpuN.

For cpu at fdt, nix the fdt softc -- this was leaked and never used
for anything. The device's private storage is the cpu_info.


# 1.68 12-Nov-2021 skrll

Print a big warning about trying to run on early ThunderX parts


# 1.67 31-Oct-2021 skrll

Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3


# 1.66 30-Oct-2021 skrll

G/C MD_CPU_HATCH. It's old evbarm (<= armv7)


# 1.65 30-Oct-2021 skrll

style. NFCI.


# 1.64 17-Oct-2021 skrll

Remove some newlines


# 1.63 10-Oct-2021 skrll

Need to call pmap_tlb_info_attach for each CPU. Missed in previous
commit.


# 1.62 04-Oct-2021 skrll

Add a KASSERT


# 1.61 30-Aug-2021 jmcneill

Identify Apple M1 "Icestorm" and "Firestorm" CPU types.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.60 19-Jun-2021 jmcneill

Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.59 09-Mar-2021 ryo

branches: 1.59.4;
Add support for hardware breakpoints and watchpoints again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().


# 1.58 11-Jan-2021 skrll

Improve a comment


# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


# 1.56 10-Oct-2020 jmcneill

branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary can work sanely on both a
real arm32 and an aarch32 system.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support for fdt enable-method "spin-table"
* add support for fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before pmap_bootstrap is called

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add fdt support. evbarm/conf/GENERIC64 is an fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.
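
As a rough sketch of the idea, the following marks every CPU whose "capacity-dmips-mhz" value is below the maximum as slow. The struct and function names are hypothetical, and the two-pass approach is a simplification: the commit describes tracking the fastest set while CPUs attach, which this example does not model.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; capacity comes from "capacity-dmips-mhz". */
struct cpu_cap {
	uint32_t capacity_dmips_mhz;
	bool     is_slow;
};

/* Flag every CPU that is not in the fastest set. */
static void
mark_slow_cpus(struct cpu_cap *cpus, size_t ncpu)
{
	uint32_t best = 0;

	/* Pass 1: find the highest advertised capacity. */
	for (size_t i = 0; i < ncpu; i++) {
		if (cpus[i].capacity_dmips_mhz > best)
			best = cpus[i].capacity_dmips_mhz;
	}
	/* Pass 2: anything below that capacity counts as slow. */
	for (size_t i = 0; i < ncpu; i++)
		cpus[i].is_slow = cpus[i].capacity_dmips_mhz < best;
}
```

With illustrative capacities of 485 for little cores and 1024 for big cores, only the 485 entries end up with is_slow set, which is the information the scheduler needs to prefer the faster cores.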

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)
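
For illustration, MIDR field extraction can look like the sketch below. The field positions follow the architectural MIDR_EL1 layout (Implementer [31:24], Variant [23:20], Architecture [19:16], PartNum [15:4], Revision [3:0]); the macro, struct, and function names are hypothetical and are not the identifiers added by this commit.

```c
#include <stdint.h>

/* Hypothetical extractor macros for the MIDR_EL1 fields. */
#define MIDR_IMPLEMENTER(m) ((unsigned)(((m) >> 24) & 0xff))
#define MIDR_VARIANT(m)     ((unsigned)(((m) >> 20) & 0xf))
#define MIDR_ARCH(m)        ((unsigned)(((m) >> 16) & 0xf))
#define MIDR_PARTNUM(m)     ((unsigned)(((m) >> 4) & 0xfff))
#define MIDR_REVISION(m)    ((unsigned)((m) & 0xf))

/* Decoded view of one CPU's MIDR value. */
struct midr_fields {
	unsigned implementer;
	unsigned variant;
	unsigned arch;
	unsigned partnum;
	unsigned revision;
};

/* Split a raw MIDR_EL1 value into its fields. */
static struct midr_fields
midr_decode(uint64_t midr)
{
	struct midr_fields f;

	f.implementer = MIDR_IMPLEMENTER(midr);
	f.variant = MIDR_VARIANT(midr);
	f.arch = MIDR_ARCH(midr);
	f.partnum = MIDR_PARTNUM(midr);
	f.revision = MIDR_REVISION(midr);
	return f;
}
```

A CPU identification table can then match on the implementer and part number while reporting variant and revision separately.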

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support for fdt enable-method "spin-table"
* add support for fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before pmap_bootstrap is called

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels
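
The sketch below shows the mask-based extraction style that __SHIFTOUT enables. To stay self-contained it defines local stand-ins, SHIFTOUT and LOWEST_SET_BIT, modelled on the NetBSD <sys/cdefs.h> macros rather than including the kernel header. The MPIDR_AFF* mask names and the mpidr_affinity() helper are hypothetical; the field positions are the architectural ones (Aff0 [7:0], Aff1 [15:8], Aff2 [23:16], Aff3 [39:32]).

```c
#include <stdint.h>

/* Local stand-ins modelled on the <sys/cdefs.h> helpers. */
#define LOWEST_SET_BIT(mask) ((((mask) - 1) & (mask)) ^ (mask))
#define SHIFTOUT(x, mask)    (((x) & (mask)) / LOWEST_SET_BIT(mask))

/* MPIDR_EL1 affinity fields expressed as masks, not shift counts. */
#define MPIDR_AFF0 UINT64_C(0x00000000ff)
#define MPIDR_AFF1 UINT64_C(0x000000ff00)
#define MPIDR_AFF2 UINT64_C(0x0000ff0000)
#define MPIDR_AFF3 UINT64_C(0xff00000000)

/* Hypothetical helper: return affinity level 0..3, or 0 for other levels. */
static unsigned
mpidr_affinity(uint64_t mpidr, unsigned level)
{
	switch (level) {
	case 0:
		return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF0);
	case 1:
		return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF1);
	case 2:
		return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF2);
	case 3:
		return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF3);
	default:
		return 0;
	}
}
```

Because the mask alone describes the field, there is no separate shift constant to keep in sync, and the 64-bit mask for Aff3 keeps the upper bits from being truncated.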


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add fdt support. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.68 12-Nov-2021 skrll

Print a big warning about trying to run on early ThunderX parts


# 1.67 31-Oct-2021 skrll

Rework Arm (32bit and 64bit) AP startup so that cpu_hatch doesn't sleep.

The AP initialisation code in cpu_init_secondary_processor will read and
initialise the required system registers and state for the BP to attach
and report.

Rework the interrupt handler code for this new sequence. Thankfully,
this removes a bunch of code for bcm2836mp.

The VFP detection handler on <= armv7 relies on the global undefined
handler being in place until the BP attaches vfp. That is, after the
APs have been spun up.

gicv3_its.c has a serialisation issue which is protected against in
the gicv3_its_cpu_init, which is called from cpu_hatch, with a spin
lock. The serialisation issue needs addressing more completely.

Tested on RPI3, Apple M1, QEMU, and lx2k

Fixes PR port-arm/56264:
diagnostic assertion "l->l_stat == LSONPROC" failed on RPI3


# 1.66 30-Oct-2021 skrll

G/C MD_CPU_HATCH. It's old evbarm (<= armv7)


# 1.65 30-Oct-2021 skrll

style. NFCI.


# 1.64 17-Oct-2021 skrll

Remove some newlines


# 1.63 10-Oct-2021 skrll

Need to call pmap_tlb_info_attach for each CPU. Missed in previous
commit.
CVS ----------------------------------------------------------------------


# 1.62 04-Oct-2021 skrll

Add a KASSERT


# 1.61 30-Aug-2021 jmcneill

Identify Apple M1 "Icestorm" and "Firestorm" CPU types.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.60 19-Jun-2021 jmcneill

Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.59 09-Mar-2021 ryo

branches: 1.59.4;
Add support hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().


# 1.58 11-Jan-2021 skrll

Improve a comment


# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


# 1.56 10-Oct-2020 jmcneill

branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required as well.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.
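
A minimal sketch of the idea described above, not the committed code: the commit tracks the fastest set while CPUs attach, whereas this illustration makes two passes over an array. The names struct ex_cpu and ex_mark_slow_cpus() are hypothetical, and the capacity field stands in for the value read from the FDT "capacity-dmips-mhz" property.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; not the kernel's struct cpu_info. */
struct ex_cpu {
	uint32_t capacity;	/* "capacity-dmips-mhz" value, 0 if absent */
	bool is_slow;		/* set for CPUs outside the fastest set */
};

/*
 * Find the highest advertised capacity, then flag every CPU below it
 * as slow.  If no CPU advertises a capacity, nothing is flagged.
 */
void
ex_mark_slow_cpus(struct ex_cpu *cpus, size_t ncpu)
{
	uint32_t best = 0;
	size_t i;

	for (i = 0; i < ncpu; i++) {
		if (cpus[i].capacity > best)
			best = cpus[i].capacity;
	}
	for (i = 0; i < ncpu; i++)
		cpus[i].is_slow = cpus[i].capacity < best;
}
```

With four little cores and two big cores (the capacity numbers used in the test are illustrative only), the four little cores end up with is_slow set and the two big cores do not.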

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels
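
For illustration only, the kind of field extraction this refers to might look like the sketch below. The EX_* macros are hypothetical stand-ins written for this example, not NetBSD's __BITS()/__SHIFTOUT() definitions, and ex_mpidr_decode() is not a function from cpu.c. The field positions (Aff0 in bits 7:0, Aff1 in 15:8, Aff2 in 23:16, Aff3 in 39:32) follow the MPIDR_EL1 layout.

```c
#include <stdint.h>

/* Hypothetical helpers: a mask of bits hi..lo, and a mask-driven extract. */
#define EX_BITS(hi, lo) \
	((UINT64_MAX >> (63 - (hi))) & (UINT64_MAX << (lo)))
#define EX_LOWEST_SET_BIT(mask)	((mask) & (~(mask) + 1))
#define EX_SHIFTOUT(x, mask)	(((x) & (mask)) / EX_LOWEST_SET_BIT(mask))

/* MPIDR_EL1 affinity fields. */
#define EX_MPIDR_AFF0	EX_BITS(7, 0)
#define EX_MPIDR_AFF1	EX_BITS(15, 8)
#define EX_MPIDR_AFF2	EX_BITS(23, 16)
#define EX_MPIDR_AFF3	EX_BITS(39, 32)

struct ex_affinity {
	unsigned aff0, aff1, aff2, aff3;
};

/* Split a raw MPIDR_EL1 value into its four affinity levels. */
struct ex_affinity
ex_mpidr_decode(uint64_t mpidr)
{
	struct ex_affinity a;

	a.aff0 = (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF0);
	a.aff1 = (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF1);
	a.aff2 = (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF2);
	a.aff3 = (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF3);
	return a;
}
```

Deriving the shift from the mask keeps each field definition in one place, so a mask and its shift count cannot drift apart.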


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.66 30-Oct-2021 skrll

G/C MD_CPU_HATCH. It's old evbarm (<= armv7)


# 1.65 30-Oct-2021 skrll

style. NFCI.


# 1.64 17-Oct-2021 skrll

Remove some newlines


# 1.63 10-Oct-2021 skrll

Need to call pmap_tlb_info_attach for each CPU. Missed in previous
commit.
CVS ----------------------------------------------------------------------


# 1.62 04-Oct-2021 skrll

Add a KASSERT


# 1.61 30-Aug-2021 jmcneill

Identify Apple M1 "Icestorm" and "Firestorm" CPU types.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.60 19-Jun-2021 jmcneill

Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.59 09-Mar-2021 ryo

branches: 1.59.4;
Add support hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().


# 1.58 11-Jan-2021 skrll

Improve a comment


# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


# 1.56 10-Oct-2020 jmcneill

branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required as well.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "comptabile" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewd by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels
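
A minimal sketch of mask-based extraction of the MPIDR_EL1 affinity fields. The EX_* macros are local stand-ins modelled on the idea behind __BITS/__SHIFTOUT (mask out the field, then divide by the lowest set bit of the mask); they are assumptions for this example and not the definitions in the tree.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the NetBSD <sys/cdefs.h> macros. */
#define EX_BITS(hi, lo) \
	((~UINT64_C(0) >> (63 - (hi))) & (~UINT64_C(0) << (lo)))
#define EX_LOWEST_SET_BIT(mask)	((((mask) - 1) & (mask)) ^ (mask))
#define EX_SHIFTOUT(x, mask)	(((x) & (mask)) / EX_LOWEST_SET_BIT(mask))

/* MPIDR_EL1 affinity fields (architectural bit positions). */
#define EX_MPIDR_AFF0	EX_BITS(7, 0)
#define EX_MPIDR_AFF1	EX_BITS(15, 8)
#define EX_MPIDR_AFF2	EX_BITS(23, 16)
#define EX_MPIDR_AFF3	EX_BITS(39, 32)

/* Return affinity level 0..3 of an MPIDR value, or 0 for a bad level. */
static unsigned
ex_mpidr_aff(uint64_t mpidr, int level)
{
	switch (level) {
	case 0:
		return (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF0);
	case 1:
		return (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF1);
	case 2:
		return (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF2);
	case 3:
		return (unsigned)EX_SHIFTOUT(mpidr, EX_MPIDR_AFF3);
	default:
		return 0;
	}
}
```

Keeping the field as a single mask means the shift count never has to be written out separately, which is the point of this style.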


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.64 17-Oct-2021 skrll

Remove some newlines


# 1.63 10-Oct-2021 skrll

Need to call pmap_tlb_info_attach for each CPU. Missed in previous
commit.


# 1.62 04-Oct-2021 skrll

Add a KASSERT


# 1.61 30-Aug-2021 jmcneill

Identify Apple M1 "Icestorm" and "Firestorm" CPU types.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.60 19-Jun-2021 jmcneill

Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.59 09-Mar-2021 ryo

branches: 1.59.4;
Add support for hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().


# 1.58 11-Jan-2021 skrll

Improve a comment


# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


# 1.56 10-Oct-2020 jmcneill

branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
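
As a rough sketch of the "minimum cache line size of the system" computation: CTR_EL0.IminLine (bits [3:0]) and CTR_EL0.DminLine (bits [19:16]) each hold log2 of the line size in 4-byte words. The function and struct names below are hypothetical, and the sketch only computes the minimum; how the trap handler presents the value to userland is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest I-cache line size in bytes described by a CTR_EL0 value. */
static unsigned
ex_iline_bytes(uint64_t ctr)
{
	return 4u << (ctr & 0xf);		/* IminLine, bits [3:0] */
}

/* Smallest D-cache line size in bytes described by a CTR_EL0 value. */
static unsigned
ex_dline_bytes(uint64_t ctr)
{
	return 4u << ((ctr >> 16) & 0xf);	/* DminLine, bits [19:16] */
}

struct ex_minline {
	unsigned iline;		/* system-wide minimum I line, bytes */
	unsigned dline;		/* system-wide minimum D line, bytes */
};

/*
 * Fold the per-CPU CTR_EL0 values into system-wide minimums.
 * Returns zeros when ncpu is 0.
 */
static struct ex_minline
ex_min_line(const uint64_t *ctr, size_t ncpu)
{
	struct ex_minline m = { 0, 0 };

	for (size_t i = 0; i < ncpu; i++) {
		unsigned il = ex_iline_bytes(ctr[i]);
		unsigned dl = ex_dline_bytes(ctr[i]);

		if (i == 0 || il < m.iline)
			m.iline = il;
		if (i == 0 || dl < m.dline)
			m.dline = dl;
	}
	return m;
}
```

Reporting the minimum is the conservative choice: a loop that steps by the smallest line size still touches every line on a CPU with larger lines.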


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are both 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
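
The three rules above can be sketched as a small decision function. The bit positions are the architectural ones (CTR_EL0.DIC bit 29, CTR_EL0.IDC bit 28, CLIDR_EL1.LoUIS bits [23:21], LoC bits [26:24], LoUU bits [29:27]); the names are hypothetical and the sketch only decides which operations are needed, it does not issue them.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_CTR_DIC	(UINT64_C(1) << 29)
#define EX_CTR_IDC	(UINT64_C(1) << 28)

struct ex_sync_ops {
	bool dcache_clean;	/* clean Dcache before touching Icache */
	bool icache_inval;	/* invalidate Icache */
};

/* Decide which cache maintenance steps an Icache sync needs. */
static struct ex_sync_ops
ex_icache_sync_ops(uint64_t ctr, uint64_t clidr)
{
	unsigned louis = (clidr >> 21) & 7;
	unsigned loc = (clidr >> 24) & 7;
	unsigned louu = (clidr >> 27) & 7;
	struct ex_sync_ops ops;

	/* DIC=1: Icache invalidation is not required. */
	ops.icache_inval = (ctr & EX_CTR_DIC) == 0;

	/*
	 * IDC=1, or LoC == 0, or LoUIS == 0 and LoUU == 0:
	 * Dcache clean is not required.
	 */
	ops.dcache_clean = !((ctr & EX_CTR_IDC) != 0 || loc == 0 ||
	    (louis == 0 && louu == 0));

	return ops;
}
```

Evaluating this once per CPU and picking a matching routine keeps the per-call path free of register reads.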


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
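
A simplified sketch of the "not the fastest means slow" rule. It assumes the per-CPU "capacity-dmips-mhz" values have already been read into an array, and it uses two passes rather than tracking the fastest set during attach as the commit describes; the function name and the bitmask result are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a bitmask with bit i set for every CPU whose capacity is
 * below the largest capacity seen.  Handles at most 64 CPUs; any
 * further entries are ignored.
 */
static uint64_t
ex_slow_mask(const uint32_t *capacity, size_t ncpu)
{
	uint32_t best = 0;
	uint64_t mask = 0;

	if (ncpu > 64)
		ncpu = 64;

	/* Pass 1: find the capacity of the fastest set. */
	for (size_t i = 0; i < ncpu; i++) {
		if (capacity[i] > best)
			best = capacity[i];
	}

	/* Pass 2: everything below that is treated as slow. */
	for (size_t i = 0; i < ncpu; i++) {
		if (capacity[i] < best)
			mask |= UINT64_C(1) << i;
	}
	return mask;
}
```

On a system where every CPU reports the same capacity the mask is empty, so nothing is flagged as slow.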


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary, running on either a
real arm32 or an aarch32 system, can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.60 19-Jun-2021 jmcneill

Do not try to initialize PMU if ID_AA64DFR0_EL1 reports a non-standard
PMU implementation.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.59 09-Mar-2021 ryo

Add support hardware breakpoint and watchpoint again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().


# 1.58 11-Jan-2021 skrll

Improve a comment


# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


# 1.56 10-Oct-2020 jmcneill

branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required as well.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
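
Illustration (not part of the commit): the "track the fastest set" idea can
be sketched as below. This is a two-pass simplification over an array; the
commit itself tracks the maximum while cpus attach. The struct, field and
function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; field names are illustrative only. */
struct cpu_cap {
	uint32_t capacity_dmips_mhz;	/* value read from the FDT cpu node */
	bool	 is_slow;		/* result: true unless in fastest set */
};

/* Mark every CPU whose capacity is below the maximum as slow. */
void
mark_slow_cpus(struct cpu_cap *cpus, size_t ncpu)
{
	uint32_t best = 0;
	size_t i;

	assert(cpus != NULL || ncpu == 0);

	/* Pass 1: find the highest advertised capacity. */
	for (i = 0; i < ncpu; i++)
		if (cpus[i].capacity_dmips_mhz > best)
			best = cpus[i].capacity_dmips_mhz;

	/* Pass 2: anything below the maximum counts as slow. */
	for (i = 0; i < ncpu; i++)
		cpus[i].is_slow = cpus[i].capacity_dmips_mhz < best;
}
```

If every cpu reports the same capacity, none is flagged slow, which matches
the behaviour one would want on a non-big.LITTLE system.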


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it has already
been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary, running on a real arm32
or an aarch32 system, can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels
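
Illustration (not part of the commit): extracting a field by mask rather
than by hand-written shift looks roughly like this. The macros below are
local stand-ins written for this sketch, not the NetBSD definitions; the
Aff0..Aff3 bit positions are those of MPIDR_EL1.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for mask helpers; names are hypothetical. */
#define BITS64(hi, lo) \
	((~UINT64_C(0) >> (63 - (hi))) & (~UINT64_C(0) << (lo)))
#define LOWEST_SET_BIT(mask)	((mask) & (~(mask) + 1))
#define SHIFTOUT(x, mask)	(((x) & (mask)) / LOWEST_SET_BIT(mask))

/* MPIDR_EL1 affinity fields. */
#define MPIDR_AFF0	BITS64(7, 0)
#define MPIDR_AFF1	BITS64(15, 8)
#define MPIDR_AFF2	BITS64(23, 16)
#define MPIDR_AFF3	BITS64(39, 32)

/* Return affinity level 0..3 of an MPIDR_EL1 value. */
unsigned
mpidr_aff(uint64_t mpidr, int level)
{
	static const uint64_t masks[4] = {
		MPIDR_AFF0, MPIDR_AFF1, MPIDR_AFF2, MPIDR_AFF3
	};

	assert(level >= 0 && level <= 3);
	/* Dividing by the lowest set bit of the mask is the right shift. */
	return (unsigned)SHIFTOUT(mpidr, masks[level]);
}
```

The point of the mask form is that the field position is stated once, in
the mask, so the extraction cannot drift out of sync with it.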


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.59 09-Mar-2021 ryo

Add support for hardware breakpoints and watchpoints again.

Limited support for hardware watchpoint has been available for some time, but it
has not been working properly. In addition, it stopped working at the time of
the PTRACE support commit on 2018-12-13. This has been fixed to work correctly,
and also fixed to be practical by sharing hardware watchpoints and breakpoints
between CPUs on MULTIPROCESSOR.

Also fixed a bug that causes a malfunction when switching CPUs with
"machine cpu N" when entering ddb mode from other than cpu_Debugger().

I have confirmed that the CPU can be switched by "machine cpu N" and return from
ddb properly in each case where ddb is called triggered by ddb break/watchpoint,
hardware break/watchpoint, and cpu_Debugger().


# 1.58 11-Jan-2021 skrll

Improve a comment


Revision tags: thorpej-futex-base
# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


# 1.56 10-Oct-2020 jmcneill

branches: 1.56.2;
Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
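
Illustration (not part of the commit): one way to compute a system-wide
"lowest common" CTR_EL0 value from two per-CPU values. Only IminLine,
DminLine, IDC and DIC are merged; every other bit is simply copied from the
first argument, which is an assumption of this sketch. Function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CTR_IMINLINE_SHIFT	0	/* CTR_EL0.IminLine, bits [3:0] */
#define CTR_DMINLINE_SHIFT	16	/* CTR_EL0.DminLine, bits [19:16] */
#define CTR_LINE_MASK		UINT64_C(0xf)
#define CTR_IDC			(UINT64_C(1) << 28)
#define CTR_DIC			(UINT64_C(1) << 29)

/* Smaller of the two 4-bit line-size fields found at 'shift'. */
static uint64_t
min_field(uint64_t a, uint64_t b, unsigned shift)
{
	uint64_t fa = (a >> shift) & CTR_LINE_MASK;
	uint64_t fb = (b >> shift) & CTR_LINE_MASK;

	return fa < fb ? fa : fb;
}

/* Merge two CTR_EL0 values into one that is safe on both CPUs. */
uint64_t
ctr_common(uint64_t a, uint64_t b)
{
	uint64_t r = a;

	r &= ~((CTR_LINE_MASK << CTR_IMINLINE_SHIFT) |
	    (CTR_LINE_MASK << CTR_DMINLINE_SHIFT) | CTR_IDC | CTR_DIC);
	r |= min_field(a, b, CTR_IMINLINE_SHIFT) << CTR_IMINLINE_SHIFT;
	r |= min_field(a, b, CTR_DMINLINE_SHIFT) << CTR_DMINLINE_SHIFT;
	/* IDC/DIC may only be advertised if every CPU has them. */
	r |= a & b & (CTR_IDC | CTR_DIC);
	return r;
}

/* Line size in bytes: the field is log2 of the number of 4-byte words. */
unsigned
ctr_line_bytes(uint64_t ctr, unsigned shift)
{
	assert(shift < 64);
	return 4u << ((ctr >> shift) & CTR_LINE_MASK);
}
```

Userland that steps through memory by the merged line size never skips a
line on the CPU with the smaller lines, at the cost of redundant operations
on the CPU with the larger ones.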


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are both 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
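
Illustration (not part of the commit): the three rules above reduce to a
small decision function. A sketch only, with hypothetical names; it decides
which steps are needed and performs no cache maintenance itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTR_IDC		(UINT64_C(1) << 28)	/* CTR_EL0.IDC */
#define CTR_DIC		(UINT64_C(1) << 29)	/* CTR_EL0.DIC */

/* CLIDR_EL1 3-bit level fields. */
#define CLIDR_LOUIS(x)	(((x) >> 21) & 7)
#define CLIDR_LOC(x)	(((x) >> 24) & 7)
#define CLIDR_LOUU(x)	(((x) >> 27) & 7)

/* Which steps an Icache sync needs on a given CPU. */
struct isync_plan {
	bool dcache_clean;	/* clean Dcache before invalidating */
	bool icache_inval;	/* invalidate Icache */
};

struct isync_plan
isync_plan_for(uint64_t ctr, uint64_t clidr)
{
	struct isync_plan p;
	bool no_levels;

	/* No cache level needs cleaning per the CLIDR_EL1 rule above. */
	no_levels = CLIDR_LOC(clidr) == 0 ||
	    (CLIDR_LOUIS(clidr) == 0 && CLIDR_LOUU(clidr) == 0);

	p.icache_inval = (ctr & CTR_DIC) == 0;
	p.dcache_clean = (ctr & CTR_IDC) == 0 && !no_levels;
	return p;
}
```

A CPU reporting both DIC and IDC ends up with neither step, which is the
"necessary and sufficient" minimum the message refers to.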


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.57 11-Dec-2020 skrll

s:aarch64/cpufunc.h:arm/cpufunc.h:

a baby step in the grand arm header unification challenge


Revision tags: thorpej-futex-base
# 1.56 10-Oct-2020 jmcneill

Fix detection of FP and SIMD features on Armv8.2+.


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required as well.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.55 07-Oct-2020 jmcneill

Only touch PMC registers if Performance Monitor Extensions are present.


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required as well.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "comptabile" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewd by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache confituration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add fdt support. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.54 25-Jul-2020 riastradh

Implement ChaCha with NEON on ARM.

XXX Needs performance measurement.
XXX Needs adaptation to arm32 neon which has half the registers.


# 1.53 25-Jul-2020 riastradh

Split aes_impl declarations out into aes_impl.h.

This will make it less painful to add more operations to struct
aes_impl without having to recompile everything that just uses the
block cipher directly or similar.


# 1.52 01-Jul-2020 ryo

- On some systems with a different cache line size (and DIC,IDC) per CPU, trap "mrs Xt,ctr_el0" instruction
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.

On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
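
One way to picture the value reported to userland is a merge of two CTR_EL0 readings that keeps the smaller line sizes. This is a sketch under assumptions: ctr_merge and ctr_dminline_bytes are hypothetical names, ANDing DIC/IDC and taking the remaining fields from the first argument are choices made for this example, and only the field positions (IminLine [3:0], DminLine [19:16], IDC bit 28, DIC bit 29) come from the architecture.

```c
#include <assert.h>
#include <stdint.h>

#define CTR_IMINLINE	UINT64_C(0x0000000f)	/* bits [3:0] */
#define CTR_DMINLINE	UINT64_C(0x000f0000)	/* bits [19:16] */
#define CTR_IDC		UINT64_C(0x10000000)	/* bit 28 */
#define CTR_DIC		UINT64_C(0x20000000)	/* bit 29 */

static_assert((CTR_IMINLINE & CTR_DMINLINE) == 0, "fields must not overlap");

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Merge two CTR_EL0 values: smallest line sizes, DIC/IDC only if both set. */
uint64_t
ctr_merge(uint64_t a, uint64_t b)
{
	uint64_t merged = a & ~(CTR_IMINLINE | CTR_DMINLINE | CTR_IDC | CTR_DIC);

	/* Masked fields can be compared in place; no shifting is needed. */
	merged |= min_u64(a & CTR_IMINLINE, b & CTR_IMINLINE);
	merged |= min_u64(a & CTR_DMINLINE, b & CTR_DMINLINE);
	merged |= a & b & (CTR_IDC | CTR_DIC);
	return merged;
}

/* DminLine holds log2 of the line size in 4-byte words. */
unsigned
ctr_dminline_bytes(uint64_t ctr)
{
	return 4u << ((ctr & CTR_DMINLINE) >> 16);
}
```

Folding ctr_merge over every CPU's reading gives a single value that is safe for cache maintenance loops on any core.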


# 1.51 01-Jul-2020 ryo

Switch the Icache sync operation to the necessary and sufficient one according to the CTR_EL0.DIC and CTR_EL0.IDC flags.

If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
If CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are both 0, Dcache clean is not required either.

SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
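
The three rules above can be written as a small decision function. This is a sketch, not the kernel's code: struct icache_sync_plan and plan_icache_sync are hypothetical names, and the function takes the register values as arguments instead of reading them. The bit positions (CTR_EL0.IDC bit 28, CTR_EL0.DIC bit 29, CLIDR_EL1.LoUIS [23:21], LoC [26:24], LoUU [29:27]) follow the register descriptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTR_EL0_IDC		(UINT64_C(1) << 28)
#define CTR_EL0_DIC		(UINT64_C(1) << 29)
#define CLIDR_EL1_LOUIS(x)	(((x) >> 21) & 7)
#define CLIDR_EL1_LOC(x)	(((x) >> 24) & 7)
#define CLIDR_EL1_LOUU(x)	(((x) >> 27) & 7)

static_assert(CTR_EL0_DIC == UINT64_C(0x20000000), "DIC is bit 29");

/* Hypothetical result type: which steps an Icache sync still has to do. */
struct icache_sync_plan {
	bool dcache_clean;	/* clean Dcache before touching the Icache */
	bool icache_inval;	/* invalidate the Icache */
};

struct icache_sync_plan
plan_icache_sync(uint64_t ctr, uint64_t clidr)
{
	struct icache_sync_plan p = { .dcache_clean = true, .icache_inval = true };

	if (ctr & CTR_EL0_DIC)
		p.icache_inval = false;
	if (ctr & CTR_EL0_IDC)
		p.dcache_clean = false;
	if (CLIDR_EL1_LOC(clidr) == 0 ||
	    (CLIDR_EL1_LOUIS(clidr) == 0 && CLIDR_EL1_LOUU(clidr) == 0))
		p.dcache_clean = false;
	return p;
}
```

Evaluating this once per CPU and caching the result keeps the hot path free of register reads.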


# 1.50 29-Jun-2020 riastradh

New permutation-based AES implementation using ARM NEON.

Also derived from Mike Hamburg's public-domain vpaes code.


# 1.49 29-Jun-2020 riastradh

Implement AES in kernel using ARMv8.0-AES on aarch64.


# 1.48 29-Jun-2020 riastradh

Draft fpu_kern_enter/leave on aarch64.


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
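
The "track the fastest set" step can be sketched as below. This is a simplification for illustration: the commit describes tracking during attach, while this example makes two passes over an array. mark_slow_cpus and its parameters are hypothetical names, and the capacity values are whatever the caller read from "capacity-dmips-mhz".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: flag every CPU whose capacity is below the
 * highest capacity seen. CPUs sharing the highest value are not slow.
 */
void
mark_slow_cpus(const uint32_t *capacity, bool *is_slow, size_t ncpu)
{
	uint32_t best = 0;

	assert(ncpu == 0 || (capacity != NULL && is_slow != NULL));

	/* Pass 1: find the fastest capacity value. */
	for (size_t i = 0; i < ncpu; i++) {
		if (capacity[i] > best)
			best = capacity[i];
	}
	/* Pass 2: anything below it counts as slow. */
	for (size_t i = 0; i < ncpu; i++)
		is_slow[i] = capacity[i] < best;
}
```

On a system where all CPUs report the same capacity, no CPU is flagged and the topology is unchanged.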


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary, whether running on a
real arm32 or an aarch32 system, can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support for fdt enable-method "spin-table"
* add support for fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before pmap_bootstrap is called

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add fdt support. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.47 14-Jun-2020 riastradh

Add some more id_aa64pfr0_el1 bits.


# 1.46 30-May-2020 jmcneill

sctlr_el1 and ctr_el0 are 64-bit registers


# 1.45 11-May-2020 riastradh

Add support for the ARMv8.5-RNG CPU random number generator.

We use the RNDRRS system register. I made the following two
wild-arse guesses about the architecture of real implementations,
which might not exist yet:

1. There's only one physical source per CPU package, so not worth
attaching one per core.

2. Like other CPU RNGs -- RDSEED, VIA C3 -- this probably gives about
half a bit of entropy per bit of data (although perhaps we should
say zero and revisit this once it arrives on real silicon).

Tested in qemu as well as I can, using `-cpu max' (which doesn't get
to userland for unrelated reasons).

This uses the numeric notation `mrs %0, s3_3_c2_c4_1' for the rndrrs
system register instead of the more legible `mrs %0, rndrrs' as
suggested in the ARMv8.5 ARM. Why?

- clang doesn't like `mrs %0, rndrrs' for reasons unclear to me.

- gas only likes it with `.arch armv8.5-a+rng', but there's no clear
way to keep that scoped; the `.set push/pop' stack that would be an
obvious choice for this works only on mips.

- gcc supports __attribute__((target("arch=..."))) on functions, but
the version we use doesn't yet know about armv8.5-a+rng.

Later on, we should replace this by a target attribute and the more
obvious `mrs %0, rndrrs' notation.

ok nick


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.
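
The slow-CPU marking described in 1.33 can be sketched roughly as below. This is only an illustration under stated assumptions: struct cpu_cap, mark_slow_cpus() and the capacity values are hypothetical, and the sketch makes two passes over a finished list, whereas the commit tracks the fastest set while CPUs are attaching.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; not the kernel's struct cpu_info. */
struct cpu_cap {
	uint32_t capacity_dmips_mhz;	/* "capacity-dmips-mhz" from the FDT, 0 if absent */
	bool is_slow;			/* true for CPUs below the best capacity */
};

/*
 * Find the highest capacity among the CPUs, then flag every CPU whose
 * capacity is lower than that as slow.  Returns the highest capacity.
 * If no CPU reports a capacity, nothing is flagged.
 */
static uint32_t
mark_slow_cpus(struct cpu_cap *cpus, size_t ncpu)
{
	uint32_t best = 0;
	size_t i;

	for (i = 0; i < ncpu; i++) {
		if (cpus[i].capacity_dmips_mhz > best)
			best = cpus[i].capacity_dmips_mhz;
	}
	for (i = 0; i < ncpu; i++) {
		cpus[i].is_slow =
		    (best != 0 && cpus[i].capacity_dmips_mhz < best);
	}
	return best;
}
```

With illustrative capacities of 485, 1024 and 485, the first and third CPUs would be flagged slow and the second left as fast, which is the information a scheduler needs to prefer the faster cores for long-running work.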


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels
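
As a rough illustration of the idea in 1.5, the sketch below pulls the affinity levels out of an MPIDR_EL1 value with a shift-out macro. SHIFTOUT and LOWEST_SET_BIT are local stand-ins written in the spirit of NetBSD's __SHIFTOUT, and the mask and function names are assumptions rather than the identifiers used in cpu.c. The field positions follow the Armv8-A layout (Aff0 in bits 7:0, Aff1 in 15:8, Aff2 in 23:16, Aff3 in 39:32).

```c
#include <stdint.h>

/* Local stand-ins for the kernel's shift-out helper (assumed names). */
#define LOWEST_SET_BIT(mask)	((((mask) - 1) & (mask)) ^ (mask))
#define SHIFTOUT(x, mask)	(((x) & (mask)) / LOWEST_SET_BIT(mask))

/* MPIDR_EL1 affinity fields (hypothetical macro names). */
#define MPIDR_AFF0_MASK	UINT64_C(0x00000000000000ff)
#define MPIDR_AFF1_MASK	UINT64_C(0x000000000000ff00)
#define MPIDR_AFF2_MASK	UINT64_C(0x0000000000ff0000)
#define MPIDR_AFF3_MASK	UINT64_C(0x000000ff00000000)

/* Each helper masks one field and shifts it down to bit 0. */
static inline unsigned
mpidr_aff0(uint64_t mpidr)
{
	return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF0_MASK);
}

static inline unsigned
mpidr_aff1(uint64_t mpidr)
{
	return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF1_MASK);
}

static inline unsigned
mpidr_aff2(uint64_t mpidr)
{
	return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF2_MASK);
}

static inline unsigned
mpidr_aff3(uint64_t mpidr)
{
	return (unsigned)SHIFTOUT(mpidr, MPIDR_AFF3_MASK);
}
```

Dividing by the lowest set bit of the mask lets the field position be stated once, in the mask, instead of as a separate mask and shift count that can drift apart.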


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.44 10-May-2020 riastradh

Print RNDR support in verbose CPU feature identification.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base phil-wifi-20200406
# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: is-mlppp-base ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "comptabile" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support for fdt enable-method "spin-table"
* add support for fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.43 05-Apr-2020 jmcneill

Cleanup CPU attach output:
- Always print the core's vendor and product name.
- Print the CPU ID on the same line as the name. Single line of dmesg
per core.
- Use aprint_verbose for reporting additional details.


# 1.42 30-Mar-2020 jmcneill

Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.


Revision tags: ad-namecache-base3
# 1.41 15-Feb-2020 skrll

Various updates and improvements to cpu start up on arm/aarch64

- start sharing more code around the AP startup messaging.
- call arm_cpu_topology_set early so that ci_core_id is available for
drivers, e.g. bcm2835_intr.c
- both arm and aarch64 now have
- a static cpu_info_store array
- the same arm_cpu_{hatched,mbox}


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.40 09-Feb-2020 skrll

#if 0 / #endif -> a comment


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "comptabile" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before pmap_bootstrap is called

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.39 28-Jan-2020 maxv

Fetch ID_AA64MMFR2_EL1. Okayed by Nick the other day.


# 1.38 27-Jan-2020 skrll

NVIDIA's breakaway marketing dept have been in touch.


# 1.37 27-Jan-2020 skrll

Identify the Denver2 CPU in the Nvidia TX2


Revision tags: ad-namecache-base2
# 1.36 25-Jan-2020 skrll

Trailing whitespace


# 1.35 20-Jan-2020 skrll

KNF


Revision tags: ad-namecache-base1
# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

branches: 1.31.2;
Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before pmap_bootstrap is called

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.34 15-Jan-2020 mrg

port the arm64 cpu topology setup for big.little to arm.

rename arm64 cpu_do_topology() to arm_cpu_do_topology() and
call it from both arm cpu_attach().

replace both aarch64_set_topology() inline code in arm
cpu_attach() with new arm_cpu_do_topology(), which is called
by the arm64 locore as well (possibly not needed, which would
allow it to become static.)

not yet tested on a real big.little armv7 system. tested
on rockpro64 and pinebook pro.


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.33 12-Jan-2020 mrg

provide some semblance of valid cpu topology for big.little systems.

while attaching cpus, if the FDT provides "capacity-dmips-mhz" track
the fastest set, and call cpu_topology_set() with slow=true for any
cpus that are not the fastest.

bug fix for cpu_topology_set(): actually set ci_is_slow for slow cpus.

with this change, and -current's recent scheduler changes, this means
that long running processes run on the faster cores. on RK3399 based
systems, i am seeing 20-50% speed ups for many tasks.


XXX: all this can be made common with armv7 big.little.


# 1.32 09-Jan-2020 martin

When attaching the first fdtbus, use the root "compatible" (or failing that:
"model") property to set the cpu model (in userland aka sysctl hw.model).
When attaching the first cpu, do not overwrite a cpu model if it already
had been set.


Revision tags: ad-namecache-base
# 1.31 28-Dec-2019 jmcneill

Identify Arm Neoverse E1 and N1 CPUs.


# 1.30 27-Dec-2019 mlelstv

Fix build.


# 1.29 27-Dec-2019 skrll

Add a missing newline


# 1.28 21-Dec-2019 ad

Fix build break (ci->ci_dev is not available on every port).


# 1.27 20-Dec-2019 ad

Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


Revision tags: phil-wifi-20191119
# 1.25 20-Oct-2019 jmcneill

Use separate cacheline aligned arrays for mbox and hatched as before.


# 1.24 20-Oct-2019 jmcneill

Invalidate dcache before polling AP hatched status


# 1.23 19-Oct-2019 jmcneill

Increase aarch64 MAXCPUS to 256.


# 1.22 14-Oct-2019 jmcneill

Remove the A72 errata #859971 detection, it causes an illegal instruction on AWS A1 (virtualized)


# 1.21 15-Sep-2019 tnn

report A72 errata #859971 workaround status during boot


Revision tags: netbsd-9-base
# 1.20 16-Jul-2019 jmcneill

branches: 1.20.2;
Need CPU_PARTMASK for eMAG CPU ID


# 1.19 16-Jul-2019 jmcneill

Add Ampere eMAG 8180 cpuid


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 versions of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how the same userland binary can work sanely here when
running on a real arm32 or an aarch32 system.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix pmap_extract(kerneltext/data/bss) to work even before pmap_bootstrap is called

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support for multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu cluster.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.26 22-Nov-2019 mlelstv

Make cache operations available early.


# 1.18 19-Jun-2019 mrg

add several cortex CPU implementations found in their TRMs:
- A32 R1 (aarch32 only, not supported)
- A35 R1
- A65 R0
- A76AE R1
- A77

add the aarch64 ones to cpu.c for identification.


Revision tags: phil-wifi-20190609
# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu cluster may have different errata. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2; 1.1.4;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


# 1.17 09-May-2019 mrg

add cortex A-76 detection.


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)


Revision tags: isaki-audio2-base pgoyette-compat-20190127
# 1.16 21-Jan-2019 skrll

Use ci_{package,core,smt}_id instead of ci_data.cpu_{package,core,smt}_id

NFC


Revision tags: pgoyette-compat-20190118 pgoyette-compat-1226
# 1.15 21-Dec-2018 ryo

- add workaround for Cavium ThunderX errata 27456.
- add cpufuncs table in cpu_info. each cpu clusters may have different erratum. (e.g. big.LITTLE)


# 1.14 28-Nov-2018 ryo

support boot option "-1" to disable multiprocessor boot, and "-z" to set AB_SILENT flag.


Revision tags: pgoyette-compat-1126
# 1.13 20-Nov-2018 mrg

rewrite the CPU identification on arm64:

- publish per-cpu data
- publish a whole bunch of info in struct aarch64_sysctl_cpu_id
instead of various individual nodes (there are 16 total.)
- add MIDR extractor bits
- define ARMv8.2-A id_aa64mmfr2_el1 and id_aa64zfr0_el1 regs,
but avoid using them until we make sure they exist. (these
members are added to aarch64_sysctl_cpu_id to avoid future
compat issues.)

the arm32 and aarch32 version of these need to be adjusted as
well (and aarch32 data published at all.) still trying to
work out how to make the same userland binary running on a
real arm32 or an aarch32 system can work sanely here.

ok ryo@.


Revision tags: pgoyette-compat-1020
# 1.12 14-Oct-2018 skrll

Use __nothing


# 1.11 04-Oct-2018 ryo

remove XXX delay to attach cpus in order


# 1.10 03-Oct-2018 skrll

Another space that hurts Jared's eyes.


# 1.9 03-Oct-2018 skrll

Fix some product names and details as suggested by jmcneill


# 1.8 03-Oct-2018 skrll

Identify some Cavium ThunderX CPUs


Revision tags: pgoyette-compat-0930
# 1.7 10-Sep-2018 ryo

cleanup aarch64 mpstart and fdt bootstrap
* arm_cpu_hatch_arg is a bad idea. avoid serializing CPU startup, and eliminate arm_cpu_hatch_arg.
in mpstart, resolve own cpu index using array of cpu_mpidr[] (aarch64)
* add support fdt enable-method "spin-table"
* add support fdt enable-method "brcm,bcm2836-smp" (for 32bit RaspberryPi)
* use arm_fdt_cpu_bootstrap() instead of psci_fdt_bootstrap()
* rename "arm/fdt/psci_fdt.h" to "arm/fdt/psci_fdtvar.h" because of conflict of include file for needs-flag
* add devmap for cpu spin-table of raspberrypi3/aarch64
* no need to force hatch APs for raspberrypi3/arm32 ifndef MULTIPROCESSOR.
* fix to work pmap_extract(kerneltext/data/bss) even if before calling pmap_bootstrap

idea to use cpu_mpidr[] by jmcneill@. reviewed by skrll@. thanks.


Revision tags: pgoyette-compat-0906
# 1.6 26-Aug-2018 ryo

add support multiple cpu clusters.
* pass cpu index as an argument to secondary processors when hatching.
* keep cpu cache configuration per cpu clusters.

Hello big.LITTLE!


# 1.5 20-Aug-2018 jmcneill

Use __SHIFTOUT to extract MPIDR affinity levels


# 1.4 31-Jul-2018 skrll

Define and use VPRINTF


Revision tags: pgoyette-compat-0728
# 1.3 17-Jul-2018 christos

add default statements, use PRI?64 instead of ll?


# 1.2 09-Jul-2018 ryo

add MULTIPROCESSOR support


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407
# 1.1 01-Apr-2018 ryo

branches: 1.1.2;
Add initial support for ARMv8 (AARCH64) (by nisimura@ and ryo@)

- sys/arch/evbarm64 is gone and integrated into sys/arch/evbarm. (by skrll@)
- add support fdt. evbarm/conf/GENERIC64 fdt (bcm2837,sunxi,tegra) based generic 64bit kernel config. (by skrll@, jmcneill@)