History log of /freebsd-current/sys/x86/include/x86_smp.h
Revision Date Author Comments
# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 9fb6718d 24-Apr-2023 Mark Johnston <markj@FreeBSD.org>

smp: Dynamically allocate the stoppcbs array

This avoids bloating the kernel image when MAXCPU is large.

A follow-up patch for kgdb and other kernel debuggers is needed since
the stoppcbs symbol is now a pointer. Bump __FreeBSD_version so that
debuggers can use osreldate to figure out how to handle stoppcbs.
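
A minimal sketch of the pattern, assuming mallocarray(9) and a SYSINIT
hook; the init point and names are illustrative, not the committed code:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/kernel.h>
    #include <sys/malloc.h>
    #include <sys/smp.h>
    #include <machine/pcb.h>

    /* Before: struct pcb stoppcbs[MAXCPU]; sized for the worst case. */
    struct pcb *stoppcbs;

    static void
    stoppcbs_alloc(void *arg __unused)
    {
        /* Size by the CPU count known at boot, not by MAXCPU. */
        stoppcbs = mallocarray(mp_maxid + 1, sizeof(*stoppcbs), M_DEVBUF,
            M_WAITOK | M_ZERO);
    }
    SYSINIT(stoppcbs_alloc, SI_SUB_CPU, SI_ORDER_ANY, stoppcbs_alloc, NULL);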

PR: 269572
MFC after: never
Reviewed by: mjg, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39806


# bab8274c 25-Apr-2023 Dimitry Andric <dim@FreeBSD.org>

Use bool for one-bit wide bit-fields

A signed one-bit wide bit-field can take only the values 0 and -1. Clang
16 introduced a warning that "implicit truncation from 'int' to a
one-bit wide bit-field changes value from 1 to -1". Fix the warnings by
using C99 bool.
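
A minimal illustration of the warning and the fix (field names are
hypothetical):

    #include <stdbool.h>

    struct flags {
        int  busy : 1;   /* signed: holds only 0 and -1, so "busy = 1"
                            truncates the value from 1 to -1 */
        bool done : 1;   /* C99 bool: holds 0 and 1 as intended */
    };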

Reported by: Clang 16
Reviewed by: emaste, jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39705


# ab12e8db 14-Nov-2021 Mark Johnston <markj@FreeBSD.org>

amd64: Reduce the amount of cpuset copying done for TLB shootdowns

We use pmap_invalidate_cpu_mask() to get the set of active CPUs. This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap and drop the cpuset parameter,
simplifying callers. Also leverage the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().
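
For illustration, the two iteration idioms (loop bodies elided):

    /* Destructive idiom: peel bits off a local copy of the set. */
    cpuset_t scratch = mask;    /* yet another 32-byte copy */
    int cpu;
    while ((cpu = CPU_FFS(&scratch)) != 0) {
        cpu--;
        CPU_CLR(cpu, &scratch);
        /* ... act on cpu ... */
    }

    /* Non-destructive iteration: reads the set in place, no copy. */
    CPU_FOREACH_ISSET(cpu, &mask) {
        /* ... act on cpu ... */
    }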

Reviewed by: alc, kib
Tested by: pho
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32792


# b27fe1c3 28-Jul-2021 Konstantin Belousov <kib@FreeBSD.org>

amd64: stop doing special allocation for the AP startup trampoline

There is no longer any reason to allocate the trampoline page very
early in the boot process. The only requirement for the page is that it
be below 1M so that it is usable by real mode during AP init; this can
be handled by vm_alloc_contig() at startup time.

Also assert that the startup trampoline fits into a single page. In
principle we could do a multi-page allocation if needed, but it is not.

Move the alloc_ap_trampoline() function and the boot_address variable
to i386/mp_machdep.c, and keep the existing early-allocation mechanism
on i386.
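
Assuming the commit's vm_alloc_contig() refers to the
vm_page_alloc_contig(9) interface, a hedged sketch of the allocation
(flags and bounds illustrative):

    vm_page_t m;
    vm_paddr_t boot_address;

    /* One wired page with a physical address below 1M, allocated once
     * the VM system is up rather than in early boot. */
    m = vm_page_alloc_contig(NULL, 0, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED,
        1, 0, (1 << 20), PAGE_SIZE, 0, VM_MEMATTR_DEFAULT);
    if (m == NULL)
        panic("cannot allocate AP startup trampoline");
    boot_address = VM_PAGE_TO_PHYS(m);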

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D31343


# aba10e13 25-Jul-2020 Alexander Motin <mav@FreeBSD.org>

Allow swi_sched() to be called from NMI context.

To handle hardware errors reported via NMIs, I need a way to escape NMI
context, which is too restrictive to do anything significant in.

To do that, this change introduces a new swi_sched() flag, SWI_FROMNMI,
which makes swi_sched() careful about the KPIs it uses. On platforms
that allow sending IPIs from NMI context (x86 for now) it immediately
wakes clk_intr_event via the new IPI_SWI; otherwise it behaves just
like SWI_DELAY. To handle the delayed SWIs, this patch calls
clk_intr_event on every hardclock() tick.
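
A hedged usage sketch; the cookie and handler are hypothetical, set up
earlier via swi_add():

    #include <sys/param.h>
    #include <sys/interrupt.h>

    static void *hwerr_swi_cookie;    /* from a prior swi_add() */

    static void
    nmi_hwerr(void)
    {
        /* Escape NMI context: on x86 this raises IPI_SWI at once; on
         * other platforms it degrades to SWI_DELAY and the handler
         * runs on the next hardclock() tick. */
        swi_sched(hwerr_swi_cookie, SWI_FROMNMI);
    }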

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25754


# 9977c593 24-Jul-2020 Alexander Motin <mav@FreeBSD.org>

Introduce ipi_self_from_nmi().

It allows an IPI to be sent safely to the current CPU from NMI context.

Unlike the other ipi_*() functions, this one waits for the delivery in
order to leave the LAPIC in a state safe for the interrupted code.
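
A hedged usage sketch, pairing it with the SWI_FROMNMI change above:

    /* From an NMI handler: post a software-interrupt IPI to ourselves;
     * returns only once the LAPIC is back in a safe state. */
    ipi_self_from_nmi(IPI_SWI);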

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 279cd05b 24-Jul-2020 Alexander Motin <mav@FreeBSD.org>

Use APIC_IPI_DEST_OTHERS for bitmapped IPIs too.

It should save a bunch of LAPIC register accesses.

MFC after: 2 weeks


# dc43978a 14-Jul-2020 Konstantin Belousov <kib@FreeBSD.org>

amd64: allow parallel shootdown IPIs

Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu. Now each CPU can initiate a
shootdown IPI independently of other CPUs. The initiator enters a
critical section, then fills in its local PCPU shootdown info
(pc_smp_tlb_XXX), then clears the scoreboard generation at location
(cpu, my_cpuid) for each target CPU. After that an IPI is sent to all
targets, which scan for zeroed scoreboard generation words. Upon
finding such a word, the shootdown data is read from the corresponding
CPU's pcpu, and the generation is set. Meanwhile the initiator loops,
waiting for all the zeroed generations in the scoreboard to be updated.

The initiator does not disable interrupts, which should keep
non-invalidation IPIs from deadlocking; it only needs to disable
preemption to pin itself to the instance of the pcpu smp_tlb data.

The generation is set before the actual invalidation is performed in
the handler. This is safe because the target CPU cannot return to
userspace before the handler finishes. In principle only an NMI can
preempt the handler, but an NMI would see the kernel handler frame and
not touch the not-yet-invalidated user page tables.

Handlers loop until they no longer see zeroed scoreboard generations.
This, together with the hardware keeping one pending IPI in the LAPIC
IRR, should prevent lost shootdowns.
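
A hedged sketch of the handshake, with an illustrative scoreboard
layout rather than the committed one:

    /* One generation word per (target, initiator) pair; 0 = pending. */
    static u_long scoreboard[MAXCPU][MAXCPU];

    /* Initiator, preemption disabled, publishing generation 'gen': */
    u_int cpu, my_cpuid = PCPU_GET(cpuid);
    CPU_FOREACH_ISSET(cpu, &targets)
        atomic_store_rel_long(&scoreboard[cpu][my_cpuid], 0);
    ipi_selected(targets, vector);
    CPU_FOREACH_ISSET(cpu, &targets)
        while (atomic_load_acq_long(&scoreboard[cpu][my_cpuid]) == 0)
            cpu_spinwait();    /* targets pick the request up and ack */

    /* Target IPI handler: scan its scoreboard line for zeroed slots. */
    for (u_int i = 0; i <= mp_maxid; i++) {
        if (atomic_load_acq_long(&scoreboard[my_cpuid][i]) != 0)
            continue;
        /* read the request and 'gen' from CPU i's pcpu, then: */
        atomic_store_rel_long(&scoreboard[my_cpuid][i], gen);
        /* ack first, invalidate after; safe, as explained above */
    }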

Notes.
1. The code does not protect writes to the LAPIC ICR with exclusion. I
believe this is fine because we in fact do not send IPIs from interrupt
handlers. Moreover, for !x2APIC mode, where an ICR write requires
writing two registers, we disable interrupts around it. If this is
considered incorrect, I can add a per-CPU spinlock around ipi_send().
2. Scoreboard lines owned by a given target CPU can be padded to the
cache line size, to reduce ping-pong.

Reviewed by: markj (previous version)
Discussed with: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D25510


# 3b23ffe2 10-Jun-2020 Konstantin Belousov <kib@FreeBSD.org>

amd64 pmap: reorder IPI send and local TLB flush in TLB invalidations.

Right now the code first flushes all local TLB entries that need to be
flushed, then signals the IPI to the remote cores, and then spins idle
waiting for acknowledgements. The VMware article 'Don’t shoot down TLB
shootdowns!' notes that the time spent spinning is lost and could be
put to better use doing the local TLB invalidation.

We could use the same invalidation handler for the local TLB as for the
remote ones, but typically for pmap == curpmap we can use INVLPG
locally instead of the INVPCID required on remote cores, where we
cannot control context switches. For that reason, keep the local code
and provide callbacks to be called from smp_targeted_tlb_shootdown()
after the IPIs are fired but before the spin wait starts.
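
The resulting shape of the shootdown path, sketched with an
illustrative signature:

    static void
    smp_targeted_tlb_shootdown(cpuset_t mask, u_int vector, pmap_t pmap,
        vm_offset_t addr1, vm_offset_t addr2,
        void (*local_cb)(pmap_t, vm_offset_t, vm_offset_t))
    {
        ipi_selected(mask, vector);       /* 1. fire IPIs to remotes */
        if (local_cb != NULL)
            local_cb(pmap, addr1, addr2); /* 2. local flush overlaps
                                             with the remote handlers */
        /* 3. only now spin waiting for acknowledgements */
    }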

Reviewed by: alc, cem, markj, Anton Rang <rang at acm.org>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D25188


# a8c2fcb2 12-May-2019 Mateusz Guzik <mjg@FreeBSD.org>

x86: store pending bitmapped IPIs in per-cpu areas

This gets rid of the global cpu_ipi_pending array.

While here, replace cmpset with fcmpset in the delivery code and
opportunistically check whether the given IPI is already pending.
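
The delivery-path change, sketched with an illustrative pcpu field
name:

    u_int old, ipi;
    struct pcpu *pc;

    /* Before: a global array, and cmpset re-reads the word each retry. */
    do {
        old = cpu_ipi_pending[cpu];
    } while (!atomic_cmpset_int(&cpu_ipi_pending[cpu], old, old | ipi));

    /* After: the word lives in the target CPU's pcpu area; fcmpset
     * reloads 'old' on failure, and we bail out early if set. */
    old = pc->pc_ipi_bitmap;
    do {
        if ((old & ipi) != 0)
            return;    /* opportunistic: IPI already pending */
    } while (!atomic_fcmpset_int(&pc->pc_ipi_bitmap, &old, old | ipi));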

Sponsored by: The FreeBSD Foundation


# 665919aa 04-May-2019 Conrad Meyer <cem@FreeBSD.org>

x86: Implement MWAIT support for stopping a CPU

IPI_STOP is used after panic or when ddb is entered manually. MONITOR/
MWAIT allows CPUs that support the feature to sleep in a low power way
instead of spinning. Something similar is already used at idle.

It is perhaps especially useful in oversubscribed VM environments, and is
safe to use even if the panic/ddb thread is not the BSP. (Except in the
presence of MWAIT errata, which are detected automatically on platforms with
known wakeup problems.)

It can be tuned/sysctled with "machdep.stop_mwait," which defaults to 0
(off). This commit also introduces the tunable
"machdep.mwait_cpustop_broken," which defaults to 0, unless the CPU has
known errata, but may be set to "1" in loader.conf to signal that mwait
wakeup is broken on CPUs FreeBSD does not yet know about.

Unfortunately, Bhyve doesn't yet support MONITOR extensions, so this doesn't
help bhyve hypervisors running FreeBSD guests.
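
For example, in loader.conf:

    # Opt in to MWAIT-based stop:
    machdep.stop_mwait="1"
    # Or declare MWAIT wakeup broken on a CPU FreeBSD does not yet know:
    machdep.mwait_cpustop_broken="1"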

Submitted by: Anton Rang <rang AT acm.org> (earlier version)
Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20135


# 9dba82a4 05-Apr-2018 Roger Pau Monné <royger@FreeBSD.org>

x86: improve reservation of AP trampoline memory

So that it doesn't rely on physmap[1] containing an address below
1MiB. Instead, scan the full physmap and search for a suitable address
at which to place the trampoline code (below 1MiB) and the initial
memory pages (below 4GiB).
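
A hedged sketch of the scan; physmap[] holds [start, end) pairs, and
the bookkeeping is illustrative:

    int i;

    /* Find a segment that can hold the trampoline page below 1MiB. */
    for (i = 0; i <= physmap_idx; i += 2) {
        if (physmap[i] + PAGE_SIZE > (1 << 20))
            continue;    /* page would not fit below 1MiB */
        if (physmap[i + 1] - physmap[i] < PAGE_SIZE)
            continue;    /* segment too small */
        /* carve the trampoline page out of this segment ... */
        break;
    }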

Sponsored by: Citrix Systems R&D
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D14878


# c8f9c1f3 27-Jan-2018 Konstantin Belousov <kib@FreeBSD.org>

Use PCID to optimize PTI.

Use PCID to avoid complete TLB shootdown when switching between user
and kernel mode with PTI enabled.

I use a model close to what I read about KAISER: the user-mode PCID has
a 1:1 correspondence to the kernel-mode PCID, obtained by setting bit
11 of the PCID. A full kernel-mode TLB shootdown is performed on
context switches, since KVA TLB invalidation only works in the current
pmap. The user-mode part of the TLB is flushed on pmap activations as
well.

Similarly, IPI TLB shootdowns must handle both kernel and user address
spaces for each address. Note that machines which implement PCID but
lack the INVPCID instruction cause the usual complications in the IPI
handlers, due to the need to switch to the target PCID temporarily.
This is racy, but because we disable interrupts in pmap_activate_sw()
for the PCID/no-INVPCID case, the IPI handler cannot see an
inconsistent state of the CPU PCID vs. the PCPU pmap/kcr3/ucr3
pointers.

On the other hand, on kernel/user switches the CR3_PCID_SAVE bit is set
and we do not flush the TLB.

I can imagine an alternative use of PCID, in which only one PCID is
allocated for the kernel pmap. Then there would be no need to shoot
down kernel TLB entries on context switch. But copyout(9) would need
either to use a method similar to proc_rwmem() to access userspace
data, or (in reverse) to provide a temporary mapping of the kernel
buffer into the user-mode PCID and use a trampoline for the copy.
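
The pairing and the no-flush switch, restated as a sketch (constant and
variable names illustrative):

    /* User-mode PCID = kernel-mode PCID with bit 11 set. */
    user_pcid = kern_pcid | (1u << 11);

    /* On kernel<->user switches, keep CR3_PCID_SAVE (bit 63) set in
     * the loaded CR3 value so the switch does not flush the TLB. */
    load_cr3(pti_cr3_base | user_pcid | CR3_PCID_SAVE);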

Reviewed by: markj (previous version)
Tested by: pho
Discussed with: alc (some aspects)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D13985


# 64de3fdd 30-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

SPDX: use the Beerware identifier.


# 84525e55 10-Aug-2017 Roger Pau Monné <royger@FreeBSD.org>

x86: make the arrays that depend on MAX_APIC_ID dynamic

So that MAX_APIC_ID can be bumped without wasting memory.

Note that the usage of MAX_APIC_ID in the SRAT parsing forces the
parser to allocate memory directly from the phys_avail physical memory
array, which is probably not the best approach, but I haven't found any
other way to allocate memory so early in boot. This memory is not
returned to the system afterwards, but at least it is sized according
to the maximum APIC ID found in the MADT table.

Sponsored by: Citrix Systems R&D
MFC after: 1 month
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D11912


# 835c2787 24-Oct-2016 Konstantin Belousov <kib@FreeBSD.org>

Handle broadcast NMIs.

On several Intel chipsets, diagnostic NMIs sent from the BMC and NMIs
reporting hardware errors are broadcast to all CPUs.

When the kernel is configured to enter kdb on NMI, the outcome is
problematic, because each CPU tries to enter kdb. All CPUs are
executing NMI handlers, which set the latches disabling nested NMI
delivery; this means that stop_cpus_hard(), used by kdb_enter() to
stop the other CPUs by broadcasting an IPI_STOP_HARD NMI, cannot work.
One indication of this is the harmless but annoying diagnostic "timeout
stopping cpus".

Much more harmful is that, because all CPUs try to enter kdb, if ddb is
used as the debugger, all CPUs issue a prompt on the console and race
for the input, not to mention the simultaneous use of the shared ddb
state.

Try to fix this by introducing a pseudo-lock for simultaneous attempts
to handle NMIs. If one core happens to enter the NMI trap handler, the
other cores see it and simulate reception of IPI_STOP_HARD. Moreover,
generic_stop_cpus() avoids sending IPI_STOP_HARD and avoids waiting for
the acknowledgement, relying on the NMI handler on the other cores
suspending and then restarting the CPU.
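
A hedged sketch of the pseudo-lock; handle_nmi() and the variable are
hypothetical:

    static volatile u_int nmi_owner_cpu = NOCPU;

    /* In the NMI trap handler: */
    if (atomic_cmpset_acq_int(&nmi_owner_cpu, NOCPU, PCPU_GET(cpuid))) {
        /* This core owns the NMI: enter kdb / report the error. */
        handle_nmi();
        atomic_store_rel_int(&nmi_owner_cpu, NOCPU);
    } else {
        /* Another core owns it: simulate IPI_STOP_HARD reception. */
        cpustop_handler();
    }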

Since it is impossible to detect at runtime whether a stray NMI is
broadcast or unicast, add a knob for the administrator (really, the
developer) to configure the debugging NMI handling mode.

The updated patch was debugged with the help of Andrey Gapon (avg) and
discussed with him.

Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D8249


# 83c001d3 04-Oct-2016 Konstantin Belousov <kib@FreeBSD.org>

Re-apply r306516 (by cem):

Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags

Reduce contention during TLB invalidation operations by using a per-CPU
completion flag, rather than a single atomically-updated variable.

On a Westmere system (2 sockets x 4 cores x 1 thread), dtrace measurements
show that smp_tlb_shootdown is about 50% faster with this patch; observations
with VTune show that the percentage of time spent in invlrng_single_page on an
interrupt (actually doing invalidation, rather than synchronization) increases
from 31% with the old mechanism to 71% with the new one. (Running a basic file
server workload.)
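
A hedged sketch of the change, with illustrative field names:

    /* Before: every responder hits one shared word, so the cache line
     * ping-pongs across all CPUs: */
    atomic_add_int(&smp_tlb_wait, 1);

    /* After: each responder writes a completion generation into its
     * own pcpu area ... */
    PCPU_SET(smp_tlb_done, generation);

    /* ... and the initiator polls each responder's flag separately. */
    CPU_FOREACH(cpu) {
        if (!CPU_ISSET(cpu, &targets))
            continue;
        while (pcpu_find(cpu)->pc_smp_tlb_done != generation)
            cpu_spinwait();
    }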

Submitted by: Anton Rang <rang at acm.org>
Reviewed by: cem (earlier version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8041


# 31f57577 30-Sep-2016 Conrad Meyer <cem@FreeBSD.org>

Revert r306516 for now, it is incomplete on i386

Noted by: kib


# 2965d505 30-Sep-2016 Conrad Meyer <cem@FreeBSD.org>

Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags

Reduce contention during TLB invalidation operations by using a per-CPU
completion flag, rather than a single atomically-updated variable.

On a Westmere system (2 sockets x 4 cores x 1 thread), dtrace measurements
show that smp_tlb_shootdown is about 50% faster with this patch; observations
with VTune show that the percentage of time spent in invlrng_single_page on an
interrupt (actually doing invalidation, rather than synchronization) increases
from 31% with the old mechanism to 71% with the new one. (Running a basic file
server workload.)

Submitted by: Anton Rang <rang at acm.org>
Reviewed by: cem (earlier version), kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8041


# 7c958a41 07-Dec-2015 Konstantin Belousov <kib@FreeBSD.org>

Merge common parts of the i386 and amd64 md_var.h and smp.h into the
new headers x86/include/x86_var.h and x86/include/x86_smp.h.

Reviewed by: emaste, jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D4358