History log of /freebsd-10.3-release/sys/ia64/include/pcpu.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 296373 04-Mar-2016 marius

- Copy stable/10@296371 to releng/10.3 in preparation for 10.3-RC1
builds.
- Update newvers.sh to reflect RC1.
- Update __FreeBSD_version to reflect 10.3.
- Update default pkg(8) configuration to use the quarterly branch.

Approved by: re (implicit)

# 271239 07-Sep-2014 marcel

Fix previous commit: unbreak build of libkvm by including sys/systm.h
only when _KERNEL is defined.

Approved by: re@ (implicit)


# 271211 06-Sep-2014 marcel

Fix the PCPU access macros. It was found that the PCPU pointer, when
held in register r13, is used outside the bounds of critical_enter()
and critical_exit() by virtue of optimizations performed by the
compiler. The net effect being that address computations of fields
in the PCPU structure could be relative to the PCPU structure of the
CPU on which the address computation was performed and not related
to the CPU that executes the actual load or store operation.
The typical failure mode being that the per-CPU cache of UMA got
corrupted due to accesses from other CPUs.

Adding more volatile decorating to the register expression does not
help. The thinking being that volatile is assumed to work on memory
references and not register references. Thus, the fix is to perform
the address computation using a volatile inline assembly statement.

Additionally, since the reference is fundamentally non-atomic on ia64
by virtue of have a distinct address computation followed by the
actual load or store operation, it is required to wrap the entire
PCPU access in a critical section.

With PCPU_GET and friends requiring curthread now that they're in a
critical section, low-level use of these macros in functions like
cpu_switch() is not possible anymore. Consequently, a second order
set of changes is needed to avoid using PCPU_GET and friends where
curthread is either not set yet, or in the process of being changed.
In those cases, explicit dereferencing of pcpup is needed. In those
cases it is also possible to do that.

This is a direct commit to stable/10.

Approved by: re@ (marius)


# 268200 02-Jul-2014 marcel

MFC r263323: Fix and improve exception tracing.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 249265 08-Apr-2013 glebius

Merge from projects/counters:

Pad struct pcpu so that its size is denominator of PAGE_SIZE. This
is done to reduce memory waste in UMA_PCPU_ZONE zones.

Sponsored by: Nginx, Inc.


# 223526 25-Jun-2011 marcel

Switch to the event timers infrastructure. This includes:
o Setting td_intr_frame to the XIVs trap frame because it's referenced
by the ET event handler.
o Signal EOI to the CPU before calling the registered XIV handlers.
This prevents lost ITC interrupts, which cause starvation in one-shot
mode.
o Adding support for IPI_HARDCLOCK with corresponding per-CPU counters.
o Have the APs call cpu_initclocks() so as to limited the scattering of
clock related initialization. cpu_initclocks() calls the <self>_bsp()
or <self>_ap() version accordingly.
o Uncomment the ET clock handling in cpu_idle().
o Update the DDB 'show pcpu' output for the new MD fields.
o Entirely rewritten ia64_ih_clock(). Note that we don't create as many
clock XIVs as we have CPUs, as is done on PowerPC. It doesn't scale.
We can only have 240 XIVs and we can have more CPUs than that. There's
a single intrcnt index for the cumulative clock ticks and we keep per
CPU counts in the PCPU stats structure.
o Register the ITC by hooking SI_SUB_CONFIGURE (2nd order).

Open issues:
o Clock interrupts can still be lost. Some tweaking is still necessary.

Thanks to: mav@ for his support, feedback and explanations.

ET stats while committing:
eris% sysctl machdep.cpu | grep nclks

machdep.cpu.0.nclks: 24007
machdep.cpu.1.nclks: 22895
machdep.cpu.2.nclks: 13523
machdep.cpu.3.nclks: 9342
machdep.cpu.4.nclks: 9103
machdep.cpu.5.nclks: 9298
machdep.cpu.6.nclks: 10039
machdep.cpu.7.nclks: 9479
eris% vmstat -i | grep clock
clock 108599 50


# 221890 14-May-2011 marcel

Be pedantic: mark the pcpu pointer (= register r13) itself as volatile.


# 210623 29-Jul-2010 jhb

Mark the __curthread() functions as __pure2 and remove the volatile keyword
from the inline assembly. This allows the compiler to cache invocations of
curthread since it's value does not change within a thread context.

Submitted by: zec (i386)
MFC after: 1 week


# 205665 26-Mar-2010 marcel

Only use the interval timer for clock interrupts on the BSP and
have the BSP use IPIs to trigger clock interrupts on the APs.
This allows us to run on hardware configurations for which the
ITC has non-uniform frequencies across CPUs.

While here, change the clock XIV to type IPI so as to protect
the interrupt delivery against CPU re-balancing once that's
implemented.


# 205431 22-Mar-2010 marcel

Define curthread as an inline function that loads the thread pointer
directly from r13, the pcpu pointer. This guarantees correct behaviour
when the thread migrates to a different CPU.


# 200207 07-Dec-2009 marcel

Define struct pcpu_md as the only MD field of struct pcpu (pc_acpi_id
excluded, as it's used by MI code) and mode the sysctl variables from
pcpu_stats to pcpu_md.
Adjust all references accordingly.

While nearby, change the PCPU sysctl tree so that they match the CPU
device sysctl tree -- they are now children of a static node called
"machdep.cpu" and are named only with their cpu ID.


# 200200 07-Dec-2009 marcel

Allocate the VHPT for each CPU in cpu_mp_start(), rather than
allocating MAXCPU VHPTs up-front. This allows us to max-out MAXCPU
without memory waste -- MAXCPU is now 32 for SMP kernels.

This change also eliminates the VHPT scaling based in the total
memory in the system. It's the workload that determines the best size
of the VHPT. The workload can be affected by the amount of memory,
but not necessarily. For example, there's no performance difference
between VHPT sizes of 256KB, 512KB and 1MB when building the LINT
kernel. This was observed with a system that has 8GB of memory.
By default the kernel will allocate a 1MB VHPT. The user can tune the
system with the "machdep.vhpt.log2size" tunable.


# 199941 29-Nov-2009 marcel

Move the sysctl related fields to the end of the structure and
make them conditional upon _KERNEL. libkvm includes <sys/pcpu.h>
and <sys/sysctl.h> does not expose the structure definitions to
userland.


# 199893 28-Nov-2009 marcel

Eliminate teh use of MAXCPU in static arrays of interrupt counters by
adding statistics counters to the PCPU structure. Export the counters
through sysctl by giving each PCPU structure its own sysctl context.

While here, fix cnt.v_intr by not just having it count clock interrupts,
but every interrupt and add more counters for each interrupt source.


# 181875 19-Aug-2008 jhb

Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports
already define _KERNEL to get to this and I'm about to add hooks to
libkvm to access per-CPU data.

MFC after: 1 week


# 170291 04-Jun-2007 attilio

Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)


# 167429 11-Mar-2007 alc

Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb


# 164392 18-Nov-2006 marcel

Now that printf() needs the PCPU, set it up before we call printf().
Change the pc_pcb field from a pointer to struct pcb to struct pcb
so that sizeof(struct pcb) includes the PCB we use for IPI_STOP.
Statically declare early_pcb so that we don't have to allocate the
PCB for thread0. This way we can setup the PCPU before cninit()
and thus before we use printf().


# 122829 17-Nov-2003 bde

Fixed a pedantic syntax error (a stray semicolon at the end of
PCPU_MD_FIELDS).


# 122763 15-Nov-2003 njl

Add the pc_acpi_id PCPU member. The new acpi_cpu driver uses this to
dereference the softc.


# 118414 04-Aug-2003 marcel

Cleanup the clock code. This includes:
o Remove alpha specific timer code (mc146818A) and compiled-out
calibration of said timer.
o Remove i386 inherited timer code (i8253) and related acquire and
release functions.
o Move sysbeep() from clock.c to machdep.c and have it return
ENODEV. Console beeps should be implemented using ACPI or if no
such device is described, using the sound driver.
o Move the sysctls related to adjkerntz, disable_rtc_set and
wall_cmos_clock from machdep.c to clock.c, where the variables
are.
o Don't hardcode a hz value of 1024 in cpu_initclocks() and don't
bother faking a stathz that's 1/8 of that. Keep it simple: hz
defaults to HZ and stathz equals hz. This is also how it's done
for sparc64.
o Keep a per-CPU ITC counter (pc_clock) and adjustment (pc_clockadj)
to calculate ITC skew and corrections. On average, we adjust the
ITC match register once every ~1500 interrupts for a duration of
2 consequtive interruprs. This is to correct the non-deterministic
behaviour of the ITC interrupt (there's a delay between the match
and the raising of the interrupt).
o Add 4 debugging sysctls to monitor clock behaviour. Those are
debug.clock_adjust_edges, debug.clock_adjust_excess,
debug.clock_adjust_lost and debug.clock_adjust_ticks. The first
counts the individual adjustment cycles (when the skew first
crosses the threshold), the second counts the number of times the
adjustment was excessive (any non-zero value is to be considered
a bug), the third counts lost clock interrupts and the last counts
the number of interrupts for which we applied an adjustment
(debug.clock_adjust_ticks / debug.clock_adjust_edges gives the
avarage duration of an individual adjustment -- should be ~2).

While here, remove some nearby (trivial) left-overs from alpha and
other cleanups.


# 88690 30-Dec-2001 marcel

Remove unused MD fields (pc_pending_ipis, pc_next_asn and
pc_current_asngen) and add SMP specific fields (pc_pcb,
pc_lid and pc_awake).


# 87702 11-Dec-2001 jhb

Overhaul the per-CPU support a bit:

- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.

Tested on: alpha, i386
Reviewed by: peter, jake


# 83643 18-Sep-2001 jhb

- If we ever do the per-cpu KTR stuff, the index won't be volatile as it
will be private to each CPU.
- Re-style(9) the globaldata structures. There really needs to be a MI
struct pcpu that has a MD struct mdpcpu member at some point.


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 81763 16-Aug-2001 obrien

style(9) and make consistent across platforms


# 76078 27-Apr-2001 jhb

Overhaul of the SMP code. Several portions of the SMP kernel support have
been made machine independent and various other adjustments have been made
to support Alpha SMP.

- It splits the per-process portions of hardclock() and statclock() off
into hardclock_process() and statclock_process() respectively. hardclock()
and statclock() call the *_process() functions for the current process so
that UP systems will run as before. For SMP systems, it is simply necessary
to ensure that all other processors execute the *_process() functions when the
main clock functions are triggered on one CPU by an interrupt. For the alpha
4100, clock interrupts are delievered in a staggered broadcast fashion, so
we simply call hardclock/statclock on the boot CPU and call the *_process()
functions on the secondaries. For x86, we call statclock and hardclock as
usual and then call forward_hardclock/statclock in the MD code to send an IPI
to cause the AP's to execute forwared_hardclock/statclock which then call the
*_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
involve less hackery. Now the cpu doing the forward sets any flags, etc. and
sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically
return so that they can execute ast() and don't bother with setting the
astpending or needresched flags themselves. This also removes the loop in
forward_signal() as sched_lock closes the race condition that the loop worked
around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
a process to act on rather than assuming curproc so that they can be used to
implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
header sys/smp.h declares MI SMP variables and API's. The IPI API's from
machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
SLIST of globaldata structures has become MI and moved into subr_smp.c.
Also, the globaldata list is only available if SMP support is compiled in.

Reviewed by: jake, peter
Looked over by: eivind


# 74912 28-Mar-2001 jhb

Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatability,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.


# 72893 22-Feb-2001 jhb

Axe the astpending per-cpu variable.


# 71731 28-Jan-2001 marcel

Add gd_witness_spin_check.


# 71596 24-Jan-2001 dfr

Change cpuno to cpuid.


# 71337 21-Jan-2001 jake

Make intr_nesting_level per-process, rather than per-cpu. Setup
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context. This is also necessary
in order to context switch from sched_ithd() directly.

Reviewed By: peter


# 70952 12-Jan-2001 jake

Remove unused per-cpu variables inside_intr and ss_eflags.


# 66937 10-Oct-2000 dfr

* Add rudimentary DDB support (no kgdb, no backtrace, no single step).
* Track recent changes to SWI code.
* Allocate RIDs for pmaps (untested).
* Implement assembler version of cpu_switch - its cleaner that way.


# 66458 29-Sep-2000 dfr

This is the first snapshot of the FreeBSD/ia64 kernel. This kernel will
not work on any real hardware (or fully work on any simulator). Much more
needs to happen before this is actually functional but its nice to see
the FreeBSD copyright message appear in the ia64 simulator.