History log of /freebsd-9.3-release/sys/x86/x86/io_apic.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 262192 18-Feb-2014 jhb

MFC 261517,261520:
Convert the license on files where I am the sole copyright holder to
2 clause BSD licenses.


# 257496 01-Nov-2013 kib

MFC r257069:
Add ddb 'show ioapic' and 'show all ioapics' commands.


# 248085 09-Mar-2013 marius

MFC: r227309 (partial)

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 216679 23-Dec-2010 jhb

Drop the icu_lock spinlock while pausing briefly after masking the
interrupt in the I/O APIC before moving it to a different CPU. If the
interrupt had been triggered by the I/O APIC after locking icu_lock but
before we masked the pin in the I/O APIC, then this could cause the
interrupt to be pending on the "old" CPU and it would finally trigger
after we had moved the interrupt to the new CPU. This could cause us to
panic as there was no interrupt source associated with the old IDT vector
on the old CPU. Dropping the lock after the interrupt is masked but
before it is moved allows the interrupt to fire and be handled in this
case before it is moved.

Tested by: Daniel Braniss danny of cs huji ac il
MFC after: 1 week


# 214631 01-Nov-2010 jhb

Move <machine/apicreg.h> to <x86/apicreg.h>.


# 208991 10-Jun-2010 mav

Do not disable edge-triggered interrupts before migration. DELAY() with
interrupt disabled highly probable causes interrupt loss.


# 208919 08-Jun-2010 jhb

Move the I/O APIC code to the x86 tree since it is identical on i386 and
amd64.


# 208915 08-Jun-2010 jhb

- Use a bit more care when moving I/O APIC interrupts between CPUs. Mask
the interrupt followed by a brief delay if it is not currently masked
before moving the interrupt.
- Move the icu_lock out of ioapic_program_intpin() and into callers. This
closes a race where ioapic_program_intpin() could use a stale value of
the masked state to compute the masked bit in the register.

Reviewed by: mav
MFC after: 2 weeks


# 195415 06-Jul-2009 jhb

After the per-CPU IDT changes, the IDT vector of an interrupt could change
when the interrupt was moved from one CPU to another. If the interrupt was
enabled, then the old IDT vector needs to be disabled and the new IDT vector
needs to be enabled. This was mostly masked prior to the recent MSI changes
since in the older code almost all allocated IDT vectors were already enabled
and the enabled vectors on the BSP during boot covered enough of the IDT
range. However, after the MSI changes, MSI interrupts that were allocated
but not enabled (e.g. DRM with MSI) during boot could result in an allocated
IDT vector that wasn't enabled. The round-robin at the end of boot could
place another interrupt at the same IDT vector without enabling the IDT
vector causing trap 30 faults.

Fix this by explicitly disabling/enabling the old and new IDT vectors for
enabled interrupt sources when moving an interrupt between CPUs via the
pic_assign_cpu() method. While here, fix a bug in my earlier changes so
that an I/O APIC interrupt pin is left unchanged if ioapic_assign_cpu()
fails to allocate a new IDT vector and returns ENOSPC.

Approved by: re (kensmith)


# 195249 01-Jul-2009 jhb

Improve the handling of cpuset with interrupts.
- For x86, change the interrupt source method to assign an interrupt source
to a specific CPU to return an error value instead of void, thus allowing
it to fail.
- If moving an interrupt to a CPU fails due to a lack of IDT vectors in the
destination CPU, fail the request with ENOSPC rather than panicing.
- For MSI interrupts on x86 (but not MSI-X), only allow cpuset to be used
on the first interrupt in a group. Moving the first interrupt in a group
moves the entire group.
- Use the icu_lock to protect intr_next_cpu() on x86 instead of the
intr_table_lock to fix a LOR introduced in the last set of MSI changes.
- Add a new privilege PRIV_SCHED_CPUSET_INTR for using cpuset with
interrupts. Previously, binding an interrupt to a CPU only performed a
privilege check if the interrupt had an interrupt thread. Interrupts
without a thread could be bound by non-root users as a result.
- If an interrupt event's assign_cpu method fails, then restore the original
cpuset mask for the associated interrupt thread.

Approved by: re (kib)


# 194985 25-Jun-2009 jhb

- Restore the behavior of pre-allocating IDT vectors for MSI interrupts.
This is mostly important for the multiple MSI message case where the
IDT vectors for the entire group need to be allocated together. This
also restores the assumptions made by the PCI bus code that it could
invoke PCIB_MAP_MSI() once MSI vectors were allocated.
- To avoid whiplash with CPU assignments, change the way that CPUs are
assigned to interrupt sources on activation. Instead of assigning the
CPU via pic_assign_cpu() before calling enable_intr(), allow the
different interrupt source drivers to ask the MD interrupt code which
CPU to use when they allocate an IDT vector. I/O APIC interrupt pins
do this in their pic_enable_intr() routines giving the same behavior as
before. MSI sources do it when the IDT vectors are allocated during
msi_alloc() and msix_alloc().
- Change the intr_table_lock from an sx lock to a mutex.

Tested by: rnoland


# 187880 29-Jan-2009 jeff

- Allocate apic vectors on a per-cpu basis. This allows us to allocate
more irqs as we have more cpus. This is principally useful on systems
with msi devices which may want many irqs per-cpu.

Discussed with: jhb
Sponsored by: Nokia


# 170340 05-Jun-2007 jhb

Move a warning under bootverbose as no machines that trigger it have ended
up being broken.


# 169391 08-May-2007 jhb

Minor fixes and tweaks to the x86 interrupt code:
- Split the intr_table_lock into an sx lock used for most things, and a
spin lock to protect intrcnt_index. Originally I had this as a spin lock
so interrupt code could use it to lookup sources. However, we don't
actually do that because it would add a lot of overhead to interrupts,
and if we ever do support removing interrupt sources, we can use other
means to safely do so w/o locking in the interrupt handling code.
- Replace is_enabled (boolean) with is_handlers (a count of handlers) to
determine if a source is enabled or not. This allows us to notice when
a source is no longer in use. When that happens, we now invoke a new
PIC method (pic_disable_intr()) to inform the PIC driver that the
source is no longer in use. The I/O APIC driver frees the APIC IDT
vector when this happens. The MSI driver no longer needs to have a
hack to clear is_enabled during msi_alloc() and msix_alloc() as a result
of this change as well.
- Add an apic_disable_vector() to reset an IDT vector back to Xrsvd to
complement apic_enable_vector() and use it in the I/O APIC and MSI code
when freeing an IDT vector.
- Add a new nexus hook: nexus_add_irq() to ask the nexus driver to add an
IRQ to its irq_rman. The MSI code uses this when it creates new
interrupt sources to let the nexus know about newly valid IRQs.
Previously the msi_alloc() and msix_alloc() passed some extra stuff
back to the nexus methods which then added the IRQs. This approach is
a bit cleaner.
- Change the MSI sx lock to a mutex. If we need to create new sources,
drop the lock, create the required number of sources, then get the lock
and try the allocation again.


# 167747 20-Mar-2007 jhb

Add a new apic0 psuedo-device to claim memory resources for the memory
address ranges used by local and I/O APICs in the system. Some systems
also reserve these ranges as system resources via either PnPBIOS or
ACPI, so this device currently attaches after acpi0 and legacy0 so that
the system resources are given precedence.


# 167247 05-Mar-2007 jhb

Use vm_paddr_t rather than uintptr_t when passing the physical address of
APICs to lapic_init() and ioapic_create().


# 167240 05-Mar-2007 jhb

Add a simple device driver to "eat" any I/O APICs that show up as PCI
devices.

MFC after: 1 week


# 164358 17-Nov-2006 jhb

Trim some noise from bootverbose:
- Drop the printf in intr_machdep.c when we assign an interrupt souce to
a CPU. Each source already has a more detailed printf.
- Don't output a line for each ioapic pin showing its initial state, this
has outlived its usefulness.
- When an APIC enumerator sets the bus, polarity, or trigger mode of an
ioapic pin, just return success without printing anything if the new
value matches the current one.

MFC after: 2 weeks


# 163219 10-Oct-2006 jhb

Change the x86 interrupt code to suspend/resume interrupt controllers
(PICs) rather than interrupt sources. This allows interrupt controllers
with no interrupt pics (such as the 8259As when APIC is in use) to
participate in suspend/resume.
- Always register the 8259A PICs even if we don't use any of their pins.
- Explicitly reset the 8259As on resume on amd64 if 'device atpic' isn't
included.
- Add a "dummy" PIC for the local APIC on the BSP to reset the local APIC
on resume. This gets suspend/resume working with APIC on UP systems.
SMP still needs more work to bring the APs back to life.

The MFC after is tentative.

Tested by: anholt (i386)
Submitted by: Andrea Bittau <a.bittau at cs.ucl.ac.uk> (3)
MFC after: 1 week


# 157541 05-Apr-2006 jhb

Cache the value of the lower half of each I/O APIC redirection table entry
so that we only have to do an ioapic_write() instead of an ioapic_read()
followed by an ioapic_write() every time we mask and unmask level triggered
interrupts. This cuts the execution time for these operations roughly in
half.

Profiled by: Paolo Pisati <p.pisati@oltrelinux.com>
MFC after: 1 week


# 156920 20-Mar-2006 jhb

Drop some unneeded casts since we program the kernel in C rather than C++.


# 156124 28-Feb-2006 jhb

Rework how we wire up interrupt sources to CPUs:
- Throw out all of the logical APIC ID stuff. The Intel docs are somewhat
ambiguous, but it seems that the "flat" cluster model we are currently
using is only supported on Pentium and P6 family CPUs. The other
"hierarchy" cluster model that is supported on all Intel CPUs with
local APICs is severely underdocumented. For example, it's not clear
if the OS needs to glean the topology of the APIC hierarchy from
somewhere (neither ACPI nor MP Table include it) and setup the logical
clusters based on the physical hierarchy or not. Not only that, but on
certain Intel chipsets, even though there were 4 CPUs in a logical
cluster, all the interrupts were only sent to one CPU anyway.
- We now bind interrupts to individual CPUs using physical addressing via
the local APIC IDs. This code has also moved out of the ioapic PIC
driver and into the common interrupt source code so that it can be
shared with MSI interrupt sources since MSI is addressed to APICs the
same way that I/O APIC pins are.
- Interrupt source classes grow a new method pic_assign_cpu() to bind an
interrupt source to a specific local APIC ID.
- The SMP code now tells the interrupt code which CPUs are avaiable to
handle interrupts in a simpler and more intuitive manner. For one thing,
it means we could now choose to not route interrupts to HT cores if we
wanted to (this code is currently in place in fact, but under an #if 0
for now).
- For now we simply do static round-robin of IRQs to CPUs when the first
interrupt handler just as before, with the change that IRQs are now
bound to individual CPUs rather than groups of up to 4 CPUs.
- Because the IRQ to CPU mapping has now been moved up a layer, it would
be easier to manage this mapping from higher levels. For example, we
could allow drivers to specify a CPU affinity map for their interrupts,
or we could allow a userland tool to bind IRQs to specific CPUs.

The MFC is tentative, but I want to see if this fixes problems some folks
had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work
fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose
interrupts).

MFC after: 1 week


# 152528 16-Nov-2005 jhb

Fix a typo in the check for an invalid APIC. If we are told about an
I/O APIC that doesn't exist, then a read of the version register is going
to return -1 which is 0xffffffff not 0xffffff.

Tested on: i386
Tested by: Nikos Ntarmos ntarmos at ceid dot upatras dot gr
MFC after: 1 week


# 152461 15-Nov-2005 andre

Provide a link to the documentation of the I/O APIC at Intel.


# 151979 02-Nov-2005 jhb

Change the x86 code to allocate IDT vectors on-demand when an interrupt
source is first enabled similar to how intr_event's now allocate ithreads
on-demand. Previously, we would map IDT vectors 1:1 to IRQs. Since we
only have 191 available IDT vectors for I/O interrupts, this limited us
to only supporting IRQs 0-190 corresponding to the first 190 I/O APIC
intpins. On many machines, however, each PCI-X bus has its own APIC even
though it only has 1 or 2 devices, thus, we were reserving between 24 and
32 IRQs just for 1 or 2 devices and thus 24 or 32 IDT vectors. With this
change, a machine with 100 IRQs but only 5 in use will only use up 5 IDT
vectors. Also, this change provides an API (apic_alloc_vector() and
apic_free_vector()) that will allow a future MSI interrupt source driver to
request IDT vectors for use by MSI interrupts on x86 machines.

Tested on: amd64, i386


# 151897 31-Oct-2005 rwatson

Normalize a significant number of kernel malloc type names:

- Prefer '_' to ' ', as it results in more easily parsed results in
memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion. Similar changes are required for UMA zone names.


# 148538 29-Jul-2005 jhb

Add a tunable 'hw.apic.enable_extint' that can be set from the loader to
not mask the ExtINT pin on the first I/O APIC as at least one PIII chipset
seems to need this even though all of the pins in the 8259A's are masked.
The default is still to mask the ExtINT pin.

Reported by: Mike Tancsa mike at sentex.net
MFC after: 3 days


# 145080 14-Apr-2005 jhb

Remove support for mixed mode altogether now that we no longer use IRQ 0
when using an APIC. This simplifies the APIC code somewhat and also allows
us to be pedantically more compliant with ACPI which mandates no use of
mixed mode.


# 145057 14-Apr-2005 jhb

Bah, add a missing cast.


# 145054 14-Apr-2005 jhb

If an I/O APIC returns 0xffffffff for its version register after we map it,
assume it is bogus and return NULL instead of trying to parse it as an
APIC.

Inspired by: linux bug reports via njl


# 142256 22-Feb-2005 jhb

If mixed mode is not enabled by the APIC enumerator (MPTable always does,
ACPI MADT only does if the PC-AT flag is set), then don't assume that pin 0
on the first I/O APIC is an ExtINT pin. Instead, assume that it is ISA
IRQ 0.


# 141616 10-Feb-2005 phk

Make a bunch of malloc types static.

Found by: src/tools/tools/kernxref


# 140452 18-Jan-2005 jhb

If a valid ELCR was found, consult it for the trigger mode of ISA
interrupts that have a trigger mode of conforming. This fixes problems on
some older machines that still route PCI devices via ISA interrupts when
using an I/O APIC.

Tested by: Peter Trifonov pvtrifonov at mail dot ru
MFC after: 1 month


# 133017 02-Aug-2004 scottl

Optimize intr_execute_handlers() by combining the pic_disable_source() and
pic_eoi_source() into one call. This halves the number of spinlock operations
and indirect function calls in the normal case of handling a normal (ithread)
interrupt. Optimize the atpic and ioapic drivers to use inlines where
appropriate in supporting the intr_execute_handlers() change.

This knocks 900ns, or roughly 1350 cycles, off of the time spent servicing an
interrupt in the common case on my 1.5GHz P4 uniprocessor system. SMP systems
likely won't see as much of a gain due to the ioapic being more efficient than
the atpic. I'll investigate porting this to amd64 soon.

Reviewed by: jhb


# 130984 23-Jun-2004 jhb

Finally implement bus_config_intr() support for I/O APIC interrupt sources.
This should fix problems with older SMP systems that only have ISA/EISA
IRQs when routing virgin PCI interrupts as well as on other boxes whose
MADT does not have any interrupt override entries for ISA IRQs that are
used to route PCI interrupts even in APIC mode.


# 130980 23-Jun-2004 jhb

Various cleanups in support of a future ioapic_config_intr() function:
- Allow ioapic_set_{nmi,smi,extint}() to be called multiple times on the
same pin so long as the pin's mode is the same as the mode being
requested.
- Add a notion of bus type for the interrupt associated with interrupt pin.
This is needed so that we can force all EISA interrupts to be active high
in the forthcoming ioapic_config_intr().
- Fix a bug for EISA systems that didn't remap IRQs. This would have broken
EISA systems that tried to disable mixed mode for IRQ 0.


# 129964 01-Jun-2004 jhb

- Add a function ioapic_program_intpin() that completely programs an I/O
APIC interrupt pin based on the settings in the corresponding interrupt
source structure.
- Use ioapic_program_intpin() in place of manual frobbing of the intpin
configuration in ioapic_program_destination() and ioapic_register().
- Use ioapic_program_intpin() to implement suspend/resume support for I/O
APICs.


# 129129 11-May-2004 jhb

- Remove a spurious blank line.
- Add a missing static keyword.


# 129097 10-May-2004 jhb

Rework the APIC mixed mode support a bit:
- Require the APIC enumerators to explicitly enable mixed mode by calling
ioapic_enable_mixed_mode(). Calling this function tells the apic driver
that the PC-AT 8259A PICs are present and routable through the first I/O
APIC via an ExtINT pin. The mptable enumerator always calls this
function for now. The MADT enumerator only enables mixed mode if the
PC-AT compatability flag is set in the MADT header.
- Allow mixed mode to be enabled or disabled via a 'hw.apic.mixed_mode'
tunable. By default this tunable is set to 1 (true). The kernel option
NO_MIXED_MODE changes the default to 0 to preserve existing behavior, but
adding 'hw.apic.mixed_mode=0' to loader.conf achieves the same effect.
- Only use mixed mode to route IRQ 0 if it is both enabled by the APIC
enumerator and activated by the loader tunable. Note that both
conditions must be true, so if the APIC enumerator does not enable mixed
mode, then you can't set the tunable to try to override the enumerator.


# 128931 04-May-2004 jhb

- Add a new pic method pic_config_intr() to set the trigger mode and
polarity for a specified IRQ. The intr_config_intr() function wraps
this pic method hiding the IRQ to interrupt source lookup.
- Add a config_intr() method to the atpic(4) driver that reconfigures
the interrupt using the ELCR if possible and returns an error otherwise.
- Add a config_intr() method to the apic(4) driver that just logs any
requests that would change the existing programming under bootverbose.
Currently, the only changes the apic(4) driver receives are due to bugs
in the acpi(4) driver and its handling of link devices, hence the reason
for such requests currently being ignored.
- Have the nexus(4) driver on i386 implement the bus_config_intr() function
by calling intr_config_intr().


# 128930 04-May-2004 jhb

- Change the APIC code to mostly use the recently added intr_trigger
and intr_polarity enums for passing around interrupt trigger modes and
polarity rather than using the magic numbers 0 for level/low and 1 for
edge/high.
- Convert the mptable parsing code to use the new ELCR wrapper code rather
than reading the ELCR directly. Also, use the ELCR settings to control
both the trigger and polarity of EISA IRQs instead of just the trigger
mode.
- Rework the MADT's handling of the ACPI SCI again:
- If no override entry for the SCI exists at all, use level/low trigger
instead of the default edge/high used for ISA IRQs.
- For the ACPI SCI, use level/low values for conforming trigger and
polarity rather than the edge/high values we use for all other ISA
IRQs.
- Rework the tunables available to override the MADT. The
hw.acpi.force_sci_lo tunable is no longer supported. Instead, there
are now two tunables that can independently override the trigger mode
and/or polarity of the SCI. The hw.acpi.sci.trigger tunable can be
set to either "edge" or "level", and the hw.acpi.sci.polarity tunable
can be set to either "high" or "low". To simulate hw.acpi.force_sci_lo,
set hw.acpi.sci.trigger to "level" and hw.acpi.sci.polarity to "low".
If you are having problems with ACPI either causing an interrupt storm
or not working at all (e.g., the power button doesn't turn invoke a
shutdown -p now), you can try tweaking these two tunables to find the
combination that works.


# 122572 12-Nov-2003 jhb

- Move manipulation of td_intr_nesting_level out of assembly interrupt
vector stubs and into the C functions they call.
- Move disabling and EOIing of interrupt sources out of PIC driver entry
points and into intr_execute_handlers(). Intr_execute_handlers() only
disables a source for an interrupt if it is a stray interrupt or has
threaded handlers. Sources with fast handlers no longer disable (mask)
the source while executing the handlers.
- Move the setting of clkintr_pending into intr_execute_handlers() and set
the variable for any interrupt source with a vector of 0. (Should only
be true for IRQ 0.) This fixes clkintr_pending in the NO_MIXED_MODE
case.
- Implement lapic_eoi() and use it to implement ioapic_eoi_source().
- Rename atpic_sched_ithd() to atpic_handle_intr() since it is used to
handle all atpic interrupts and not just threaded ones.

Inspired by: peter's changes to amd64 in p4 (1)
Requested by: bde (2)


# 122268 07-Nov-2003 jhb

Dump the trigger and polarity of each intpin's default setting in the
bootverbose output.


# 122148 05-Nov-2003 jhb

Two style nits.


# 122124 05-Nov-2003 jhb

- Adjust some of the bitfields in the ioapic_intsrc struct to be unsigned
rather than signed. This fixes some cosmetics such as verbose printf's
for IRQs greater than 127.
- The calculation for next_ioapic_base was also adjusted so that it will
only complain once for each hole in the IRQs provided by ACPI for IO
APICs.

Reported by: Michal Mertl <mime@traveller.cz>


# 122064 04-Nov-2003 jhb

Tweak the version string output for ioapic devices.


# 121986 03-Nov-2003 jhb

New APIC support code:

- The apic interrupt entry points have been rewritten so that each entry
point can serve 32 different vectors. When the entry is executed, it
uses one of the 32-bit ISR registers to determine which vector in its
assigned range was triggered. Thus, the apic code can support 159
different interrupt vectors with only 5 entry points.
- We now always to disable the local APIC to work around an errata in
certain PPros and then re-enable it again if we decide to use the APICs
to route interrupts.
- We no longer map IO APICs or local APICs using special page table
entries. Instead, we just use pmap_mapdev(). We also no longer
export the virtual address of the local APIC as a global symbol to
the rest of the system, but only in local_apic.c. To aid this, the
APIC ID of each CPU is exported as a per-CPU variable.
- Interrupt sources are provided for each intpin on each IO APIC.
Currently, each source is given a unique interrupt vector meaning that
PCI interrupts are not shared on most machines with an I/O APIC.
That mapping for interrupt sources to interrupt vectors is up to the
APIC enumerator driver however.
- We no longer probe to see if we need to use mixed mode to route IRQ 0,
instead we always use mixed mode to route IRQ 0 for now. This can be
disabled via the 'NO_MIXED_MODE' kernel option.
- The npx(4) driver now always probes to see if a built-in FPU is present
since this test can now be performed with the new APIC code. However,
an SMP kernel will panic if there is more than one CPU and a built-in
FPU is not found.
- PCI interrupts are now properly routed when using APICs to route
interrupts, so remove the hack to psuedo-route interrupts when the
intpin register was read.
- The apic.h header was moved to apicreg.h and a new apicvar.h header
that declares the APIs used by the new APIC code was added.