History log of /linux-master/drivers/pci/msi/irqdomain.c
Revision Date Author Comments
# db744ddd 15-Jan-2024 Vidya Sagar <vidyas@nvidia.com>

PCI/MSI: Prevent MSI hardware interrupt number truncation

While calculating the hardware interrupt number for a MSI interrupt, the
higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI
domain number gets truncated because of the shifted value casting to return
type of pci_domain_nr() which is 'int'. This for example is resulting in
same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0.

To address this cast the PCI domain number to 'irq_hw_number_t' before left
shifting it to calculate the hardware interrupt number.

Please note that this fixes the issue only on 64-bit systems and doesn't
change the behavior for 32-bit systems i.e. the 32-bit systems continue to
have the issue. Since the issue surfaces only if there are too many PCIe
controllers in the system which usually is the case in modern server
systems and they don't tend to run 32-bit kernels.

Fixes: 3878eaefb89a ("PCI/MSI: Enhance core to support hierarchy irqdomain")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240115135649.708536-1-vidyas@nvidia.com


# 86b4ad7d 24-Aug-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI: Fix typos in docs and comments

Fix typos in docs and comments.

Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>


# 0194425a 24-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Provide IMS (Interrupt Message Store) support

IMS (Interrupt Message Store) is a new specification which allows
implementation specific storage of MSI messages contrary to the
strict standard specified MSI and MSI-X message stores.

This requires new device specific interrupt domains to handle the
implementation defined storage which can be an array in device memory or
host/guest memory which is shared with hardware queues.

Add a function to create IMS domains for PCI devices. IMS domains are using
the new per device domain mechanism and are configured by the device driver
via a template. IMS domains are created as secondary device domains so they
work side on side with MSI[-X] on the same device.

The IMS domains have a few constraints:

- The index space is managed by the core code.

Device memory based IMS provides a storage array with a fixed size
which obviously requires an index. But there is no association between
index and functionality so the core can randomly allocate an index in
the array.

System memory based IMS does not have the concept of an index as the
storage is somewhere in memory. In that case the index is purely
software based to keep track of the allocations.

- There is no requirement for consecutive index ranges

This is currently a limitation of the MSI core and can be implemented
if there is a justified use case by changing the internal storage from
xarray to maple_tree. For now it's single vector allocation.

- The interrupt chip must provide the following callbacks:

- irq_mask()
- irq_unmask()
- irq_write_msi_msg()

- The interrupt chip must provide the following optional callbacks
when the irq_mask(), irq_unmask() and irq_write_msi_msg() callbacks
cannot operate directly on hardware, e.g. in the case that the
interrupt message store is in queue memory:

- irq_bus_lock()
- irq_bus_unlock()

These callbacks are invoked from preemptible task context and are
allowed to sleep. In this case the mandatory callbacks above just
store the information. The irq_bus_unlock() callback is supposed to
make the change effective before returning.

- Interrupt affinity setting is handled by the underlying parent
interrupt domain and communicated to the IMS domain via
irq_write_msi_msg(). IMS domains cannot have a irq_set_affinity()
callback. That's a reasonable restriction similar to the PCI/MSI
device domain implementations.

The domain is automatically destroyed when the PCI device is removed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.904316841@linutronix.de


# 34026364 24-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X

MSI-X vectors can be allocated after the initial MSI-X enablement, but this
needs explicit support of the underlying interrupt domains.

Provide a function to query the ability and functions to allocate/free
individual vectors post-enable.

The allocation can either request a specific index in the MSI-X table or
with the index argument MSI_ANY_INDEX it allocates the next free vector.

The return value is a struct msi_map which on success contains both index
and the Linux interrupt number. In case of failure index is negative and
the Linux interrupt number is 0.

The allocation function is for a single MSI-X index at a time as that's
sufficient for the most urgent use case VFIO to get rid of the 'disable
MSI-X, reallocate, enable-MSI-X' cycle which is prone to lost interrupts
and redirections to the legacy and obviously unhandled INTx.

As single index allocation is also sufficient for the use cases Jason
Gunthorpe pointed out: Allocation of a MSI-X or IMS vector for a network
queue. See Link below.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20211126232735.547996838@linutronix.de
Link: https://lore.kernel.org/r/20221124232326.731233614@linutronix.de


# 73bd063c 24-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Provide prepare_desc() MSI domain op

The setup of MSI descriptors for PCI/MSI-X interrupts depends partially on
the MSI index for which the descriptor is initialized.

Dynamic MSI-X vector allocation post MSI-X enablement allows to allocate
vectors at a given index or at any free index in the available table
range. The latter requires that the descriptor is initialized after the
MSI core has chosen an index.

Implement the prepare_desc() op in the PCI/MSI-X specific msi_domain_ops
which is invoked before the core interrupt descriptor and the associated
Linux interrupt number is allocated.

That callback is also provided for the upcoming PCI/IMS implementations so
the implementation specific interrupt domain can do their domain specific
initialization of the MSI descriptors.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.673658806@linutronix.de


# 45c04024 24-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Remove unused pci_dev_has_special_msi_domain()

The check for special MSI domains like VMD which prevents the interrupt
remapping code to overwrite device::msi::domain is not longer required and
has been replaced by an x86 specific version which is aware of MSI parent
domains.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.093093200@linutronix.de


# 15c72f82 24-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Add support for per device MSI[X] domains

Provide a template and the necessary callbacks to create PCI/MSI and
PCI/MSI-X domains.

The domains are created when MSI or MSI-X is enabled. The domain's lifetime
is either the device lifetime or in case that e.g. MSI-X was tried first
and failed, then the MSI-X domain is removed and a MSI domain is created as
both are mutually exclusive and reside in the default domain ID slot of the
per device domain pointer array.

Also expand pci_msi_domain_supports() to handle feature checks correctly
even in the case that the per device domain was not yet created by checking
the features supported by the MSI parent.

Add the necessary setup calls into the MSI and MSI-X enable code path.
These setup calls are backwards compatible. They return success when there
is no parent domain found, which means the existing global domains or the
legacy allocation path keep just working.

Co-developed-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232325.975388241@linutronix.de


# d3a11dee 24-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Use msi_domain_alloc/free_irqs_all_locked()

Switch to the new domain id aware interfaces to phase out the previous
ones. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124230314.455168748@linutronix.de


# 9c03b258 11-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Remove redundant msi_check() callback

All these sanity checks are now done _before_ any allocation work
happens. No point in doing it twice.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.749446904@linutronix.de


# d2a463b2 11-Nov-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Reject multi-MSI early

When hierarchical MSI interrupt domains are enabled then there is no point
to do tons of work and detect the missing support for multi-MSI late in the
allocation path.

Just query the domain feature flags right away. The query function is going
to be used for other purposes later and has a mode argument which influences
the result:

ALLOW_LEGACY returns true when:
- there is no irq domain attached (legacy support)
- there is a irq domain attached which has the feature flag set

DENY_LEGACY returns only true when:
- there is a irq domain attached which has the feature flag set

This allows to use the function universally without ifdeffery in the
calling code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.574339988@linutronix.de


# b2bdda20 11-Nov-2022 Ahmed S. Darwish <darwi@linutronix.de>

PCI/MSI: Let the MSI core free descriptors

Let the core do the freeing of descriptors and just keep it around for the
legacy case.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.409654736@linutronix.de


# 38c0c10a 11-Nov-2022 Ahmed S. Darwish <darwi@linutronix.de>

PCI/MSI: Use msi_domain_info:: Bus_token

Set the bus token in the msi_domain_info structure and let the core code
handle the update.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.352437595@linutronix.de


# a0af3d11 19-Jan-2022 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Prevent UAF in error path

When the core MSI allocation fails, then the PCI/MSI code uses an already
freed MSI descriptor to unmask the MSI mask register in order to bring it back
into reset state.

Remove MSI_FLAG_FREE_MSI_DESCS from the PCI/MSI irqdomain flags and let the
PCI/MSI code free the MSI descriptors after usage.

Fixes: 0f62d941acf9 ("genirq/msi: Provide msi_domain_alloc/free_irqs_descs_locked()")
Reported-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tong Zhang <ztong0001@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/87r1938vbn.ffs@tglx


# ae24e28f 06-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Use msi_on_each_desc()

Use the new iterator functions which pave the way for dynamically extending
MSI-X vectors.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211206210748.142603657@linutronix.de


# 9fb9eb4b 06-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Let core code free MSI descriptors

Set the domain info flag which tells the core code to free the MSI
descriptors from msi_domain_free_irqs() and add an explicit call to the
core function into the legacy code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211206210748.089085131@linutronix.de


# 5512c5ea 06-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Protect MSI operations

To prepare for dynamic extension of MSI-X vectors, protect the MSI
operations for MSI and MSI-X. This requires to move the invocation of
irq_create_affinity_masks() out of the descriptor lock section to avoid
reverse lock ordering vs. CPU hotplug lock as some callers of the PCI/MSI
allocation interfaces already hold it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211206210747.982292705@linutronix.de


# 7a823443 10-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Provide MSI_FLAG_MSIX_CONTIGUOUS

Provide a domain info flag which makes the core code check for a contiguous
MSI-X index on allocation. That's simpler than checking it at some other
domain callback in architecture code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211210221814.662401116@linutronix.de


# 173ffad7 10-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Use msi_desc::msi_index

The usage of msi_desc::pci::entry_nr is confusing at best. It's the index
into the MSI[X] descriptor table.

Use msi_desc::msi_index which is shared between all MSI incarnations
instead of having a PCI specific storage for no value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211210221814.602911509@linutronix.de


# ffd84485 10-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Let the irq code handle sysfs groups

Set the domain info flag which makes the core code handle sysfs groups and
put an explicit invocation into the legacy code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211210221814.048612053@linutronix.de


# 89033762 06-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

genirq/msi: Handle PCI/MSI allocation fail in core code

Get rid of yet another irqdomain callback and let the core code return the
already available information of how many descriptors could be allocated.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI
Link: https://lore.kernel.org/r/20211206210225.046615302@linutronix.de


# 57ce3a3c 06-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Make pci_msi_domain_check_cap() static

No users outside of that file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211206210224.980989243@linutronix.de


# aa423ac4 06-Dec-2021 Thomas Gleixner <tglx@linutronix.de>

PCI/MSI: Split out irqdomain code

Move the irqdomain specific code into its own file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211206210224.817754783@linutronix.de