History log of /linux-master/drivers/pci/access.c
Revision Date Author Comments
# c9d52fb3 30-May-2024 Dan Williams <dan.j.williams@intel.com>

PCI: Revert the cfg_access_lock lockdep mechanism

While the experiment did reveal that there are additional places that are
missing the lock during secondary bus reset, one of the places that needs
to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep
annotation.

Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is
currently dependent on the fact that the device_lock() is marked
lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that
annotation, pci_bus_lock() would need to use something like a new
pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the
topology, and a hope that the depth of a PCI tree never exceeds the max
value for a lockdep subclass.

The alternative to ripping out the lockdep coverage would be to deploy a
dynamic lock key for every PCI device. Unfortunately, there is evidence
that increasing the number of keys that lockdep needs to track to be
per-PCI-device is prohibitively expensive for something like the
cfg_access_lock.

The main motivation for adding the annotation in the first place was to
catch unlocked secondary bus resets, not necessarily catch lock ordering
problems between cfg_access_lock and other locks. Solve that narrower
problem with follow-on patches, and just due to targeted revert for now.

Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com
Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
Reported-by: Imre Deak <imre.deak@intel.com>
Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>


# 7e89efc6 02-May-2024 Dave Jiang <dave.jiang@intel.com>

PCI: Lock upstream bridge for pci_reset_function()

Fix a long-standing locking gap for missing pci_cfg_access_lock() while
manipulating bridge reset registers and configuration during
pci_reset_bus_function().

If there is an upstream bridge, lock it before locking the device itself.
pci_dev_lock() calls pci_cfg_access_lock(), which blocks the writing of PCI
config space by user space.

Add lockdep assertion via pci_dev->cfg_access_lock to verify
pci_dev->block_cfg_access is set.

Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240502165851.1948523-3-dave.jiang@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 4c407392 28-Apr-2024 Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

PCI: Clean up accessor macro formatting

Clean up formatting of PCI accessor macros:

- Put return statements on own line

- Add a few empty lines for better readability

- Align macro continuation backslashes

- Correct function call argument indentation level

- Reorder variable declarations to order of use

- Drop unnecessary variable initialization

Link: https://lore.kernel.org/r/20240429094707.2529-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: drop initialization, tweak variables to order of use]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# ac160871 07-Dec-2023 Shuai Xue <xueshuai@linux.alibaba.com>

PCI: Move pci_clear_and_set_dword() helper to PCI header

The clear and set pattern is commonly used for accessing PCI config,
move the helper pci_clear_and_set_dword() from aspm.c into PCI header.
In addition, rename to pci_clear_and_set_config_dword() to retain the
"config" information and match the other accessors.

No functional change intended.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231208025652.87192-4-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>


# 294c1e4f 24-Aug-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI: Simplify pcie_capability_clear_and_set_word() control flow

Return early for errors in pcie_capability_clear_and_set_word_unlocked()
and pcie_capability_clear_and_set_dword() to simplify the control flow.

No functional change intended.

Link: https://lore.kernel.org/r/20230824193712.542167-13-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>


# 5e70d0ac 17-Jul-2023 Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

PCI: Add locking to RMW PCI Express Capability Register accessors

Many places in the kernel write the Link Control and Root Control PCI
Express Capability Registers without proper concurrency control and this
could result in losing the changes one of the writers intended to make.

Add pcie_cap_lock spinlock into the struct pci_dev and use it to protect
bit changes made in the RMW capability accessors. Protect only a selected
set of registers by differentiating the RMW accessor internally to
locked/unlocked variants using a wrapper which has the same signature as
pcie_capability_clear_and_set_word(). As the Capability Register (pos)
given to the wrapper is always a constant, the compiler should be able to
simplify all the dead-code away.

So far only the Link Control Register (ASPM, hotplug, link retraining,
various drivers) and the Root Control Register (AER & PME) seem to
require RMW locking.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Fixes: c7f486567c1d ("PCI PM: PCIe PME root port service driver")
Fixes: f12eb72a268b ("PCI/ASPM: Use PCI Express Capability accessors")
Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support")
Fixes: affa48de8417 ("staging/rdma/hfi1: Add support for enabling/disabling PCIe ASPM")
Fixes: 849a9366cba9 ("misc: rtsx: Add support new chip rts5228 mmc: rtsx: Add support MMC_CAP2_NO_MMC")
Fixes: 3d1e7aa80d1c ("misc: rtsx: Use pcie_capability_clear_and_set_word() for PCI_EXP_LNKCTL")
Fixes: c0e5f4e73a71 ("misc: rtsx: Add support for RTS5261")
Fixes: 3df4fce739e2 ("misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG")
Fixes: 121e9c6b5c4c ("misc: rtsx: modify and fix init_hw function")
Fixes: 19f3bd548f27 ("mfd: rtsx: Remove LCTLR defination")
Fixes: 773ccdfd9cc6 ("mfd: rtsx: Read vendor setting from config space")
Fixes: 8275b77a1513 ("mfd: rts5249: Add support for RTS5250S power saving")
Fixes: 5da4e04ae480 ("misc: rtsx: Add support for RTS5260")
Fixes: 0f49bfbd0f2e ("tg3: Use PCI Express Capability accessors")
Fixes: 5e7dfd0fb94a ("tg3: Prevent corruption at 10 / 100Mbps w CLKREQ")
Fixes: b726e493e8dc ("r8169: sync existing 8168 device hardware start sequences with vendor driver")
Fixes: e6de30d63eb1 ("r8169: more 8168dp support.")
Fixes: 8a06127602de ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards")
Fixes: 6f461f6c7c96 ("e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata")
Fixes: 1eae4eb2a1c7 ("e1000e: Disable L1 ASPM power savings for 82573 mobile variants")
Fixes: 8060e169e02f ("ath9k: Enable extended synch for AR9485 to fix L0s recovery issue")
Fixes: 69ce674bfa69 ("ath9k: do btcoex ASPM disabling at initialization time")
Fixes: f37f05503575 ("mt76: mt76x2e: disable pcie_aspm by default")
Link: https://lore.kernel.org/r/20230717120503.15276-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org>


# 503fa236 17-Sep-2022 Maciej W. Rozycki <macro@orcam.me.uk>

PCI: Access Link 2 registers only for devices with Links

PCIe r2.0, sec 7.8 added Link Capabilities/Status/Control 2 registers to
the PCIe Capability with Capability Version 2.

Previously we assumed these registers were implemented for all PCIe
Capabilities of version 2 or greater, but in fact they are only
implemented for devices with Links.

Update pcie_capability_reg_implemented() to check whether the device has
a Link.

[bhelgaas: commit log, squash export]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209100057070.2275@angie.orcam.me.uk
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209100057300.2275@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 92c45b63 05-Aug-2020 Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>

PCI: Reduce warnings on possible RW1C corruption

For hardware that only supports 32-bit writes to PCI there is the
possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
messages was introduced by fb2659230120, but rate-limiting is not the best
choice here. Some devices may not show the warnings they should if another
device has just produced a bunch of warnings. Also, the number of messages
can be a nuisance on devices which are otherwise working fine.

Change the ratelimit to a single warning per bus. This ensures no bus is
'starved' of emitting a warning and also that there isn't a continuous
stream of warnings. It would be preferable to have a warning per device,
but the pci_dev structure is not available here, and a lookup from devfn
would be far too slow.

Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
Link: https://lore.kernel.org/r/20200806041455.11070-1-mark.tomlinson@alliedtelesis.co.nz
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Scott Branden <scott.branden@broadcom.com>


# 289e3ea3 18-Nov-2021 Naveen Naidu <naveennaidu479@gmail.com>

PCI: Use PCI_ERROR_RESPONSE to identify config read errors

Include PCI_ERROR_RESPONSE along with 0xFFFF and 0xFFFFFFFF in the comment
about identifying config read errors. This makes checks for config read
errors easier to find. Comment change only.

Link: https://lore.kernel.org/r/866e2db544df45af70df7e64659bf02e03998ae3.1637243717.git.naveennaidu479@gmail.com
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 316df706 18-Nov-2021 Naveen Naidu <naveennaidu479@gmail.com>

PCI: Drop error data fabrication when config read fails

If config pci_ops.read() methods return failure, the PCI_OP_READ() and
PCI_USER_READ_CONFIG() wrappers use PCI_SET_ERROR_RESPONSE() to set the
data value, so there's no need to set it in the pci_ops.read() methods
themselves.

Drop the unnecessary data value fabrication when pci_ops.read() fails.

Link: https://lore.kernel.org/r/1b2edb060cf19b45f70645b331e6c08c9ba798c0.1637243717.git.naveennaidu479@gmail.com
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>


# 9bc9310c 18-Nov-2021 Naveen Naidu <naveennaidu479@gmail.com>

PCI: Use PCI_SET_ERROR_RESPONSE() for disconnected devices

A config read from a PCI device that doesn't exist or doesn't respond
causes a PCI error. There's no real data to return to satisfy the CPU read,
so most hardware fabricates ~0 data.

Use PCI_SET_ERROR_RESPONSE() to set the error response when we think the
device has already been disconnected.

This helps unify PCI error response checking and make error checks
consistent and easier to find.

Link: https://lore.kernel.org/r/29db0a6874716db80757e4e3cdd03562f13eb0cb.1637243717.git.naveennaidu479@gmail.com
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# f4f7eb43 18-Nov-2021 Naveen Naidu <naveennaidu479@gmail.com>

PCI: Set error response data when config read fails

When a PCI config read fails, most PCI host bridges fabricate ~0 data to
complete the CPU read. But some host bridges do not; their drivers may
only return an error from the pci_ops.read() method.

In PCI_OP_READ() and PCI_USER_READ_CONFIG(), use PCI_SET_ERROR_RESPONSE()
to set the data value to indicate an error when pci_ops.read() fails.

This means the host bridge driver no longer needs to fabricate error data
when they detect errors.

This makes error response fabrication consistent and helps in removal of a
lot of repeated code.

Suggested-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/4188fc5465631ce0d472d1423de3d9fb2f09b8ff.1637243717.git.naveennaidu479@gmail.com
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Pali Rohár <pali@kernel.org>


# 2a7e32d0 25-Jun-2020 Bjorn Helgaas <bhelgaas@google.com>

PCI: Fix pci_cfg_wait queue locking problem

The pci_cfg_wait queue is used to prevent user-space config accesses to
devices while they are recovering from reset.

Previously we used these operations on pci_cfg_wait:

__add_wait_queue(&pci_cfg_wait, ...)
__remove_wait_queue(&pci_cfg_wait, ...)
wake_up_all(&pci_cfg_wait)

The wake_up acquires the wait queue lock, but the add and remove do not.

Originally these were all protected by the pci_lock, but cdcb33f98244
("PCI: Avoid possible deadlock on pci_lock and p->pi_lock"), moved
wake_up_all() outside pci_lock, so it could race with add/remove
operations, which caused occasional kernel panics, e.g., during vfio-pci
hotplug/unplug testing:

Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000

Resolve this by using wait_event() instead of __add_wait_queue() and
__remove_wait_queue(). The wait queue lock is held by both wait_event()
and wake_up_all(), so it provides mutual exclusion.

Fixes: cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock")
Link: https://lore.kernel.org/linux-pci/79827f2f-9b43-4411-1376-b9063b67aee3@huawei.com/T/#u
Based-on: https://lore.kernel.org/linux-pci/20191210031527.40136-1-zhengxiang9@huawei.com/
Based-on-patch-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiang Zheng <zhengxiang9@huawei.com>
Cc: Heyi Guo <guoheyi@huawei.com>
Cc: Biaoxiang Ye <yebiaoxiang@huawei.com>


# b9153581 15-Jun-2020 Bolarinwa Olayemi Saheed <refactormyself@gmail.com>

PCI: Align PCIe capability and PCI accessor return values

The PCI config accessors (pci_read_config_word(), et al) return
PCIBIOS_SUCCESSFUL (zero) or positive error values like
PCIBIOS_FUNC_NOT_SUPPORTED.

The PCIe capability accessors similarly return PCIBIOS errors, but in
addition, they can return -EINVAL. This makes it harder than it should be
to check for errors.

Return PCIBIOS_BAD_REGISTER_NUMBER instead of -EINVAL in all PCIe
capability accessors.

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Link: https://lore.kernel.org/r/20200615073225.24061-9-refactormyself@gmail.com
Signed-off-by: Bolarinwa Olayemi Saheed <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# af65d1ad 18-Oct-2019 Patel, Mayurkumar <mayurkumar.patel@intel.com>

PCI/AER: Save AER Capability for suspend/resume

Previously we did not save and restore the AER configuration on
suspend/resume, so the configuration may be lost after resume.

Save the AER configuration during suspend and restore it during resume.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/92EBB4272BF81E4089A7126EC1E7B28492C3B007@IRSMSX101.ger.corp.intel.com
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>


# 984998e3 22-Aug-2019 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: Make pcie_downstream_port() available outside of access.c

pcie_downstream_port() is useful in other places where code needs to
determine whether the PCIe port is downstream so make it available outside
of access.c.

Link: https://lore.kernel.org/r/20190822085553.62697-1-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>


# 5180fd91 18-Sep-2018 Keith Busch <kbusch@kernel.org>

PCI: Uninline PCI bus accessors for better ftracing

The PCI bus config accessors could be inlined into other accessor
functions, which makes it so they can't be traced. Force them to never be
inlined so that ftrace can hook into these functions.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# df62ab5e 09-Mar-2018 Bjorn Helgaas <bhelgaas@google.com>

PCI: Tidy comments

Remove pointless comments that tell us the file name, remove blank line
comments, follow multi-line comment conventions. No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# f0eb77ae 19-Mar-2018 Bjorn Helgaas <bhelgaas@google.com>

PCI/VPD: Move VPD access code to vpd.c

Move the VPD-related code from access.c to vpd.c. The goal is to
encapsulate all the VPD code and structures in vpd.c.

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 7328c8f4 26-Jan-2018 Bjorn Helgaas <bhelgaas@google.com>

PCI: Add SPDX GPL-2.0 when no license was specified

b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to
files with no license") added SPDX GPL-2.0 to several PCI files that
previously contained no license information.

Add SPDX GPL-2.0 to all other PCI files that did not contain any license
information and hence were under the default GPL version 2 license of the
kernel.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 7506dc79 17-Jan-2018 Frederick Lawler <fred@fredlawl.com>

PCI: Add wrappers for dev_printk()

Add PCI-specific dev_printk() wrappers and use them to simplify the code
slightly. No functional change intended.

Signed-off-by: Frederick Lawler <fred@fredlawl.com>
[bhelgaas: squash into one patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 714fe383 16-Mar-2017 Thomas Gleixner <tglx@linutronix.de>

PCI: Provide Kconfig option for lockless config space accessors

The generic PCI configuration space accessors are globally serialized via
pci_lock. On larger systems this causes massive lock contention when the
configuration space has to be accessed frequently. One such access pattern
is the Intel Uncore performance counter unit.

Provide a kernel config option which can be selected by an architecture
when the low level PCI configuration space accessors in the architecture
use their own serialization or can operate completely lockless.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20170316215057.205961140@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 449e2f9e 23-May-2017 Brian Norris <briannorris@chromium.org>

PCI: Make error code types consistent in pci_{read,write}_config_*

Callers normally treat the config space accessors as returning PCBIOS_*
error codes, not Linux error codes (or they don't look at them at all). We
have pcibios_err_to_errno() in case the error code needs to be translated.

Fixes: 4b1038834739 ("PCI: Don't attempt config access to disconnected devices")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>


# 9b70ae49 19-Apr-2017 Bjorn Helgaas <bhelgaas@google.com>

PCI: Include PCI-to-PCIe bridges as "Downstream Ports"

A PCI/PCI-X to PCI Express bridge, sometimes referred to as a "reverse
bridge", is a bridge with conventional PCI or PCI-X on its primary side and
a PCI Express Port on its secondary (downstream) side.

That PCIe Port is a Downstream Port and could be connected to a slot, just
like a Root Port or a Switch Downstream Port. Make pcie_downstream_port()
return true for them, so we can access the Slot registers in the PCIe
capability.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 0b131b13 27-Mar-2017 Brian Norris <briannorris@chromium.org>

PCI: Fix typo pci_cfg_access_lock() comment

There is no pci_cfg_access_unlocked(). I think the author meant
pci_cfg_access_unlock().

Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 4b103883 29-Mar-2017 Keith Busch <kbusch@kernel.org>

PCI: Don't attempt config access to disconnected devices

If we've detected the PCI device is disconnected, there is no need to
attempt to access its config space since we know the operation will fail.
Make all the config reads and writes return -ENODEV error immediately when
in such a state.

If a caller requests a config read to a disconnected device, return a data
value of all 1's. This is the same as what hardware is expected to return
when accessing a removed device, but software can do this faster without
relying on hardware.

Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>


# d3881e50 07-Feb-2017 Keith Busch <kbusch@kernel.org>

PCI: Export PCI device config accessors

Replace the inline PCI device config read and write accessors with exported
functions. This is preparing for these functions to make use of private
data.

Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>


# 174cd4b1 02-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h>

Fix up affected files that include this signal functionality via sched.h.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 4f69bd16 28-Nov-2016 Matthew R. Ochs <mrochs@linux.vnet.ibm.com>

PCI: Increase VPD access timeout to 125ms

The PCI core uses a fixed 50ms timeout when waiting for VPD accesses to
complete. When an access does not complete within this period, a warning
is logged and an error returned to the caller.

While this default timeout is valid for most hardware, some devices can
experience longer access delays under certain circumstances. For example,
one of the IBM CXL Flash devices can take up to ~120ms in a worst-case
scenario. These types of devices can benefit from an extended timeout.

To support devices with a longer access delay, increase the timeout in
pci_vpd_wait() to 125ms. The PCI specification is silent with respect to
VPD delays, therefore there is no concern for violating a threshold.

Tested-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>


# cdcb33f9 13-Jan-2017 Bjorn Helgaas <bhelgaas@google.com>

PCI: Avoid possible deadlock on pci_lock and p->pi_lock

pci_lock is an IRQ-safe spinlock that protects all accesses to PCI
configuration space (see PCI_OP_READ() and PCI_OP_WRITE() in pci/access.c).

The pci_cfg_access_unlock() path acquires pci_lock, then p->pi_lock (inside
wake_up_all()). According to lockdep, there is a possible path involving
snbep_uncore_pci_read_counter() that could acquire them in the reverse
order: acquiring p->pi_lock, then pci_lock, which could result in a
deadlock. Lockdep details are in the bugzilla below.

Avoid the possible deadlock by dropping pci_lock before waking up any
config access waiters.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=192901
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# fb265923 31-Oct-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Warn on possible RW1C corruption for sub-32 bit config writes

Hardware that supports only 32-bit config writes is not spec-compliant.
For example, if software performs a 16-bit write, we must do a 32-bit read,
merge in the 16 bits we intend to write, followed by a 32-bit write. If
the 16 bits we *don't* intend to write happen to have any RW1C (write-one-
to-clear) bits set, we just inadvertently cleared something we shouldn't
have.

Add a rate-limited warning when we do sub-32 bit config writes. Remove
similar probe-time warnings from some of the affected host bridge drivers.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Enthusiastically-Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com> # rockchip
Acked-by: Thierry Reding <treding@nvidia.com>


# cb92148b 15-Apr-2016 Hariprasad Shenai <hariprasad@chelsio.com>

PCI: Add pci_set_vpd_size() to set VPD size

After 104daa71b396 ("PCI: Determine actual VPD size on first access"), the
PCI core computes the valid VPD size by parsing the VPD starting at offset
0x0. We don't attempt to read past that valid size because that causes
some devices to crash.

However, some devices do have data past that valid size. For example,
Chelsio adapters contain two VPD structures, and the driver needs both of
them.

Add pci_set_vpd_size(). If a driver knows it is safe to read past the end
of the VPD data structure at offset 0, it can use pci_set_vpd_size() to
allow access to as much data as it needs.

[bhelgaas: changelog, split patches, rename to pci_set_vpd_size() and
return int (not ssize_t)]
Fixes: 104daa71b396 ("PCI: Determine actual VPD size on first access")
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# c521b014 22-Feb-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Sleep rather than busy-wait for VPD access completion

Use usleep_range() instead of udelay() while waiting for a VPD access to
complete. This is not a performance path, so no need to hog the CPU.

Rationale for usleep_range() parameters:

We clear PCI_VPD_ADDR_F for a read (or set it for a write), then wait for
the device to change it. For a device that updates PCI_VPD_ADDR between
our config write and subsequent config read, we won't sleep at all and
can get the device's maximum rate.

Sleeping a minimum of 10 usec per 4-byte access limits throughput to
about 400Kbytes/second. VPD is small (32K bytes at most), and most
devices use only a fraction of that.

We back off exponentially up to 1024 usec per iteration. If we reach
1024, we've already waited up to 1008 usec (16 + 32 + ... + 512), so if
we miss an update and wait an extra 1024 usec, we can still get about
1/2 of the device's maximum rate.

Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>


# 408641e9 22-Feb-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Fold struct pci_vpd_pci22 into struct pci_vpd

We only support one flavor of VPD, so there's no need to complicate things
by having a "generic" struct pci_vpd and a more specific struct
pci_vpd_pci22.

Fold struct pci_vpd_pci22 directly into struct pci_vpd.

[bhelgaas: remove NULL check before kfree of dev->vpd (per kfreeaddr.cocci)]
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>


# f1cd93f9 22-Feb-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Rename VPD symbols to remove unnecessary "pci22"

There's only one kind of VPD, so we don't need to qualify it as "the
version described by PCI spec rev 2.2."

Rename the following symbols to remove unnecessary "pci22":

PCI_VPD_PCI22_SIZE -> PCI_VPD_MAX_SIZE
pci_vpd_pci22_size() -> pci_vpd_size()
pci_vpd_pci22_wait() -> pci_vpd_wait()
pci_vpd_pci22_read() -> pci_vpd_read()
pci_vpd_pci22_write() -> pci_vpd_write()
pci_vpd_pci22_ops -> pci_vpd_ops
pci_vpd_pci22_init() -> pci_vpd_init()

Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>


# da006847 22-Feb-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Remove struct pci_vpd_ops.release function pointer

The struct pci_vpd_ops.release function pointer is always
pci_vpd_pci22_release(), so there's no need for the flexibility of a
function pointer.

Inline the pci_vpd_pci22_release() body into pci_vpd_release() and remove
pci_vpd_pci22_release() and the struct pci_vpd_ops.release function
pointer.

Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>


# 64379079 22-Feb-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Move pci_vpd_release() from header file to pci/access.c

Move pci_vpd_release() so it's next to the other VPD functions. This puts
it next to pci_vpd_pci22_init(), which allocates the space freed by
pci_vpd_release().

Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>


# fc0a407e 22-Feb-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Move pci_read_vpd() and pci_write_vpd() close to other VPD code

pci_read_vpd() and pci_write_vpd() were stranded in the middle of config
accessor functions. Move them close to the other VPD code in the file.
No functional change.

Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>


# 104daa71 15-Feb-2016 Hannes Reinecke <hare@suse.de>

PCI: Determine actual VPD size on first access

PCI-2.2 VPD entries have a maximum size of 32k, but might actually be
smaller than that. To figure out the actual size one has to read the VPD
area until the 'end marker' is reached.

Per spec, reading outside of the VPD space is "not allowed." In practice,
it may cause simple read errors or even crash the card. To make matters
worse not every PCI card implements this properly, leaving us with no 'end'
marker or even completely invalid data.

Try to determine the size of the VPD data when it's first accessed. If no
valid data can be read an I/O error will be returned when reading or
writing the sysfs attribute.

As the amount of VPD data is unknown initially the size of the sysfs
attribute will always be set to '0'.

[bhelgaas: changelog, use 0/1 (not false/true) for bitfield, tweak
pci_vpd_pci22_read() error checking]
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>


# c5563887 22-Feb-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: Use bitfield instead of bool for struct pci_vpd_pci22.busy

Make struct pci_vpd_pci22.busy a 1-bit field instead of a bool. We intend
to add another flag, and two bitfields are cheaper than two bools.

Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>


# ff3ce480 27-Dec-2015 Bogicevic Sasa <brutallesale@gmail.com>

PCI: Fix all whitespace issues

Fix all whitespace issues (missing or needed whitespace) in all files in
drivers/pci. Code is compiled with allyesconfig before and after code
changes and objects are recorded and checked with objdiff and they are not
changed after this commit.

Signed-off-by: Bogicevic Sasa <brutallesale@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# da2d03ea 15-Sep-2015 Alex Williamson <alex.williamson@redhat.com>

PCI: Use function 0 VPD for identical functions, regular VPD for others

932c435caba8 ("PCI: Add dev_flags bit to access VPD through function 0")
added PCI_DEV_FLAGS_VPD_REF_F0. Previously, we set the flag on every
non-zero function of quirked devices. If a function turned out to be
different from function 0, i.e., it had a different class, vendor ID, or
device ID, the flag remained set but we didn't make VPD accessible at all.

Flip this around so we only set PCI_DEV_FLAGS_VPD_REF_F0 for functions that
are identical to function 0, and allow regular VPD access for any other
functions.

[bhelgaas: changelog, stable tag]
Fixes: 932c435caba8 ("PCI: Add dev_flags bit to access VPD through function 0")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
Acked-by: Mark Rustad <mark.d.rustad@intel.com>
CC: stable@vger.kernel.org


# 9d924075 15-Sep-2015 Alex Williamson <alex.williamson@redhat.com>

PCI: Fix devfn for VPD access through function 0

Commit 932c435caba8 ("PCI: Add dev_flags bit to access VPD through function
0") passes PCI_SLOT(devfn) for the devfn parameter of pci_get_slot().
Generally this works because we're fairly well guaranteed that a PCIe
device is at slot address 0, but for the general case, including
conventional PCI, it's incorrect. We need to get the slot and then convert
it back into a devfn.

Fixes: 932c435caba8 ("PCI: Add dev_flags bit to access VPD through function 0")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
Acked-by: Mark Rustad <mark.d.rustad@intel.com>
CC: stable@vger.kernel.org


# 932c435c 13-Jul-2015 Mark Rustad <mark.d.rustad@intel.com>

PCI: Add dev_flags bit to access VPD through function 0

Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
function 0 to provide VPD access on other functions. This is for hardware
devices that provide copies of the same VPD capability registers in
multiple functions. Because the kernel expects that each function has its
own registers, both the locking and the state tracking are affected by VPD
accesses to different functions.

On such devices for example, if a VPD write is performed on function 0,
*any* later attempt to read VPD from any other function of that device will
hang. This has to do with how the kernel tracks the expected value of the
F bit per function.

Concurrent accesses to different functions of the same device can not only
hang but also corrupt both read and write VPD data.

When hangs occur, typically the error message:

vpd r/w failed. This is likely a firmware bug on this device.

will be seen.

Never set this bit on function 0 or there will be an infinite recursion.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
CC: stable@vger.kernel.org


# ffb4d602 24-Jun-2015 Bjorn Helgaas <bhelgaas@google.com>

PCI: Add pcie_downstream_port() (true for Root and Switch Downstream Ports)

As used in the PCIe spec, "Downstream Port" includes both Root Ports and
Switch Downstream Ports. We sometimes checked for PCI_EXP_TYPE_DOWNSTREAM
when we should have checked for PCI_EXP_TYPE_ROOT_PORT or
PCI_EXP_TYPE_DOWNSTREAM.

For a Root Port without a slot, the effect of this was that using
pcie_capability_read_word() to read PCI_EXP_SLTSTA returned zero instead of
showing the Presence Detect State bit hardwired to one as the PCIe Spec,
r3.0, sec 7.8, requires. (This read is completed in software because
previous PCIe spec versions didn't require PCI_EXP_SLTSTA to exist at all.)

Nothing in the kernel currently depends on this (pciehp only reads
PCI_EXP_SLTSTA on ports with slots), so this is a cleanup and not a
functional change.

Add a pcie_downstream_port() helper function and use it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


# 1f94a94f 09-Jan-2015 Rob Herring <robh@kernel.org>

PCI: Add generic config accessors

Many PCI controllers' configuration space accesses are memory-mapped and
vary only in address calculation and access checks. There are 2 main
access methods: a decoded address space such as ECAM or a single address
and data register similar to x86. This implementation can support both
cases as well as be used in cases that need additional pre- or post-access
handling.

Add a new pci_ops member, map_bus, which can do access checks and any
necessary setup. It returns the address to use for the configuration space
access. The access types supported are 32-bit only accesses or correct
byte, word, or dword sized accesses.

Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>


# 7a1562d4 11-Nov-2014 Yinghai Lu <yinghai@kernel.org>

PCI: Apply _HPX Link Control settings to all devices with a link

Previously we applied _HPX type 2 record Link Control register settings
only to bridges with a subordinate bus. But it's better to apply them to
all devices with a link because if the subordinate bus has not been
allocated yet, we won't apply settings to the device.

Use pcie_cap_has_lnkctl() to determine whether the device has a Link
Control register instead of looking at dev->subordinate.

[bhelgaas: changelog]
Fixes: 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 227f0647 18-Apr-2014 Ryan Desfosses <ryan@desfo.org>

PCI: Merge multi-line quoted strings

Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.

No functional change.

[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# d97ffe23 20-May-2014 Gavin Shan <gwshan@linux.vnet.ibm.com>

PCI: Fix return value from pci_user_{read,write}_config_*()

The PCI user-space config accessors pci_user_{read,write}_config_*() return
negative error numbers, which were introduced by commit 34e3207205ef
("PCI: handle positive error codes"). That patch converted all positive
error numbers from platform-specific PCI config accessors to -EINVAL, which
means the callers don't know anything about the specific cause of the
failure.

The patch fixes the issue by converting the positive PCIBIOS_* error values
to generic negative error numbers with pcibios_err_to_errno().

[bhelgaas: changelog]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Thelen <gthelen@google.com>


# 3984ca1c 10-Jan-2014 Stephen Hemminger <stephen@networkplumber.org>

PCI: Remove unused pci_vpd_truncate()

My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.

This reverts db5679437a2b ("PCI: add interface to set visible size of
VPD"), removing this interface:

pci_vpd_truncate()

[bhelgaas: split to separate patch, also remove prototype from pci.h]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# fed24515 28-Aug-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: Remove pcie_cap_has_devctl()

pcie_cap_has_devctl() does nothing, so remove it. Simplicity over
consistency in this case. No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>


# 6d3a1741 28-Aug-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: Support PCIe Capability Slot registers only for ports with slots

Previously we allowed callers to access Slot Capabilities, Status, and
Control for Root Ports even if the Root Port did not implement a slot.
This seems dubious because the spec only requires these registers if a
slot is implemented.

It's true that even Root Ports without slots must have *space* for these
slot registers, because the Root Capabilities, Status, and Control
registers are after the slot registers in the capability. However,
for a v1 PCIe Capability, the *semantics* of the slot registers are
undefined unless a slot is implemented.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>


# c8b303d0 28-Aug-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: Remove PCIe Capability version checks

Previously we relied on the PCIe r3.0, sec 7.8, spec language that says
"For Functions that do not implement the [Link, Slot, Root] registers,
these spaces must be hardwired to 0b," which means that for v2 PCIe
capabilities, we don't need to check the device type at all.

But it's simpler if we don't need to check the capability version at all,
and I think the spec is explicit enough about which registers are required
for which types that we can remove the version checks.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>


# d3694d4f 27-Aug-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: Allow PCIe Capability link-related register access for switches

Every PCIe device has a link, except Root Complex Integrated Endpoints
and Root Complex Event Collectors. Previously we didn't give access
to PCIe capability link-related registers for Upstream Ports, Downstream
Ports, and Bridges, so attempts to read PCI_EXP_LNKCTL incorrectly
returned zero. See PCIe spec r3.0, sec 7.8 and 1.3.2.3.

Reference: http://lkml.kernel.org/r/979A8436335E3744ADCD3A9F2A2B68A52AD136BE@SJEXCHMB10.corp.ad.broadcom.com
Reported-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>


# 969daa34 14-Feb-2013 Alex Williamson <alex.williamson@redhat.com>

PCI: Fix PCI Express Capability accessors for PCI_EXP_FLAGS

PCI_EXP_FLAGS_TYPE is a mask, not an offset. Fix it.

Previously, pcie_capability_read_word(..., PCI_EXP_FLAGS, ...) would
fail.

[bhelgaas: tweak changelog]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.7+


# 1c531d82 25-Jan-2013 Myron Stowe <myron.stowe@redhat.com>

PCI: Use PCI Express Capability accessor

Use PCI Express Capability access functions to simplify device
Capabilities Register usages.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 8c0d3a02 24-Jul-2012 Jiang Liu <jiang.liu@huawei.com>

PCI: Add accessors for PCI Express Capability

The PCI Express Capability (PCIe spec r3.0, sec 7.8) comes in two
versions, v1 and v2. In v1 Capability structures (PCIe spec r1.0 and
r1.1), some fields are optional, so the structure size depends on the
device type.

This patch adds functions to access this capability so drivers don't
have to be aware of the differences between v1 and v2. Note that these
new functions apply only to the "PCI Express Capability," not to any of
the other "PCI Express Extended Capabilities" (AER, VC, ACS, MFVC, etc.)

Function pcie_capability_read_word/dword() reads the PCIe Capabilities
register and returns the value in the reference parameter "val". If
the PCIe Capabilities register is not implemented on the PCIe device,
"val" is set to 0.

Function pcie_capability_write_word/dword() writes the value to the
specified PCIe Capability register.

Function pcie_capability_clear_and_set_word/dword() sets and/or clears bits
of a PCIe Capability register.

[bhelgaas: changelog, drop "pci_" prefixes, don't export
pcie_capability_reg_implemented()]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# c63587d7 10-Jun-2012 Alex Williamson <alex.williamson@redhat.com>

PCI: export pci_user functions for use by other drivers

VFIO PCI support will make use of these for user-initiated
PCI config accesses.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# a2e27787 04-Nov-2011 Jan Kiszka <jan.kiszka@siemens.com>

PCI: Introduce INTx check & mask API

These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.

This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# fb51ccbf 04-Nov-2011 Jan Kiszka <jan.kiszka@siemens.com>

PCI: Rework config space blocking services

pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.

This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.

Adaptions of the ipr driver were originally written by Brian King.

Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 34e32072 17-Apr-2011 Greg Thelen <gthelen@google.com>

PCI: handle positive error codes

Callers expect pci_user_{read,write}_config_*() to indicate errors by
returning negative values. Prior to this change, the indicated routines
could return positive error codes (e.g. PCIBIOS_BAD_REGISTER_NUMBER)
which callers would mistakenly interpret as success.

This change converts any non-zero return from the mentioned routines
into unambiguous negative value return codes.

Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# d97ecd81 17-Apr-2011 Greg Thelen <gthelen@google.com>

PCI: check pci_vpd_pci22_wait() return

pci_vpd_pci22_write() calls pci_vpd_pci22_wait() after writing
PCI_VPD_DATA and PCI_VPD_ADDR to wait for the VPD operation to complete.
The result pci_vpd_pci22_wait() was not checked for error.

This change checks for error.

Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 5030718e 17-May-2010 Prarit Bhargava <prarit@redhat.com>

PCI: output FW warning in pci_read/write_vpd

pci_read/write_vpd() can fail due to a timeout. Usually the command
times out because of firmware issues (incorrect vpd length, etc.) on the
PCI card. Currently, the timeout occurs silently.

Output a message to the user indicating that they should check with
their vendor for new firmware.

Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 511dd98c 17-Feb-2010 Thomas Gleixner <tglx@linutronix.de>

PCI: Convert pci_lock to raw_spinlock

pci_lock must be a real spinlock in preempt-rt. Convert it to
raw_spinlock. No change for !RT kernels.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# a72b46c3 23-Apr-2009 Huang Ying <ying.huang@intel.com>

PCI: Add pci_bus_set_ops

pci_bus_set_ops changes pci_ops associated with a pci_bus. This can be
used by debug tools such as PCIE AER error injection to fake some PCI
configuration registers.

Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# cffb2faf 10-Apr-2009 Randy Dunlap <randy.dunlap@oracle.com>

docbooks: add/fix PCI kernel-doc

Add drivers/pci/*.c source files to DocBook/kernel-api.tmpl
and update those pci/*.c source files that need kernel-doc fixes.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# d407e32e 31-Mar-2009 Anton Vorontsov <avorontsov@ru.mvista.com>

PCI: Fix oops in pci_vpd_truncate

pci_vpd_truncate() should check for dev->vpd->attr, otherwise this might
happen:

sky2 driver version 1.22
Unable to handle kernel paging request for data at address 0x0000000c
Faulting instruction address: 0xc01836fc
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c01836fc] pci_vpd_truncate+0x38/0x40
LR [c029be18] sky2_probe+0x14c/0x518
Call Trace:
[ef82bde0] [c029bda4] sky2_probe+0xd8/0x518 (unreliable)
[ef82be20] [c018a11c] local_pci_probe+0x24/0x34
[ef82be30] [c018a14c] pci_call_probe+0x20/0x30
[ef82be50] [c018a330] __pci_device_probe+0x64/0x78
[ef82be60] [c018a44c] pci_device_probe+0x30/0x58
[ef82be80] [c01aa270] really_probe+0x78/0x1a0
[ef82bea0] [c01aa460] __driver_attach+0xa4/0xa8
[ef82bec0] [c01a96ac] bus_for_each_dev+0x60/0x9c
[ef82bef0] [c01aa0b4] driver_attach+0x24/0x34
[ef82bf00] [c01a9e08] bus_add_driver+0x12c/0x1cc
[ef82bf20] [c01aa87c] driver_register+0x6c/0x110
[ef82bf30] [c018a770] __pci_register_driver+0x4c/0x9c
[ef82bf50] [c03782c8] sky2_init_module+0x30/0x40
[ef82bf60] [c0001dbc] do_one_initcall+0x34/0x1a0
[ef82bfd0] [c0362240] do_initcalls+0x38/0x58

This happens with CONFIG_SKY2=y, and "ip=on" kernel command line, so
pci_vpd_truncate() is called before late_initcall(pci_sysfs_init),
therefore ->attr isn't yet initialized.

Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d6141668 31-Mar-2009 Anton Vorontsov <avorontsov@ru.mvista.com>

PCI: Fix oops in pci_vpd_truncate

pci_vpd_truncate() should check for dev->vpd->attr, otherwise
this might happen:

sky2 driver version 1.22
Unable to handle kernel paging request for data at address 0x0000000c
Faulting instruction address: 0xc01836fc
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c01836fc] pci_vpd_truncate+0x38/0x40
LR [c029be18] sky2_probe+0x14c/0x518
Call Trace:
[ef82bde0] [c029bda4] sky2_probe+0xd8/0x518 (unreliable)
[ef82be20] [c018a11c] local_pci_probe+0x24/0x34
[ef82be30] [c018a14c] pci_call_probe+0x20/0x30
[ef82be50] [c018a330] __pci_device_probe+0x64/0x78
[ef82be60] [c018a44c] pci_device_probe+0x30/0x58
[ef82be80] [c01aa270] really_probe+0x78/0x1a0
[ef82bea0] [c01aa460] __driver_attach+0xa4/0xa8
[ef82bec0] [c01a96ac] bus_for_each_dev+0x60/0x9c
[ef82bef0] [c01aa0b4] driver_attach+0x24/0x34
[ef82bf00] [c01a9e08] bus_add_driver+0x12c/0x1cc
[ef82bf20] [c01aa87c] driver_register+0x6c/0x110
[ef82bf30] [c018a770] __pci_register_driver+0x4c/0x9c
[ef82bf50] [c03782c8] sky2_init_module+0x30/0x40
[ef82bf60] [c0001dbc] do_one_initcall+0x34/0x1a0
[ef82bfd0] [c0362240] do_initcalls+0x38/0x58

This happens with CONFIG_SKY2=y, and "ip=on" kernel command line, so
pci_vpd_truncate() is called before late_initcall(pci_sysfs_init),
therefore ->attr isn't yet initialized.

Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# db567943 18-Dec-2008 Stephen Hemminger <shemminger@vyatta.com>

PCI: add interface to set visible size of VPD

The VPD on all devices may not be 32K. Unfortunately, there is no
generic way to find the size, so this adds a simple API hook
to reset it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 287d19ce 18-Dec-2008 Stephen Hemminger <shemminger@vyatta.com>

PCI: revise VPD access interface

Change PCI VPD API which was only used by sysfs to something usable
in drivers.
* move iteration over multiple words to the low level
* use conventional types for arguments
* add exportable wrapper

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 1120f8b8 18-Dec-2008 Stephen Hemminger <shemminger@vyatta.com>

PCI: handle long delays in VPD access

Accessing the VPD area can take a long time. The existing
VPD access code fails consistently on my hardware. There are comments
in the SysKonnect vendor driver that it can take up to 13ms per word.

Change the access routines to:
* use a mutex rather than spinning with IRQ's disabled and lock held
* have a much longer timeout
* call cond_resched while spinning

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 99cb233d 02-Jul-2008 Benjamin Li <benli@broadcom.com>

PCI: Limit VPD read/write lengths for Broadcom 5706, 5708, 5709 rev.

For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
VPD end tag will hang the device. This problem was initially
observed when a vpd entry was created in sysfs
('/sys/bus/pci/devices/<id>/vpd'). A read to this sysfs entry
will dump 32k of data. Reading a full 32k will cause an access
beyond the VPD end tag causing the device to hang. Once the device
is hung, the bnx2 driver will not be able to reset the device.
We believe that it is legal to read beyond the end tag and
therefore the solution is to limit the read/write length.

A majority of this patch is from Matthew Wilcox who gave code for
reworking the PCI vpd size information. A PCI quirk added for the
Broadcom NIC's to limit the read/write's.

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 94e61088 05-Mar-2008 Ben Hutchings <bhutchings@solarflare.com>

PCI: Expose PCI VPD through sysfs

Vital Product Data (VPD) may be exposed by PCI devices in several
ways. It is generally unsafe to read this information through the
existing interfaces to user-land because of stateful interfaces.

This adds:
- abstract operations for VPD access (struct pci_vpd_ops)
- VPD state information in struct pci_dev (struct pci_vpd)
- an implementation of the VPD access method specified in PCI 2.2
(in access.c)
- a 'vpd' binary file in sysfs directories for PCI devices with VPD
operations defined

It adds a probe for PCI 2.2 VPD in pci_scan_device() and release of
VPD state in pci_release_dev().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# f6a57033 17-Oct-2006 Al Viro <viro@zeniv.linux.org.uk>

[PATCH] severing module.h->sched.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 7ea7e98f 19-Oct-2006 Matthew Wilcox <willy@infradead.org>

PCI: Block on access to temporarily unavailable pci device

The existing implementation of pci_block_user_cfg_access() was recently
criticised for providing out of date information and for returning errors
on write, which applications won't be expecting.

This reimplementation uses a global wait queue and a bit per device.
I've open-coded prepare_to_wait() / finish_wait() as I could optimise
it significantly by knowing that the pci_lock protected us at all points.

It looked a bit funny to be doing a spin_unlock_irqsave(); schedule(),
so I used spin_lock_irq() for the _user versions of pci_read_config and
pci_write_config. Not carrying a flags pointer around made the code
much less nasty.

Attempts to block an already blocked device hit a BUG() and attempts to
unblock an already unblocked device hit a WARN(). If we need to block
access to a device from userspace, it's because it's unsafe for even
another bit of the kernel to access the device. An attempt to block
a device for a second time means we're about to access the device to
perform some other operation, which could provoke undefined behaviour
from the device.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Acked-by: Adam Belay <abelay@novell.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 48b19148 05-Nov-2005 Adrian Bunk <bunk@stusta.de>

[PATCH] PCI: drivers/pci/: small cleanups

This patch contains the following cleanups:
- access.c should #include "pci.h" for getting the prototypes of it's
global functions
- hotplug/shpchp_pci.c: make the needlessly global function
program_fw_provided_values() static

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# e04b0ea2 27-Sep-2005 Brian King <brking@us.ibm.com>

[PATCH] PCI: Block config access during BIST

Some PCI adapters (eg. ipr scsi adapters) have an exposure today in that they
issue BIST to the adapter to reset the card. If, during the time it takes to
complete BIST, userspace attempts to access PCI config space, the host bus
bridge will master abort the access since the ipr adapter does not respond on
the PCI bus for a brief period of time when running BIST. On PPC64 hardware,
this master abort results in the host PCI bridge isolating that PCI device
from the rest of the system, making the device unusable until Linux is
rebooted. This patch is an attempt to close that exposure by introducing some
blocking code in the PCI code. When blocked, writes will be humored and reads
will return the cached value. Ben Herrenschmidt has also mentioned that he
plans to use this in PPC power management.

Signed-off-by: Brian King <brking@us.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drivers/pci/access.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci-sysfs.c | 20 +++++-----
drivers/pci/pci.h | 7 +++
drivers/pci/proc.c | 28 +++++++--------
drivers/pci/syscall.c | 14 +++----
include/linux/pci.h | 7 +++
6 files changed, 134 insertions(+), 31 deletions(-)


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!