History log of /linux-master/drivers/pci/hotplug/pciehp_hpc.c
Revision Date Author Comments
# abaaac48 18-Oct-2023 Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

PCI: hotplug: Use FIELD_GET/PREP()

Instead of handcrafted shifts to handle register fields, use
FIELD_GET/FIELD_PREP().

Link: https://lore.kernel.org/r/20231018113254.17616-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 5f75f96c 17-Jul-2023 Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

PCI: pciehp: Use RMW accessors for changing LNKCTL

As hotplug is not the only driver touching LNKCTL, use the RMW capability
accessor which handles concurrent changes correctly.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Fixes: 7f822999e12a ("PCI: pciehp: Add Disable/enable link functions")
Link: https://lore.kernel.org/r/20230717120503.15276-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>


# 1f087398 11-Jun-2023 Maciej W. Rozycki <macro@orcam.me.uk>

PCI: pciehp: Rely on dev->link_active_reporting

Use dev->link_active_reporting to determine whether Data Link Layer Link
Active Reporting is available rather than re-retrieving the capability.

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310028150.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# 5054133a 22-May-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Simplify Attention Button logging

Previously, pressing the Attention Button always logged two lines, the
first from pciehp_ist() and the second from pciehp_handle_button_press():

Attention button pressed
Powering on due to button press

Since pciehp_handle_button_press() always logs the more detailed message,
remove the generic "Attention button pressed" message. Reword the
pciehp_handle_button_press() to be of the form:

Button press: will power on in 5 sec
Button press: will power off in 5 sec
Button press: canceling request to power on
Button press: canceling request to power off
Button press: ignoring invalid state %#x

Link: https://lore.kernel.org/r/20230522214051.619337-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# 82b34b08 13-Feb-2023 Manivannan Sadhasivam <mani@kernel.org>

PCI: pciehp: Add Qualcomm quirk for Command Completed erratum

The Qualcomm PCI bridge device (Device ID 0x010e) found in chipsets such as
SC8280XP used in Lenovo Thinkpad X13s, does not set the Command Completed
bit unless writes to the Slot Command register change "Control" bits.

This results in timeouts like below during boot and resume from suspend:

pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x03c0 (issued 2020 msec ago)
...
pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x13f1 (issued 107724 msec ago)

Add the device to the Command Completed quirk to mark commands "completed"
immediately unless they change the "Control" bits.

Link: https://lore.kernel.org/r/20230213144922.89982-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 6d4671b5 27-Sep-2022 Pali Rohár <pali@kernel.org>

PCI: pciehp: Enable Command Completed Interrupt only if supported

The No Command Completed Support bit in the Slot Capabilities register
indicates whether Command Completed Interrupt Enable is unsupported.

We already check whether No Command Completed Support bit is set in
pcie_wait_cmd(), and do not wait in this case.

Don't enable this Command Completed Interrupt at all if NCCS is set, so
that when users dump configuration space from userspace, the dump does not
confuse them by saying that Command Completed Interrupt is not supported,
but it is enabled.

Link: https://lore.kernel.org/r/20220927141926.8895-2-kabel@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# 9f72d475 10-Feb-2022 Manivannan Sadhasivam <mani@kernel.org>

PCI: pciehp: Add Qualcomm quirk for Command Completed erratum

The Qualcomm PCI bridge device (Device ID 0x0110) found in chipsets such as
SM8450 does not set the Command Completed bit unless writes to the Slot
Command register change "Control" bits.

This results in timeouts like below:

pcieport 0001:00:00.0: pciehp: Timeout on hotplug command 0x03c0 (issued 2020 msec ago)

Add the device to the Command Completed quirk to mark commands "completed"
immediately unless they change the "Control" bits.

Link: https://lore.kernel.org/r/20220210145003.135907-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 92912b17 10-Nov-2021 Liguang Zhang <zhangliguang@linux.alibaba.com>

PCI: pciehp: Clear cmd_busy bit in polling mode

Writes to a Downstream Port's Slot Control register are PCIe hotplug
"commands." If the Port supports Command Completed events, software must
wait for a command to complete before writing to Slot Control again.

pcie_do_write_cmd() sets ctrl->cmd_busy when it writes to Slot Control. If
software notification is enabled, i.e., PCI_EXP_SLTCTL_HPIE and
PCI_EXP_SLTCTL_CCIE are set, ctrl->cmd_busy is cleared by pciehp_isr().

But when software notification is disabled, as it is when pcie_init()
powers off an empty slot, pcie_wait_cmd() uses pcie_poll_cmd() to poll for
command completion, and it neglects to clear ctrl->cmd_busy, which leads to
spurious timeouts:

pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x01c0 (issued 2264 msec ago)
pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x05c0 (issued 2288 msec ago)

Clear ctrl->cmd_busy in pcie_poll_cmd() when it detects a Command Completed
event (PCI_EXP_SLTSTA_CC).

[bhelgaas: commit log]
Fixes: a5dd4b4b0570 ("PCI: pciehp: Wait for hotplug command completion where necessary")
Link: https://lore.kernel.org/r/20211111054258.7309-1-zhangliguang@linux.alibaba.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215143
Link: https://lore.kernel.org/r/20211126173309.GA12255@wunner.de
Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.19+


# 085a9f43 17-Dec-2021 Hans de Goede <hdegoede@redhat.com>

PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors

Use down_read_nested() and down_write_nested() when taking the
ctrl->reset_lock rw-sem, passing the number of PCIe hotplug controllers in
the path to the PCI root bus as lock subclass parameter.

This fixes the following false-positive lockdep report when unplugging a
Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:

pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
============================================
WARNING: possible recursive locking detected
5.16.0-rc2+ #621 Not tainted
--------------------------------------------
irq/124-pciehp/86 is trying to acquire lock:
ffff8e5ac4299ef8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80

but task is already holding lock:
ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&ctrl->reset_lock);
lock(&ctrl->reset_lock);

*** DEADLOCK ***

May be due to missing lock nesting notation

3 locks held by irq/124-pciehp/86:
#0: ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
#1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
#2: ffff8e5ac1ee2248 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40

stack backtrace:
CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
Call Trace:
<TASK>
dump_stack_lvl+0x59/0x73
__lock_acquire.cold+0xc5/0x2c6
lock_acquire+0xb5/0x2b0
down_read+0x3e/0x50
pciehp_check_presence+0x23/0x80
pciehp_runtime_resume+0x5c/0xa0
device_for_each_child+0x45/0x70
pcie_port_device_runtime_resume+0x20/0x30
pci_pm_runtime_resume+0xa7/0xc0
__rpm_callback+0x41/0x110
rpm_callback+0x59/0x70
rpm_resume+0x512/0x7b0
__pm_runtime_resume+0x4a/0x90
__device_release_driver+0x28/0x240
device_release_driver+0x26/0x40
pci_stop_bus_device+0x68/0x90
pci_stop_bus_device+0x2c/0x90
pci_stop_and_remove_bus_device+0xe/0x20
pciehp_unconfigure_device+0x6c/0x110
pciehp_disable_slot+0x5b/0xe0
pciehp_handle_presence_or_link_change+0xc3/0x2f0
pciehp_ist+0x179/0x180

This lockdep warning is triggered because with Thunderbolt, hotplug ports
are nested. When removing multiple devices in a daisy-chain, each hotplug
port's reset_lock may be acquired recursively. It's never the same lock, so
the lockdep splat is a false positive.

Because locks at the same hierarchy level are never acquired recursively, a
per-level lockdep class is sufficient to fix the lockdep warning.

The choice to use one lockdep subclass per pcie-hotplug controller in the
path to the root-bus was made to conserve class keys because their number
is limited and the complexity grows quadratically with number of keys
according to Documentation/locking/lockdep-design.rst.

Link: https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/
Link: https://lore.kernel.org/linux-pci/de684a28-9038-8fc6-27ca-3f6f2f6400d7@redhat.com/
Link: https://lore.kernel.org/r/20211217141709.379663-1-hdegoede@redhat.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208855
Reported-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org


# 23584c1e 17-Nov-2021 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Fix infinite loop in IRQ handler upon power fault

The Power Fault Detected bit in the Slot Status register differs from
all other hotplug events in that it is sticky: It can only be cleared
after turning off slot power. Per PCIe r5.0, sec. 6.7.1.8:

If a power controller detects a main power fault on the hot-plug slot,
it must automatically set its internal main power fault latch [...].
The main power fault latch is cleared when software turns off power to
the hot-plug slot.

The stickiness used to cause interrupt storms and infinite loops which
were fixed in 2009 by commits 5651c48cfafe ("PCI pciehp: fix power fault
interrupt storm problem") and 99f0169c17f3 ("PCI: pciehp: enable
software notification on empty slots").

Unfortunately in 2020 the infinite loop issue was inadvertently
reintroduced by commit 8edf5332c393 ("PCI: pciehp: Fix MSI interrupt
race"): The hardirq handler pciehp_isr() clears the PFD bit until
pciehp's power_fault_detected flag is set. That happens in the IRQ
thread pciehp_ist(), which never learns of the event because the hardirq
handler is stuck in an infinite loop. Fix by setting the
power_fault_detected flag already in the hardirq handler.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=214989
Link: https://lore.kernel.org/linux-pci/DM8PR11MB5702255A6A92F735D90A4446868B9@DM8PR11MB5702.namprd11.prod.outlook.com
Fixes: 8edf5332c393 ("PCI: pciehp: Fix MSI interrupt race")
Link: https://lore.kernel.org/r/66eaeef31d4997ceea357ad93259f290ededecfd.1637187226.git.lukas@wunner.de
Reported-by: Joseph Bao <joseph.bao@intel.com>
Tested-by: Joseph Bao <joseph.bao@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.19+
Cc: Stuart Hayes <stuart.w.hayes@gmail.com>


# a3b0f10d 18-Nov-2021 Naveen Naidu <naveennaidu479@gmail.com>

PCI: pciehp: Use PCI_POSSIBLE_ERROR() to check config reads

When config pci_ops.read() can detect failed PCI transactions, the data
returned to the CPU is PCI_ERROR_RESPONSE (~0 or 0xffffffff).

Obviously a successful PCI config read may *also* return that data if a
config register happens to contain ~0, so it doesn't definitively indicate
an error unless we know the register cannot contain ~0.

Use PCI_POSSIBLE_ERROR() to check the response we get when we read data
from hardware. This unifies PCI error response checking and makes error
checks consistent and easier to find.

Compile tested only.

Link: https://lore.kernel.org/r/e185b052fbfd530df703a36dd31126cb870eed95.1637243717.git.naveennaidu479@gmail.com
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lukas Wunner <lukas@wunner.de>


# ea401499 31-Jul-2021 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset

Stuart Hayes reports that an error handled by DPC at a Root Port results
in pciehp gratuitously bringing down a subordinate hotplug port:

RP -- UP -- DP -- UP -- DP (hotplug) -- EP

pciehp brings the slot down because the Link to the Endpoint goes down.
That is caused by a Hot Reset being propagated as a result of DPC.
Per PCIe Base Spec 5.0, section 6.6.1 "Conventional Reset":

For a Switch, the following must cause a hot reset to be sent on all
Downstream Ports: [...]

* The Data Link Layer of the Upstream Port reporting DL_Down status.
In Switches that support Link speeds greater than 5.0 GT/s, the
Upstream Port must direct the LTSSM of each Downstream Port to the
Hot Reset state, but not hold the LTSSMs in that state. This permits
each Downstream Port to begin Link training immediately after its
hot reset completes. This behavior is recommended for all Switches.

* Receiving a hot reset on the Upstream Port.

Once DPC recovers, pcie_do_recovery() walks down the hierarchy and
invokes pcie_portdrv_slot_reset() to restore each port's config space.
At that point, a hotplug interrupt is signaled per PCIe Base Spec r5.0,
section 6.7.3.4 "Software Notification of Hot-Plug Events":

If the Port is enabled for edge-triggered interrupt signaling using
MSI or MSI-X, an interrupt message must be sent every time the logical
AND of the following conditions transitions from FALSE to TRUE: [...]

* The Hot-Plug Interrupt Enable bit in the Slot Control register is
set to 1b.

* At least one hot-plug event status bit in the Slot Status register
and its associated enable bit in the Slot Control register are both
set to 1b.

Prevent pciehp from gratuitously bringing down the slot by clearing the
error-induced Data Link Layer State Changed event before restoring
config space. Afterwards, check whether the link has unexpectedly
failed to retrain and synthesize a DLLSC event if so.

Allow each pcie_port_service_driver (one of them being pciehp) to define
a slot_reset callback and re-use the existing pm_iter() function to
iterate over the callbacks.

Thereby, the Endpoint driver remains bound throughout error recovery and
may restore the device to working state.

Surprise removal during error recovery is detected through a Presence
Detect Changed event. The hotplug port is expected to not signal that
event as a result of a Hot Reset.

The issue isn't DPC-specific, it also occurs when an error is handled by
AER through aer_root_reset(). So while the issue was noticed only now,
it's been around since 2006 when AER support was first introduced.

[bhelgaas: drop PCI_ERROR_RECOVERY Kconfig, split pm_iter() rename to
preparatory patch]
Link: https://lore.kernel.org/linux-pci/08c046b0-c9f2-3489-eeef-7e7aca435bb9@gmail.com/
Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
Link: https://lore.kernel.org/r/251f4edcc04c14f873ff1c967bc686169cd07d2d.1627638184.git.lukas@wunner.de
Reported-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.6.19+: ba952824e6c1: PCI/portdrv: Report reset for frozen channel
Cc: Keith Busch <kbusch@kernel.org>


# 9bdc81ce 17-Aug-2021 Amey Narkhede <ameynarkhede03@gmail.com>

PCI: Change the type of probe argument in reset functions

Change the type of probe argument in functions which implement reset
methods from int to bool to make the context and intent clear.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-10-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# a97396c6 01-May-2021 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Ignore Link Down/Up caused by DPC

Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon
an error and attempts to re-enable it when instructed by the DPC driver.

A slot which is both DPC- and hotplug-capable is currently powered off by
pciehp once DPC is triggered (due to the link change) and powered back up
on successful recovery. That's undesirable, the slot should remain powered
so the hotplugged device remains bound to its driver. DPC notifies the
driver of the error and of successful recovery in pcie_do_recovery() and
the driver may then restore the device to working state.

Moreover, Sinan points out that turning off slot power by pciehp may foil
recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm
reset. Sathyanarayanan reports extended delays or failure in link
retraining by DPC if pciehp brings down the slot.

Fix by detecting whether a Link Down event is caused by DPC and awaiting
recovery if so. On successful recovery, ignore both the Link Down and the
subsequent Link Up event.

Afterwards, check whether the link is down to detect surprise-removal or
another DPC event immediately after DPC recovery. Ensure that the
corresponding DLLSC event is not ignored by synthesizing it and invoking
irq_wake_thread() to trigger a re-run of pciehp_ist().

The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and
dpc_handler(), race against each other. If pciehp is faster than DPC, it
will wait until DPC recovery completes.

Recovery consists of two steps: The first step (waiting for link
disablement) is recognizable by pciehp through a set DPC Trigger Status
bit. The second step (waiting for link retraining) is recognizable through
a newly introduced PCI_DPC_RECOVERING flag.

If DPC is faster than pciehp, neither of the two flags will be set and
pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag.
The flag is zero if DPC didn't occur at all, hence DLLSC events are not
ignored by default.

pciehp waits up to 4 seconds before assuming that DPC recovery failed and
bringing down the slot. This timeout is not taken from the spec (it
doesn't mandate one) but based on a report from Yicong Yang that DPC may
take a bit more than 3 seconds on HiSilicon's Kunpeng platform.

The timeout is necessary because the DPC Trigger Status bit may never
clear: On Root Ports which support RP Extensions for DPC, the DPC driver
polls the DPC RP Busy bit for up to 1 second before giving up on DPC
recovery. Without the timeout, pciehp would then wait indefinitely for DPC
to complete.

This commit draws inspiration from previous attempts to synchronize DPC
with pciehp:

By Sinan Kaya, August 2018:
https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/

By Ethan Zhao, October 2020:
https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/

By Kuppuswamy Sathyanarayanan, March 2021:
https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/

Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de
Reported-by: Sinan Kaya <okaya@kernel.org>
Reported-by: Ethan Zhao <haifeng.zhao@intel.com>
Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <kbusch@kernel.org>


# 8a614499 17-Sep-2020 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Reduce noisiness on hot removal

When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously. I've seen delays
of up to 244 msec between the two events with Thunderbolt.

After pciehp has brought down the slot in response to the first event, the
other bit may still be set. It's not discernible whether it's set because
a new card is already in the slot or if it will soon clear. So pciehp
tries to bring up the slot and in the latter case fails with a bunch of
messages, some of them at KERN_ERR severity. If the slot is no longer
occupied, the messages are false positives and annoy users.

Stuart Hayes reports the following splat on hot removal:

KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
KERN_ERR pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
KERN_ERR pcieport 0000:3c:06.0: pciehp: Failed to check link status

Dongdong Liu complains about a similar splat:

KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
KERN_ERR pciehp 0000:80:10.0:pcie004: Failed to check link status

Users are particularly irritated to see a bringup attempt even though the
slot was explicitly brought down via sysfs. In a perfect world, we could
avoid this by setting Link Disable on slot bringdown and re-enabling it
upon a Presence Detect State change. In reality however, there are broken
hotplug ports which hardwire Presence Detect to zero, see 80696f991424
("PCI: pciehp: Tolerate Presence Detect hardwired to zero"). Conversely,
PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active
Reporting wasn't specified before PCIe r1.1. On unplug, some ports first
clear Presence then Link (see Stuart Hayes' splat) whereas others use the
inverse order (see Dongdong Liu's splat). To top it off, there are hotplug
ports which flap the Presence and Link bits on slot bringup, see
6c35a1ac3da6 ("PCI: pciehp: Tolerate initially unstable link").

pciehp is designed to work with all of these variants. Surplus attempts at
slot bringup are a lesser evil than not being able to bring up slots at
all. Although we could try to perfect the behavior for specific hotplug
controllers, we'd risk breaking others or increasing code complexity.

But we can certainly minimize annoyance by emitting only a single message
with KERN_INFO severity if bringup is unsuccessful:

* Drop the "Timeout waiting for Presence Detect" message in
pcie_wait_for_presence(). The sole caller of that function,
pciehp_check_link_status(), ignores the timeout and carries on. It emits
error messages of its own and I don't think this particular message adds
much value.

* There's a single error condition in pciehp_check_link_status() which
does not emit a message. Adding one allows dropping the "Failed to check
link status" message emitted by board_added() if
pciehp_check_link_status() returns a non-zero integer.

* Tone down all messages in pciehp_check_link_status() to KERN_INFO
severity and rephrase them to look as innocuous as possible. To this
end, move the message emitted by pcie_wait_for_link_delay() to its
callers.

As a result, Stuart Hayes' splat becomes:

KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001

Dongdong Liu's splat becomes:

KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link

The messages now merely serve as information that presence or link bits
were set a little longer than expected. Bringup failures which are not
false positives are still reported, albeit no longer at KERN_ERR severity.

Link: https://lore.kernel.org/linux-pci/20200310182100.102987-1-stuart.w.hayes@gmail.com/
Link: https://lore.kernel.org/linux-pci/1547649064-19019-1-git-send-email-liudongdong3@huawei.com/
Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de
Reported-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Reported-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# 8edf5332 19-Feb-2020 Stuart Hayes <stuart.w.hayes@gmail.com>

PCI: pciehp: Fix MSI interrupt race

Without this commit, a PCIe hotplug port can stop generating interrupts on
hotplug events, so device adds and removals will not be seen:

The pciehp interrupt handler pciehp_isr() reads the Slot Status register
and then writes back to it to clear the bits that caused the interrupt. If
a different interrupt event bit gets set between the read and the write,
pciehp_isr() returns without having cleared all of the interrupt event
bits. If this happens when the MSI isn't masked (which by default it isn't
in handle_edge_irq(), and which it will never be when MSI per-vector
masking is not supported), we won't get any more hotplug interrupts from
that device.

That is expected behavior, according to the PCIe Base Spec r5.0, section
6.7.3.4, "Software Notification of Hot-Plug Events".

Because the Presence Detect Changed and Data Link Layer State Changed event
bits can both get set at nearly the same time when a device is added or
removed, this is more likely to happen than it might seem. The issue was
found (and can be reproduced rather easily) by connecting and disconnecting
an NVMe storage device on at least one system model where the NVMe devices
were being connected to an AMD PCIe port (PCI device 0x1022/0x1483).

Fix the issue by modifying pciehp_isr() to loop back and re-read the Slot
Status register immediately after writing to it, until it sees that all of
the event status bits have been cleared.

[lukas: drop loop count limitation, write "events" instead of "status",
don't loop back in INTx and poll modes, tweak code comment & commit msg]
Link: https://lore.kernel.org/r/78b4ced5072bfe6e369d20e8b47c279b8c7af12e.1582121613.git.lukas@wunner.de
Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>


# 3e487d2e 17-Mar-2020 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Fix indefinite wait on sysfs requests

David Hoyer reports that powering pciehp slots up or down via sysfs may
hang: The call to wait_event() in pciehp_sysfs_enable_slot() and
_disable_slot() does not return because ctrl->ist_running remains true.

This flag, which was introduced by commit 157c1062fcd8 ("PCI: pciehp: Avoid
returning prematurely from sysfs requests"), signifies that the IRQ thread
pciehp_ist() is running. It is set to true at the top of pciehp_ist() and
reset to false at the end. However there are two additional return
statements in pciehp_ist() before which the commit neglected to reset the
flag to false and wake up waiters for the flag.

That omission opens up the following race when powering up the slot:

* pciehp_ist() runs because a PCI_EXP_SLTSTA_PDC event was requested
by pciehp_sysfs_enable_slot()

* pciehp_ist() turns on slot power via the following call stack:
pciehp_handle_presence_or_link_change() -> pciehp_enable_slot() ->
__pciehp_enable_slot() -> board_added() -> pciehp_power_on_slot()

* after slot power is turned on, the link comes up, resulting in a
PCI_EXP_SLTSTA_DLLSC event

* the IRQ handler pciehp_isr() stores the event in ctrl->pending_events
and returns IRQ_WAKE_THREAD

* the IRQ thread is already woken (it's bringing up the slot), but the
genirq code remembers to re-run the IRQ thread after it has finished
(such that it can deal with the new event) by setting IRQTF_RUNTHREAD
via __handle_irq_event_percpu() -> __irq_wake_thread()

* the IRQ thread removes PCI_EXP_SLTSTA_DLLSC from ctrl->pending_events
via board_added() -> pciehp_check_link_status() in order to deal with
presence and link flaps per commit 6c35a1ac3da6 ("PCI: pciehp:
Tolerate initially unstable link")

* after pciehp_ist() has successfully brought up the slot, it resets
ctrl->ist_running to false and wakes up the sysfs requester

* the genirq code re-runs pciehp_ist(), which sets ctrl->ist_running
to true but then returns with IRQ_NONE because ctrl->pending_events
is empty

* pciehp_sysfs_enable_slot() is finally woken but notices that
ctrl->ist_running is true, hence continues waiting

The only way to get the hung task going again is to trigger a hotplug
event which brings down the slot, e.g. by yanking out the card.

The same race exists when powering down the slot because remove_board()
likewise clears link or presence changes in ctrl->pending_events per commit
3943af9d01e9 ("PCI: pciehp: Ignore Link State Changes after powering off a
slot") and thereby may cause a re-run of pciehp_ist() which returns with
IRQ_NONE without resetting ctrl->ist_running to false.

Fix by adding a goto label before the teardown steps at the end of
pciehp_ist() and jumping to that label from the two return statements which
currently neglect to reset the ctrl->ist_running flag.

Fixes: 157c1062fcd8 ("PCI: pciehp: Avoid returning prematurely from sysfs requests")
Link: https://lore.kernel.org/r/cca1effa488065cb055120aa01b65719094bdcb5.1584530321.git.lukas@wunner.de
Reported-by: David Hoyer <David.Hoyer@netapp.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org # v4.19+


# 0b382546 25-Oct-2019 Stuart Hayes <stuart.w.hayes@gmail.com>

PCI: pciehp: Add DMI table for in-band presence detection disabled

Some systems have in-band presence detection disabled for hot-plug PCI
slots but do not report this in the slot capabilities 2 (SLTCAP2) register.
On these systems, presence detect can become active well after the link is
reported to be active, which can cause the slots to be disabled after a
device is connected.

Add a DMI table to flag these systems as having in-band presence detect
disabled.

Link: https://lore.kernel.org/r/20191025190047.38130-4-stuart.w.hayes@gmail.com
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# f496648b 25-Oct-2019 Alexandru Gagniuc <mr.nuke.me@gmail.com>

PCI: pciehp: Wait for PDS if in-band presence is disabled

When in-band presence detect is disabled, PDS may come up at any time or
not at all. PDS being low may indicate that the card is still mating, and
we could expect contact bounce to bring down the link as well.

It is reasonable to assume that most cards will mate in a hotplug slot in
about a second. Thus, when we know PDS only reflects out-of-band presence
detect, it's worthwhile to wait the extra second or so to make sure the
card is properly mated before loading the driver and to prevent the hotplug
code from disabling a device if the presence detect change goes active
after the device is enabled.

Link: https://lore.kernel.org/r/20191025190047.38130-3-stuart.w.hayes@gmail.com
[bhelgaas: use ctrl_info() instead of pci_info()]
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# 20285359 25-Oct-2019 Alexandru Gagniuc <mr.nuke.me@gmail.com>

PCI: pciehp: Disable in-band presence detect when possible

The presence detect state (PDS) is normally a logical OR of in-band and
out-of-band (OOB) presence detect. As of PCIe 4.0, there is the option to
disable in-band presence so that the PDS bit always reflects the state of
the out-of-band presence.

The recommendation of the PCIe spec is to disable in-band presence whenever
supported (PCIe r5.0, appendix I implementation note):

Due to architectural issues, the in-band (Physical-Layer-based) portion
of the PD mechanism is deprecated for use with async hot-plug. One issue
is that in-band PD as architected does not detect adapter removal during
certain LTSSM states, notably the L1 and Disabled States. Another issue
is that when both in-band and OOB PD are being used together, the
Presence Detect State bit and its associated interrupt mechanism always
reflect the logical OR of the inband and OOB PD states, and with some
hot-plug hardware configurations, it is important for software to detect
and respond to in-band and OOB PD events independently. If OOB PD is
being used and the associated DSP supports In-Band PD Disable, it is
recommended that the In-Band PD Disable bit be Set, and the Presence
Detect State bit and its associated interrupt mechanism be used
exclusively for OOB PD. As a substitute for in-band PD with async
hot-plug, the reference model uses either the DPC or the DLL Link Active
mechanism.

Link: https://lore.kernel.org/r/20191025190047.38130-2-stuart.w.hayes@gmail.com
[bhelgaas: move PCI_EXP_SLTCAP2 read earlier & print PCI_EXP_SLTCAP2_IBPD
value (suggested by Lukas)]
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# 87d0f2a5 29-Oct-2019 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Prevent deadlock on disconnect

This addresses deadlocks in these common cases in hierarchies containing
two switches:

- All involved ports are runtime suspended and they are unplugged. This
can happen easily if the drivers involved automatically enable runtime
PM (xHCI for example does that).

- System is suspended (e.g., closing the lid on a laptop) with a dock +
something else connected, and the dock is unplugged while suspended.

These cases lead to the following deadlock:

INFO: task irq/126-pciehp:198 blocked for more than 120 seconds.
irq/126-pciehp D 0 198 2 0x80000000
Call Trace:
schedule+0x2c/0x80
schedule_timeout+0x246/0x350
wait_for_completion+0xb7/0x140
kthread_stop+0x49/0x110
free_irq+0x32/0x70
pcie_shutdown_notification+0x2f/0x50
pciehp_remove+0x27/0x50
pcie_port_remove_service+0x36/0x50
device_release_driver+0x12/0x20
bus_remove_device+0xec/0x160
device_del+0x13b/0x350
device_unregister+0x1a/0x60
remove_iter+0x1e/0x30
device_for_each_child+0x56/0x90
pcie_port_device_remove+0x22/0x40
pcie_portdrv_remove+0x20/0x60
pci_device_remove+0x3e/0xc0
device_release_driver_internal+0x18c/0x250
device_release_driver+0x12/0x20
pci_stop_bus_device+0x6f/0x90
pci_stop_bus_device+0x31/0x90
pci_stop_and_remove_bus_device+0x12/0x20
pciehp_unconfigure_device+0x88/0x140
pciehp_disable_slot+0x6a/0x110
pciehp_handle_presence_or_link_change+0x263/0x400
pciehp_ist+0x1c9/0x1d0
irq_thread_fn+0x24/0x60
irq_thread+0xeb/0x190
kthread+0x120/0x140

INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds.
irq/190-pciehp D 0 2288 2 0x80000000
Call Trace:
__schedule+0x2a2/0x880
schedule+0x2c/0x80
schedule_preempt_disabled+0xe/0x10
mutex_lock+0x2c/0x30
pci_lock_rescan_remove+0x15/0x20
pciehp_unconfigure_device+0x4d/0x140
pciehp_disable_slot+0x6a/0x110
pciehp_handle_presence_or_link_change+0x263/0x400
pciehp_ist+0x1c9/0x1d0
irq_thread_fn+0x24/0x60
irq_thread+0xeb/0x190
kthread+0x120/0x140

What happens here is that the whole hierarchy is runtime resumed and the
parent PCIe downstream port, which got the hot-remove event, starts
removing devices below it, taking pci_lock_rescan_remove() lock. When the
child PCIe port is runtime resumed it calls pciehp_check_presence() which
ends up calling pciehp_card_present() and pciehp_check_link_active(). Both
of these use pcie_capability_read_word(), which notices that the underlying
device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND with the
capability value set to 0. When pciehp gets this value it thinks that its
child device is also hot-removed and schedules its IRQ thread to handle the
event.

The deadlock happens when the child's IRQ thread runs and tries to acquire
pci_lock_rescan_remove() which is already taken by the parent and the
parent waits for the child's IRQ thread to finish.

Prevent this from happening by checking the return value of
pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
performing any hot-removal activities.

[bhelgaas: add common scenarios to commit log]
Link: https://lore.kernel.org/r/20191029170022.57528-2-mika.westerberg@linux.intel.com
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# b94ec12d 08-Nov-2019 Andy Shevchenko <andriy.shevchenko@linux.intel.com>

PCI: pciehp: Refactor infinite loop in pcie_poll_cmd()

Infinite timeout loops are hard to read. Refactor it to plausible 'do {}
while ()'.

Note, the supplied timeout can't be negative for current use, though if
it's not dividable to 10, we may go below 0, that's why type of the
parameter is int. And thus, we may move the check to the loop condition.

No functional change intended.

Link: https://lore.kernel.org/r/20191108111855.85866-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>


# 157c1062 08-Aug-2019 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Avoid returning prematurely from sysfs requests

A sysfs request to enable or disable a PCIe hotplug slot should not
return before it has been carried out. That is sought to be achieved by
waiting until the controller's "pending_events" have been cleared.

However the IRQ thread pciehp_ist() clears the "pending_events" before
it acts on them. If pciehp_sysfs_enable_slot() / _disable_slot() happen
to check the "pending_events" after they have been cleared but while
pciehp_ist() is still running, the functions may return prematurely
with an incorrect return value.

Fix by introducing an "ist_running" flag which must be false before a sysfs
request is allowed to return.

Fixes: 32a8cef274fe ("PCI: pciehp: Enable/disable exclusively from IRQ thread")
Link: https://lore.kernel.org/linux-pci/1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com
Link: https://lore.kernel.org/r/4174210466e27eb7e2243dd1d801d5f75baaffd8.1565345211.git.lukas@wunner.de
Reported-and-tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.19+


# 9194094b 03-Sep-2019 Denis Efremov <efremov@linux.com>

PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()

Remove pciehp_green_led_{on,off,blink}() and use pciehp_set_indicators()
instead, since the code is mostly the same.

[bhelgaas: drop set_power_indicator() wrapper to reduce the number of
interfaces]
Link: https://lore.kernel.org/r/20190903111021.1559-5-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>


# 106feb2f 03-Sep-2019 Denis Efremov <efremov@linux.com>

PCI: pciehp: Remove pciehp_set_attention_status()

Remove pciehp_set_attention_status() and use pciehp_set_indicators()
instead, since the code is mostly the same.

Link: https://lore.kernel.org/r/20190903111021.1559-4-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>


# 94719ba0 03-Sep-2019 Denis Efremov <efremov@linux.com>

PCI: pciehp: Combine adjacent indicator updates

Combine adjacent updates of power and attention indicators into a single
pciehp_set_indicators() call. This sends one command to the hotplug
controller instead of two.

Link: https://lore.kernel.org/r/20190903111021.1559-3-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# 688033f5 03-Sep-2019 Denis Efremov <efremov@linux.com>

PCI: pciehp: Add pciehp_set_indicators() to set both indicators

Add pciehp_set_indicators() to set power and attention indicators with a
single register write.

This is a minor optimization because we frequently set both indicators and
this can do it with a single command. It also reduces the number of
interfaces related to the indicators and makes them more discoverable
because callers use the PCI_EXP_SLTCTL_ATTN_IND_* and
PCI_EXP_SLTCTL_PWR_IND_* definitions directly.

[bhelgaas: extend commit log, s/PCI_EXP_SLTCTL_.*_IND_NONE/INDICATOR_NOOP/
so they don't look like things defined by the spec, add function doc, mask
commands to make it obvious we only send valid commands
(pcie_do_write_cmd() does mask it, but requires more effort to verify)]
Link: https://lore.kernel.org/r/20190903111021.1559-2-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# e07ca82a 08-May-2019 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Remove pointless MY_NAME definition

MY_NAME is only used once and offers no benefit, so remove it.

Link: https://lore.kernel.org/lkml/20190509141456.223614-11-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>


# 742ee16b 07-May-2019 Frederick Lawler <fred@fredlawl.com>

PCI: pciehp: Remove unused dbg/err/info/warn() wrappers

Replace the last uses of dbg() with the equivalent pr_debug(), then remove
unused dbg(), err(), info(), and warn() wrappers.

Link: https://lore.kernel.org/lkml/20190509141456.223614-9-helgaas@kernel.org
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>


# 94dbc956 07-May-2019 Frederick Lawler <fred@fredlawl.com>

PCI: pciehp: Log messages with pci_dev, not pcie_device

Log messages with pci_dev, not pcie_device. Factor out common message
prefixes with dev_fmt().

Example output change:

- pciehp 0000:00:06.0:pcie004: Slot(0) Powering on due to button press
+ pcieport 0000:00:06.0: pciehp: Slot(0) Powering on due to button press

Link: https://lore.kernel.org/lkml/20190509141456.223614-8-helgaas@kernel.org
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>


# 7e696b8a 08-May-2019 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Remove pciehp_debug uses

We're about to convert pciehp to the dyndbg mechanism, which means we can
eventually remove pciehp_debug.

Replace uses of pciehp_debug with dbg() and ctrl_dbg(), which check
pciehp_debug internally.

Link: https://lore.kernel.org/lkml/20190509141456.223614-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>


# bbe54ea5 31-Jan-2019 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Disable Data Link Layer State Changed event on suspend

Commit 0e157e528604 ("PCI/PME: Implement runtime PM callbacks") tried to
solve an issue where the hierarchy immediately wakes up when it is
transitioned into D3cold. However, it turns out to prevent PME
propagation on some systems that do not support D3cold.

I looked more closely at what might cause the immediate wakeup. It happens
when the ACPI power resource of the root port is turned off. The AML code
associated with the _OFF() method of the ACPI power resource starts a PCIe
L2/L3 Ready transition and waits for it to complete. Right after the L2/L3
Ready transition is started the root port receives a PME from the
downstream port.

The simplest hierarchy where this happens looks like this:

00:1d.0 PCIe Root Port
^
|
v
05:00.0 PCIe switch #1 upstream port
06:01.0 PCIe switch #1 downstream hotplug port
^
|
v
08:00.0 PCIe switch #2 upstream port

It seems that the PCIe link between the two switches, before
PME_Turn_Off/PME_TO_Ack is complete for the whole hierarchy, goes
inactive and triggers PME towards the root port bringing it back to D0.
The L2/L3 Ready sequence is described in PCIe r4.0 spec sections 5.2 and
5.3.3 but unfortunately they do not state what happens if DLLSCE is
enabled during the sequence.

Disabling Data Link Layer State Changed event (DLLSCE) seems to prevent
the issue and still allows the downstream hotplug port to notice when a
device is plugged/unplugged.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202593
Fixes: 0e157e528604 ("PCI/PME: Implement runtime PM callbacks")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v4.20+


# 22e4d639 07-Nov-2018 Shunyong Yang <shunyong.yang@hxt-semitech.com>

PCI: pciehp: Add HXT quirk for Command Completed errata

The HXT SD4800 PCI controller does not set the Command Completed bit unless
writes to the Slot Command register change "Control" bits.

Add SD4800 to the quirk.

Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Joey Zheng <yu.zheng@hxt-semitech.com>


# 25bd879e 07-Jan-2019 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Assign ctrl->slot_ctrl before writing it to hardware

Shameerali reported that running v4.20-rc1 as QEMU guest, the PCIe hotplug
port times out during boot:

pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x03f1 (issued 1016 msec ago)
pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x03f1 (issued 1024 msec ago)
pciehp 0000:00:01.0:pcie004: Failed to check link status
pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x02f1 (issued 2520 msec ago)

The issue was bisected down to commit 720d6a671a6e ("PCI: pciehp: Do not
handle events if interrupts are masked") and was further analyzed by the
reporter to be caused by the fact that pciehp first updates the hardware
and only then cache the ctrl->slot_ctrl in pcie_do_write_cmd(). If the
interrupt happens before we cache the value, pciehp_isr() reads value 0 and
decides that the interrupt was not meant for it causing the above timeout
to trigger.

Fix by moving ctrl->slot_ctrl assignment to happen before it is written to
the hardware.

Fixes: 720d6a671a6e ("PCI: pciehp: Do not handle events if interrupts are masked")
Link: https://lore.kernel.org/linux-pci/5FC3163CFD30C246ABAA99954A238FA8387DD344@FRAEML521-MBX.china.huawei.com
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 720d6a67 27-Sep-2018 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Do not handle events if interrupts are masked

PCIe native hotplug shares MSI vector with native PME so the interrupt
handler might get called even the hotplug interrupt is masked. In that case
we should not handle any events because the interrupt was not meant for us.

Modify the PCIe hotplug interrupt handler to check this accordingly and
bail out if it finds out that the interrupt was not about hotplug.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# eb34da60 27-Sep-2018 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Disable hotplug interrupt during suspend

When PCIe hotplug port is transitioned into D3hot, the link to the
downstream component will go down. If hotplug interrupt generation is
enabled when that happens, it will trigger immediately, waking up the
system and bringing the link back up.

To prevent this, disable hotplug interrupt generation when system suspend
is entered. This does not prevent wakeup from low power states according
to PCIe 4.0 spec section 6.7.3.4:

Software enables a hot-plug event to generate a wakeup event by
enabling software notification of the event as described in Section
6.7.3.1. Note that in order for software to disable interrupt generation
while keeping wakeup generation enabled, the Hot-Plug Interrupt Enable
bit must be cleared.

So as long as we have set the slot event mask accordingly, wakeup should
work even if slot interrupt is disabled. The port should trigger wake and
then send PME to the root port when the PCIe hierarchy is brought back up.

Limit this to systems using native PME mechanism to make sure older Apple
systems depending on commit e3354628c376 ("PCI: pciehp: Support interrupts
sent from D3hot") still continue working.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


# f0157160 20-Sep-2018 Keith Busch <kbusch@kernel.org>

PCI: Make link active reporting detection generic

The spec has timing requirements when waiting for a link to become active
after a conventional reset. Implement those hard delays when waiting for
an active link so pciehp and dpc drivers don't need to duplicate this.

For devices that don't support data link layer active reporting, wait the
fixed time recommended by the PCIe spec.

Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>


# 125450f8 08-Sep-2018 Lukas Wunner <lukas@wunner.de>

PCI: hotplug: Embed hotplug_slot

When the PCI hotplug core and its first user, cpqphp, were introduced in
February 2002 with historic commit a8a2069f432c, cpqphp allocated a slot
struct for its internal use plus a hotplug_slot struct to be registered
with the hotplug core and linked the two with pointers:
https://git.kernel.org/tglx/history/c/a8a2069f432c

Nowadays, the predominant pattern in the tree is to embed ("subclass")
such structures in one another and cast to the containing struct with
container_of(). But it wasn't until July 2002 that container_of() was
introduced with historic commit ec4f214232cf:
https://git.kernel.org/tglx/history/c/ec4f214232cf

pnv_php, introduced in 2016, did the right thing and embedded struct
hotplug_slot in its internal struct pnv_php_slot, but all other drivers
cargo-culted cpqphp's design and linked separate structs with pointers.

Embedding structs is preferrable to linking them with pointers because
it requires fewer allocations, thereby reducing overhead and simplifying
error paths. Casting an embedded struct to the containing struct
becomes a cheap subtraction rather than a dereference. And having fewer
pointers reduces the risk of them pointing nowhere either accidentally
or due to an attack.

Convert all drivers to embed struct hotplug_slot in their internal slot
struct. The "private" pointer in struct hotplug_slot thereby becomes
unused, so drop it.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> # drivers/pci/hotplug/rpa*
Acked-by: Sebastian Ott <sebott@linux.ibm.com> # drivers/pci/hotplug/s390*
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> # drivers/platform/x86
Cc: Len Brown <lenb@kernel.org>
Cc: Scott Murray <scott@spiteful.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Corentin Chary <corentin.chary@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>


# 4ff3126e 08-Sep-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Rename controller struct members for clarity

Of the members which were just moved from pciehp's slot struct to the
controller struct, rename "lock" to "state_lock" and rename "work" to
"button_work" for clarity. Perform the rename separately to the
unification of the two structs per Sinan's request.

No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sinan Kaya <okaya@kernel.org>


# 5790a9c7 18-Sep-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Unify controller and slot structs

pciehp was originally introduced together with shpchp in a single
commit, c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI Express hot-plug
drivers"):
https://git.kernel.org/tglx/history/c/c16b4b14d980

shpchp supports up to 31 slots per controller, hence uses separate slot
and controller structs. pciehp has a 1:1 relationship between slot and
controller and therefore never required this separation. Nevertheless,
because much of the code had been copy-pasted between the two drivers,
pciehp likewise uses separate structs to this very day.

The artificial separation of data structures adds unnecessary complexity
and bloat to pciehp and requires constantly chasing pointers at runtime.

Simplify the driver by merging struct slot into struct controller.
Merge the slot constructor pcie_init_slot() and the destructor
pcie_cleanup_slot() into the controller counterparts.

No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 80696f99 08-Sep-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Tolerate Presence Detect hardwired to zero

The WiGig Bus Extension (WBE) specification allows tunneling PCIe over
IEEE 802.11. A product implementing this spec is the wil6210 from
Wilocity (now part of Qualcomm Atheros). It integrates a PCIe switch
with a wireless network adapter:

00.0-+ [1ae9:0101] Upstream Port
+-00.0-+ [1ae9:0200] Downstream Port
| +-00.0 [168c:0034] Atheros AR9462 Wireless Network Adapter
+-02.0 [1ae9:0201] Downstream Port
+-03.0 [1ae9:0201] Downstream Port

Wirelessly attached devices presumably appear below the hotplug ports
with device ID [1ae9:0201]. Oddly, the Downstream Port [1ae9:0200]
leading to the wireless network adapter is likewise Hotplug Capable,
but has its Presence Detect State bit hardwired to zero. Even if the
Link Active bit is set, Presence Detect is zero, so this cannot be
caused by in-band presence detection but only by broken hardware.

pciehp assumes an empty slot if Presence Detect State is zero,
regardless of Link Active being one. Consequently, up until v4.18 it
removes the wireless network adapter in pciehp_resume(). From v4.19 it
already does so in pciehp_probe().

Be lenient towards broken hardware and assume the slot is occupied if
Link Active is set: Introduce pciehp_card_present_or_link_active()
and use it in lieu of pciehp_get_adapter_status() everywhere, except
in pciehp_handle_presence_or_link_change() whose log messages depend
on which of Presence Detect State or Link Active is set.

Remove the Presence Detect State check from __pciehp_enable_slot()
because it is only called if either of Presence Detect State or Link
Active is set.

Caution: There is a possibility that broken hardware exists which has
working Presence Detect but hardwires Link Active to one. On such
hardware the slot will now incorrectly be considered always occupied.
If such hardware is discovered, this commit can be rolled back and a
quirk can be added which sets is_hotplug_bridge = 0 for [1ae9:0200].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200839
Reported-and-tested-by: David Yang <mmyangfl@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rajat Jain <rajatja@google.com>
Cc: Ashok Raj <ashok.raj@intel.com>


# eee6e273 19-Aug-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Drop hotplug_slot_ops wrappers

pciehp's ->enable_slot, ->disable_slot, ->get_attention_status and
->reset_slot callbacks are currently implemented by wrapper functions
that do nothing else but call down to a backend function. The backends
are not called from anywhere else, so drop the wrappers and use the
backends directly as callbacks, thereby shaving off a few lines of
unnecessary code.

No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 7d4ba523 19-Aug-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Drop unnecessary includes

Drop the following includes from pciehp source files which no longer use
any of the included symbols:

* <linux/sched/signal.h> in pciehp.h
<linux/signal.h> in pciehp_hpc.c
Added by commit de25968cc87c ("fix more missing includes") to
accommodate for a call to signal_pending().
The call was removed by commit 262303fe329a ("pciehp: fix wait command
completion").

* <linux/interrupt.h> in pciehp_core.c
Added by historic commit f308a2dfbe63 ("PCI: add PCI Express Port Bus
Driver subsystem") to accommodate for a call to free_irq():
https://git.kernel.org/tglx/history/c/f308a2dfbe63
The call was removed by commit 407f452b05f9 ("pciehp: remove
unnecessary free_irq").

* <linux/time.h> in pciehp_core.c and pciehp_hpc.c
Added by commit 34d03419f03b ("PCIEHP: Add Electro Mechanical
Interlock (EMI) support to the PCIE hotplug driver."),
which was reverted by commit bd3d99c17039 ("PCI: Remove untested
Electromechanical Interlock (EMI) support in pciehp.").

* <linux/module.h> in pciehp_ctrl.c, pciehp_hpc.c and pciehp_pci.c
Added by historic commit c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI
Express hot-plug drivers"):
https://git.kernel.org/tglx/history/c/c16b4b14d980
Module-related symbols were neither used back then in those files,
nor are they used today.

* <linux/slab.h> in pciehp_ctrl.c
Added by commit 5a0e3ad6af86 ("include cleanup: Update gfp.h and
slab.h includes to prepare for breaking implicit slab.h inclusion from
percpu.h") to accommodate for calls to kmalloc().
The calls were removed by commit 0e94916e6091 ("PCI: pciehp: Handle
events synchronously").

* "../pci.h" in pciehp_ctrl.c
Added by historic commit 67f4660b72f2 ("PCI: ASPM patch for") to
accommodate for usage of the global variable pcie_mch_quirk:
https://git.kernel.org/tglx/history/c/67f4660b72f2
The global variable was removed by commit 0ba379ec0fb1 ("PCI: Simplify
hotplug mch quirk").

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 34fb6bf9 05-Sep-2018 Keith Busch <kbusch@kernel.org>

PCI: pciehp: Fix hot-add vs powerfault detection order

If both hot-add and power fault were observed in a single interrupt, we
handled the hot-add first, then the power fault, in this path:

pciehp_ist
if (events & (PDC | DLLSC))
pciehp_handle_presence_or_link_change
case OFF_STATE:
pciehp_enable_slot
__pciehp_enable_slot
board_added
pciehp_power_on_slot
ctrl->power_fault_detected = 0
pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_ON, PCI_EXP_SLTCTL_PCC)
pciehp_green_led_on(p_slot) # power LED on
pciehp_set_attention_status(p_slot, 0) # attention LED off
if ((events & PFD) && !ctrl->power_fault_detected)
ctrl->power_fault_detected = 1
pciehp_set_attention_status(1) # attention LED on
pciehp_green_led_off(slot) # power LED off

This left the attention indicator on (even though the hot-add succeeded)
and the power indicator off (even though the slot power was on).

Fix this by checking for power faults before checking for new devices.

Prior to 0e94916e6091, this was successful because everything was chained
through work queues and the order was:

INT_PRESENCE_ON -> INT_POWER_FAULT -> ENABLE_REQ

The ENABLE_REQ cleared the power fault at the end, but now everything is
handled inline with the interrupt thread, such that the work ENABLE_REQ was
doing happens before power fault handling now.

Fixes: 0e94916e6091 ("PCI: pciehp: Handle events synchronously")
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>


# 4e6a1335 27-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Deduplicate presence check on probe & resume

On driver probe and on resume from system sleep, pciehp checks the
Presence Detect State bit in the Slot Status register to bring up an
occupied slot or bring down an unoccupied slot. Both code paths are
identical, so deduplicate them per Mika's request.

On probe, an additional check is performed to disable power of an
unoccupied slot. This can e.g. happen if power was enabled by BIOS.
It cannot happen once pciehp has taken control, hence is not necessary
on resume: The Slot Control register is set to the same value that it
had on suspend by pci_restore_state(), so if the slot was occupied,
power is enabled and if it wasn't, power is disabled. Should occupancy
have changed during the system sleep transition, power is adjusted by
bringing up or down the slot per the paragraph above.

To allow for deduplication of the presence check, move the power check
to pcie_init(). This seems safer anyway, because right now it is
performed while interrupts are already enabled, and although I can't
think of a scenario where pciehp_power_off_slot() and the IRQ thread
collide, it does feel brittle.

However this means that pcie_init() may now write to the Slot Control
register before the IRQ is requested. If both the CCIE and HPIE bits
happen to be set, pcie_wait_cmd() will wait for an interrupt (instead
of polling the Command Completed bit) and eventually emit a timeout
message. Additionally, if a level-triggered INTx interrupt is used,
the user may see a spurious interrupt splat. Avoid by disabling
interrupts before disabling power. (Normally the HPIE and CCIE bits
should be clear on probe, but conceivably they may already have been
set e.g. by BIOS.)

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# 4417aa45 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Resume parent to D0 on config space access

Ensure accessibility of a hotplug port's config space when accessed via
sysfs by resuming its parent to D0.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>


# 6b08c385 27-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Support interrupts sent from D3hot

If a hotplug port is able to send an interrupt, one would naively assume
that it is accessible at that moment. After all, if it wouldn't be
accessible, i.e. if its parent is in D3hot and the link to the hotplug
port is thus down, how should an interrupt come through?

It turns out that assumption is wrong at least for Thunderbolt: Even
though its parents are in D3hot, a Thunderbolt hotplug port is able to
signal interrupts. Because the port's config space is inaccessible and
resuming the parents may sleep, the hard IRQ handler has to defer
runtime resuming the parents and reading the Slot Status register to the
IRQ thread.

If the hotplug port uses a level-triggered INTx interrupt, it needs to
be masked until the IRQ thread has cleared the signaled events. For
simplicity, this commit also masks edge-triggered MSI/MSI-X interrupts.
Note that if the interrupt is shared (which can only happen for INTx),
other devices are starved from receiving interrupts until the IRQ thread
is scheduled, has runtime resumed the hotplug port's parents and has
read and cleared the Slot Status register.

That delay is dominated by the 10 ms D3hot->D0 transition time of each
parent port. The worst case is a Thunderbolt downstream port at the
end of a daisy chain: There may be up to six Thunderbolt controllers
in-between it and the root port, each comprising an upstream and
downstream port, plus its own upstream port. That's 13 x 10 = 130 ms.
Possible mitigations are polling the interrupt while it's disabled or
reducing the d3_delay of Thunderbolt ports if possible.

Open code masking of the interrupt instead of requesting it with the
IRQF_ONESHOT flag to minimize the period during which it is masked.
(IRQF_ONESHOT unmasks the IRQ only after the IRQ thread has finished.)

PCIe r4.0 sec 6.7.3.4 states that "If wake generation is required by the
associated form factor specification, a hotplug capable Downstream Port
must support generation of a wakeup event (using the PME mechanism) on
hotplug events that occur when the system is in a sleep state or the
Port is in device state D1, D2, or D3Hot."

This would seem to imply that PME needs to be enabled on the hotplug
port when it is runtime suspended. pci_enable_wake() currently doesn't
enable PME on bridges, it may be necessary to add an exemption for
hotplug bridges there. On "Light Ridge" Thunderbolt controllers, the
PME_Status bit is not set when an interrupt occurs while the hotplug
port is in D3hot, even if PME is enabled. (I've tested this on a Mac
and we hardcode the OSC_PCI_EXPRESS_PME_CONTROL bit to 0 on Macs in
negotiate_os_control(), modifying it to 1 didn't change the behavior.)

(Side note: Section 6.7.3.4 also states that "PME and Hot-Plug Event
interrupts (when both are implemented) always share the same MSI or
MSI-X vector". That would only seem to apply to Root Ports, however
the section never mentions Root Ports, only Downstream Ports. This is
explained in the definition of "Downstream Port" in the "Terms and
Acronyms" section of the PCIe Base Spec: "The Ports on a Switch that
are not the Upstream Port are Downstream Ports. All Ports on a Root
Complex are Downstream Ports.")

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>


# 79037824 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Clear spurious events earlier on resume

Thunderbolt hotplug ports that were occupied before system sleep resume
with their downstream link in "off" state. Only after the Thunderbolt
controller has reestablished the PCIe tunnels does the link go up.
As a result, a spurious Presence Detect Changed and/or Data Link Layer
State Changed event occurs.

The events are not immediately acted upon because tunnel reestablishment
happens in the ->resume_noirq phase, when interrupts are still disabled.
Also, notification of events may initially be disabled in the Slot
Control register when coming out of system sleep and is reenabled in the
->resume_noirq phase through:

pci_pm_resume_noirq()
pci_pm_default_resume_early()
pci_restore_state()
pci_restore_pcie_state()

It is not guaranteed that the events are acted upon at all: PCIe r4.0,
sec 6.7.3.4 says that "a port may optionally send an MSI when there are
hot-plug events that occur while interrupt generation is disabled, and
interrupt generation is subsequently enabled." Note the "optionally".

If an MSI is sent, pciehp will gratuitously turn the slot off and back
on once the ->resume_early phase has commenced.

If an MSI is not sent, the extant, unacknowledged events in the Slot
Status register will prevent future notification of presence or link
changes.

Commit 13c65840feab ("PCI: pciehp: Clear Presence Detect and Data Link
Layer Status Changed on resume") fixed the latter by clearing the events
in the ->resume phase. Move this to the ->resume_noirq phase to also
fix the gratuitous disable/enablement of the slot.

The commit further restored the Slot Control register in the ->resume
phase, but that's dispensable because as shown above it's already been
done in the ->resume_noirq phase.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>


# 5b3f7b7d 27-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Avoid slot access during reset

The ->reset_slot callback introduced by commits:

2e35afaefe64 ("PCI: pciehp: Add reset_slot() method") and
06a8d89af551 ("PCI: pciehp: Disable link notification across slot reset")

disables notification of Presence Detect Changed and Data Link Layer
State Changed events for the duration of a secondary bus reset.

However a bus reset not only triggers these events, but may also clear
the Presence Detect State bit in the Slot Status register and the Data
Link Layer Link Active bit in the Link Status register momentarily.
According to Sinan Kaya:

"I know for a fact that bus reset clears the Data Link Layer Active bit
as soon as link goes down. It gets set again following link up.
Presence detect depends on the HW implementation. QDT root ports
don't change presence detect for instance since nobody actually
removed the card. If an implementation supports in-band presence
detect, the answer is yes. As soon as the link goes down, presence
detect bit will get cleared until recovery."
https://lkml.kernel.org/r/42e72f83-3b24-f7ef-e5bc-290fae99259a@codeaurora.org

In-band presence detect is also covered in Table 4-15 in PCIe r4.0,
sec 4.2.6.

pciehp should therefore ensure that any parts of the driver that access
those bits do not run concurrently to a bus reset. The only precaution
the commits took to that effect was to halt interrupt polling. They
made no effort to drain the slot workqueue, cancel an outstanding
Attention Button work, or block slot enable/disable requests via sysfs
and in the ->probe hook.

Now that pciehp is converted to enable/disable the slot exclusively from
the IRQ thread, the only places accessing the two above-mentioned bits
are the IRQ thread and the ->probe hook. Add locking to serialize them
with a bus reset. This obviates the need to halt interrupt polling.
Do not add locking to the ->get_adapter_status sysfs callback to afford
users unfettered access to that bit. Use an rw_semaphore in lieu of a
regular mutex to allow parallel execution of the non-reset code paths
accessing the critical bits, i.e. the IRQ thread and the ->probe hook.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rajat Jain <rajatja@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Sinan Kaya <okaya@kernel.org>


# cdf6b736 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Always enable occupied slot on probe

Per PCIe r4.0, sec 6.7.3.4, a "port may optionally send an MSI when
there are hot-plug events that occur while interrupt generation is
disabled, and interrupt generation is subsequently enabled."

On probe, we currently clear all event bits in the Slot Status register
with the notable exception of the Presence Detect Changed bit. Thereby
we seek to receive an interrupt for an already occupied slot once event
notification is enabled.

But because the interrupt is optional, users may have to specify the
pciehp_force parameter on the command line, which is inconvenient.

Moreover, now that pciehp's event handling has become resilient to
missed events, a Presence Detect Changed interrupt for a slot which is
powered on is interpreted as removal of the card. If the slot has
already been brought up by the BIOS, receiving such an interrupt on
probe causes the slot to be powered off and immediately back on, which
is likewise undesirable.

Avoid both issues by making the behavior of pciehp_force the default and
clearing the Presence Detect Changed bit on probe.

Note that the stated purpose of pciehp_force per the MODULE_PARM_DESC
("Force pciehp, even if OSHP is missing") seems nonsensical because the
OSHP control method is only relevant for SHCP slots according to the
PCI Firmware specification r3.0, sec 4.8.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>


# d331710e 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Become resilient to missed events

A hotplug port's Slot Status register does not count how often each type
of event occurred, it only records the fact *that* an event has occurred.

Previously pciehp queued a work item for each event. But if it missed
an event, e.g. removal of a card in-between two back-to-back insertions,
it queued up the wrong work item or no work item at all. Commit
fad214b0aa72 ("PCI: pciehp: Process all hotplug events before looking
for new ones") sought to improve the situation by shrinking the window
during which events may be missed.

But Stefan Roese reports unbalanced Card present and Link Up events,
suggesting that we're still missing events if they occur very rapidly.
Bjorn Helgaas responds that he considers pciehp's event handling
"baroque" and calls for its simplification and rationalization:
https://lkml.kernel.org/r/20180202192045.GA53759@bhelgaas-glaptop.roam.corp.google.com

It gets worse once a hotplug port is runtime suspended: The port can
signal an interrupt while it and its parents are in D3hot, i.e. while
it is inaccessible. By the time we've runtime resumed all parents to D0
and read the port's Slot Status register, we may have missed an arbitrary
number of events. Event handling therefore needs to be reworked to
become resilient to missed events.

Assume that a Presence Detect Changed event has occurred.
Consider the following truth table:
- Slot is in OFF_STATE and is currently empty. => Do nothing.
(The event is trailing a Link Down or we've
missed an insertion and subsequent removal.)
- Slot is in OFF_STATE and is currently occupied. => Turn the slot on.
- Slot is in ON_STATE and is currently empty. => Turn the slot off.
- Slot is in ON_STATE and is currently occupied. => Turn the slot off,
(Be cautious and assume the card in then back on.
the slot isn't the same as before.)

This leads to the following simple algorithm:
1 If the slot is in ON_STATE, turn it off unconditionally.
2 If the slot is currently occupied, turn it on.

Because those actions are now carried out synchronously, rather than by
scheduled work items, pciehp reacts to the *current* situation and
missed events no longer matter.

Data Link Layer State Changed events can be handled identically to
Presence Detect Changed events. Note that in the above truth table,
a Link Up trailing a Card present event didn't have to be accounted for:
It is filtered out by pciehp_check_link_status().

As for Attention Button Pressed events, PCIe r4.0, sec 6.7.1.5 says:
"Once the Power Indicator begins blinking, a 5-second abort interval
exists during which a second depression of the Attention Button cancels
the operation." In other words, the user can only expect the system to
react to a button press after it starts blinking. Missed button presses
that occur in-between are irrelevant.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefan Roese <sr@denx.de>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>


# 6c35a1ac 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Tolerate initially unstable link

When a device is hotplugged, Presence Detect and Link Up events often do
not occur simultaneously, but with a lag of a few milliseconds. Only
the first event received is relevant, the other one can be disregarded.

Moreover, Stefan Roese reports that on certain platforms, Link State and
Presence Detect may flap for up to 100 ms before stabilizing, suggesting
that such events should be disregarded for at least this long:
https://lkml.kernel.org/r/20180130084121.18653-1-sr@denx.de

On slot enablement, pciehp_check_link_status() waits for 100 ms per
PCIe r4.0, sec 6.7.3.3, then probes the hotplugged device's vendor
register for up to 1 second.

If this succeeds, the link is definitely up, so ignore any Presence
Detect or Link State events that occurred up to this point.

pciehp_check_link_status() then checks the Link Training bit in the
Link Status register. This is the final opportunity to detect
inaccessibility of the device and abort slot enablement. Any link
or presence change that occurs afterwards will cause the slot to be
disabled again immediately after attempting to enable it.

The astute reviewer may appreciate that achieving this behavior would be
more complicated had pciehp not just been converted to enable/disable
the slot exclusively from the IRQ thread: When the slot is enabled via
sysfs, each link or presence flap would otherwise cause the IRQ thread
to run and it would have to sense that those events are belonging to a
concurrent slot enablement operation and disregard them. It would be
much more difficult than this mere 3 line change.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefan Roese <sr@denx.de>


# 1656716d 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Drop enable/disable lock

Previously slot enablement and disablement could happen concurrently.
But now it's under the exclusive control of the IRQ thread, rendering
the locking obsolete. Drop it.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 32a8cef2 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Enable/disable exclusively from IRQ thread

Besides the IRQ thread, there are several other places in the driver
which enable or disable the slot:

- pciehp_probe() enables the slot if it's occupied and the pciehp_force
module parameter is used.

- pciehp_resume() enables or disables the slot after system sleep.

- pciehp_queue_pushbutton_work() enables or disables the slot after the
5 second delay following an Attention Button press.

- pciehp_sysfs_enable_slot() and pciehp_sysfs_disable_slot() enable or
disable the slot on sysfs write.

This requires locking and complicates pciehp's state machine.

A simplification can be achieved by enabling and disabling the slot
exclusively from the IRQ thread.

Amend the functions listed above to request slot enable/disablement from
the IRQ thread by either synthesizing a Presence Detect Changed event or,
in the case of a disable user request (via sysfs or an Attention Button
press), submitting a newly introduced force disable request. The latter
is needed because the slot shall be forced off despite being occupied.
For this force disable request, avoid colliding with Slot Status register
bits by using a bit number greater than 16.

For synchronous execution of requests (on sysfs write), wait for the
request to finish and retrieve the result. There can only ever be one
sysfs write in flight due to the locking in kernfs_fop_write(), hence
there is no risk of returning the result of a different sysfs request to
user space.

The POWERON_STATE and POWEROFF_STATE is now no longer entered by the
above-listed functions, but solely by the IRQ thread when it begins a
power transition. Afterwards, it moves to STATIC_STATE. The same
applies to canceling the Attention Button work, it likewise becomes an
IRQ thread only operation.

An immediate consequence is that the POWERON_STATE and POWEROFF_STATE is
never observed by the IRQ thread itself, only by functions called in a
different context, such as pciehp_sysfs_enable_slot(). So remove
handling of these states from pciehp_handle_button_press() and
pciehp_handle_link_change() which are exclusively called from the IRQ
thread.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 9590192f 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Track enable/disable status

handle_button_press_event() currently determines whether the slot has
been turned on or off by looking at the Power Controller Control bit in
the Slot Control register. This assumes that an attention button
implies presence of a power controller even though that's not mandated
by the spec. Moreover the Power Controller Control bit is unreliable
when a power fault occurs (PCIe r4.0, sec 6.7.1.8). This issue has
existed since the driver was introduced in 2004.

Fix by replacing STATIC_STATE with ON_STATE and OFF_STATE and tracking
whether the slot has been turned on or off. This is also a required
ingredient to make pciehp resilient to missed events, which is the
object of an upcoming commit.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 55a6b7a6 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Drop slot workqueue

Previously the slot workqueue was used to handle events and enable or
disable the slot. That's no longer the case as those tasks are done
synchronously in the IRQ thread. The slot workqueue is thus merely used
to handle a button press after the 5 second delay and only one such work
item may be in flight at any given time. A separate workqueue isn't
necessary for this simple task, so use the system workqueue instead.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 0e94916e 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Handle events synchronously

Up until now, pciehp's IRQ handler schedules a work item for each event,
which in turn schedules a work item to enable or disable the slot. This
double indirection was necessary because sleeping wasn't allowed in the
IRQ handler.

However it is now that pciehp has been converted to threaded IRQ handling
and polling, so handle events synchronously in pciehp_ist() and remove
the work item infrastructure (with the exception of work items to handle
a button press after the 5 second delay).

For link or presence change events, move the register read to determine
the current link or presence state behind acquisition of the slot lock
to prevent it from becoming stale while the lock is contended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# ec07a447 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Convert to threaded polling

We've just converted pciehp to threaded IRQ handling, but still cannot
sleep in pciehp_ist() because the function is also called in poll mode,
which runs in softirq context (from a timer).

Convert poll mode to a kthread so that pciehp_ist() always runs in task
context.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>


# 7b4ce26b 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Convert to threaded IRQ

pciehp's IRQ handler queues up a work item for each event signaled by
the hardware. A more modern alternative is to let a long running
kthread service the events. The IRQ handler's sole job is then to check
whether the IRQ originated from the device in question, acknowledge its
receipt to the hardware to quiesce the interrupt and wake up the kthread.

One benefit is reduced latency to handle the IRQ, which is a necessity
for realtime environments. Another benefit is that we can make pciehp
simpler and more robust by handling events synchronously in process
context, rather than asynchronously by queueing up work items. pciehp's
usage of work items is a historic artifact, it predates the introduction
of threaded IRQ handlers by two years. (The former was introduced in
2007 with commit 5d386e1ac402 ("pciehp: Event handling rework"), the
latter in 2009 with commit 3aa551c9b4c4 ("genirq: add threaded interrupt
handler support").)

Convert pciehp to threaded IRQ handling by retrieving the pending events
in pciehp_isr(), saving them for later consumption by the thread handler
pciehp_ist() and clearing them in the Slot Status register.

By clearing the Slot Status (and thereby acknowledging the events) in
pciehp_isr(), we can avoid requesting the IRQ with IRQF_ONESHOT, which
would have the unpleasant side effect of starving devices sharing the
IRQ until pciehp_ist() has finished.

pciehp_isr() does not count how many times each event occurred, but
merely records the fact *that* an event occurred. If the same event
occurs a second time before pciehp_ist() is woken, that second event
will not be recorded separately, which is problematic according to
commit fad214b0aa72 ("PCI: pciehp: Process all hotplug events before
looking for new ones") because we may miss removal of a card in-between
two back-to-back insertions. We're about to make pciehp_ist() resilient
to missed events. The present commit regresses the driver's behavior
temporarily in order to separate the changes into reviewable chunks.
This doesn't affect regular slow-motion hotplug, only plug-unplug-plug
operations that happen in a timespan shorter than wakeup of the IRQ
thread.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>


# 1204e35b 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Fix unprotected list iteration in IRQ handler

Commit b440bde74f04 ("PCI: Add pci_ignore_hotplug() to ignore hotplug
events for a device") iterates over the devices on a hotplug port's
subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem.
It is thus possible for a user to cause a crash by concurrently
manipulating the device list, e.g. by disabling slot power via sysfs
on a different CPU or by initiating a remove/rescan via sysfs.

This can't be fixed by acquiring pci_bus_sem because it may sleep.
The simplest fix is to avoid the list iteration altogether and just
check the ignore_hotplug flag on the port itself. This works because
pci_ignore_hotplug() sets the flag both on the device as well as on its
parent bridge.

We do lose the ability to print the name of the device blocking hotplug
in the debug message, but that's probably bearable.

Fixes: b440bde74f04 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org


# 281e878e 19-Jul-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Fix use-after-free on unplug

When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the
hotplug_slot struct is deregistered and thus freed before freeing the
IRQ. The IRQ handler and the work items it schedules print the slot
name referenced from the freed structure in various informational and
debug log messages, each time resulting in a quadruple dereference of
freed pointers (hotplug_slot -> pci_slot -> kobject -> name).

At best the slot name is logged as "(null)", at worst kernel memory is
exposed in logs or the driver crashes:

pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present

An attacker may provoke the bug by unplugging multiple devices on a
Thunderbolt daisy chain at once. Unplugging can also be simulated by
powering down slots via sysfs. The bug is particularly easy to trigger
in poll mode.

It has been present since the driver's introduction in 2004:
https://git.kernel.org/tglx/history/c/c16b4b14d980

Fix by rearranging teardown such that the IRQ is freed first. Run the
work items queued by the IRQ handler to completion before freeing the
hotplug_slot struct by draining the work queue from the ->release_slot
callback which is invoked by pci_hp_deregister().

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.6.4


# 381634ca 19-Jul-2018 Sinan Kaya <okaya@codeaurora.org>

PCI: Hide pci_reset_bridge_secondary_bus() from drivers

Rename pci_reset_bridge_secondary_bus() to pci_bridge_secondary_bus_reset()
and move the declaration from linux/pci.h to drivers/pci.h to be used
internally in PCI directory only.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 18426238 19-Jul-2018 Sinan Kaya <okaya@codeaurora.org>

PCI: Handle error return from pci_reset_bridge_secondary_bus()

Commit 01fd61c0b9bd ("PCI: Add a return type for
pci_reset_bridge_secondary_bus()") added a return value to the function to
return if a device is accessible following a reset. Callers are not
checking the value.

Pass error code up high in the stack if device is not accessible.

Fixes: 01fd61c0b9bd ("PCI: Add a return type for pci_reset_bridge_secondary_bus()")
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 13c65840 23-May-2018 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume

After a suspend/resume cycle the Presence Detect or Data Link Layer Status
Changed bits might be set. If we don't clear them those events will not
fire anymore and nothing happens for instance when a device is now
hot-unplugged.

Fix this by clearing those bits in a newly introduced function
pcie_reenable_notification(). This should be fine because immediately
after, we check if the adapter is still present by reading directly from
the status register.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@vger.kernel.org


# 9f5a70f1 17-May-2018 Oza Pawandeep <poza@codeaurora.org>

PCI: Add generic pcie_wait_for_link() interface

Clients such as hotplug and Downstream Port Containment (DPC) both need to
wait until a link becomes active or inactive.

Add a generic pcie_wait_link_active() interface and use it instead of
duplicating the code.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>


# d22b3621 03-May-2018 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Add quirk for Command Completed errata

Several PCIe hotplug controllers have errata that mean they do not set the
Command Completed bit unless writes to the Slot Command register change
"Control" bits. Command Completed is never set for writes that only change
software notification "Enable" bits. This results in timeouts like this:

pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038 (issued 65284 msec ago)

When this erratum is present, avoid these timeouts by marking commands
"completed" immediately unless they change the "Control" bits.

Here's the text of the Intel erratum CF118. We assume this applies to all
Intel parts:

CF118 PCIe Slot Status Register Command Completed bit not always
updated on any configuration write to the Slot Control
Register

Problem: For PCIe root ports (devices 0 - 10) supporting hot-plug,
the Slot Status Register (offset AAh) Command Completed
(bit[4]) status is updated under the following condition:
IOH will set Command Completed bit after delivering the new
commands written in the Slot Controller register (offset
A8h) to VPP. The IOH detects new commands written in Slot
Control register by checking the change of value for Power
Controller Control (bit[10]), Power Indicator Control
(bits[9:8]), Attention Indicator Control (bits[7:6]), or
Electromechanical Interlock Control (bit[11]) fields. Any
other configuration writes to the Slot Control register
without changing the values of these fields will not cause
Command Completed bit to be set.

The PCIe Base Specification Revision 2.0 or later describes
the “Slot Control Register” in section 7.8.10, as follows
(Reference section 7.8.10, Slot Control Register, Offset
18h). In hot-plug capable Downstream Ports, a write to the
Slot Control register must cause a hot-plug command to be
generated (see Section 6.7.3.2 for details on hot-plug
commands). A write to the Slot Control register in a
Downstream Port that is not hotplug capable must not cause a
hot-plug command to be executed.

The PCIe Spec intended that every write to the Slot Control
Register is a command and expected a command complete status
to abstract the VPP implementation specific nuances from the
OS software. IOH PCIe Slot Control Register implementation
is not fully conforming to the PCIe Specification in this
respect.

Implication: Software checking on the Command Completed status after
writing to the Slot Control register may time out.

Workaround: Software can read the Slot Control register and compare the
existing and new values to determine if it should check the
Command Completed status after writing to the Slot Control
register.

Per Sinan, the Qualcomm QDF2400 controller also does not set the Command
Completed bit unless writes to the Slot Command register change "Control"
bits.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Link: https://lkml.kernel.org/r/8770820b-85a0-172b-7230-3a44524e6c9f@molgen.mpg.de
Reported-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de> # Lenovo X60
Tested-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de> # Lenovo X60
Signed-off-by: Sinan Kaya <okaya@codeaurora.org> # Qcom quirk
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# 736759ef 26-Jan-2018 Bjorn Helgaas <bhelgaas@google.com>

PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate

Add SPDX GPL-2.0+ to all PCI files that specified the GPL and allowed
either GPL version 2 or any later version.

Remove the boilerplate GPL version 2 or later language, relying on the
assertion in b24413180f56 ("License cleanup: add SPDX GPL-2.0 license
identifier to files with no license") that the SPDX identifier may be used
instead of the full boilerplate text.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 493fb50e 17-Jan-2018 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Assume NoCompl+ for Thunderbolt ports

Certain Thunderbolt 1 controllers claim to support Command Completed events
(value of 0b in the No Command Completed Support field of the Slot
Capabilities register) but in reality they neither set the Command
Completed bit in the Slot Status register nor signal a Command Completed
interrupt:

8086:1513 CV82524 [Light Ridge 4C 2010]
8086:151a DSL2310 [Eagle Ridge 2C 2011]
8086:151b CVL2510 [Light Peak 2C 2010]
8086:1547 DSL3510 [Cactus Ridge 4C 2012]
8086:1548 DSL3310 [Cactus Ridge 2C 2012]
8086:1549 DSL2210 [Port Ridge 1C 2011]

All known newer chips (Redwood Ridge and onwards) set No Command Completed
Support, indicating that they do not support Command Completed events.

The user-visible impact is that after unplugging such a device, 2 seconds
elapse until pciehp is unbound. That's because on ->remove,
pcie_write_cmd() is called via pcie_disable_notification() and every call
to pcie_write_cmd() takes 2 seconds (1 second for each invocation of
pcie_wait_cmd()):

[ 337.942727] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x1038 (issued 21176 msec ago)
[ 340.014735] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x0000 (issued 2072 msec ago)

That by itself has always been unpleasant, but the situation has become
worse with commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during
shutdown"): Now pciehp is unbound on ->shutdown. Because Thunderbolt
controllers typically have 4 hotplug ports, every reboot and shutdown is
now delayed by 8 seconds, plus another 2 seconds for every attached
Thunderbolt 1 device.

Thunderbolt hotplug slots are not physical slots that one inserts cards
into, but rather logical hotplug slots implemented in silicon. Devices
appear beyond those logical slots once a PCI tunnel is established on top
of the Thunderbolt Converged I/O switch. One would expect commands written
to the Slot Control register to be executed immediately by the silicon, so
for simplicity we always assume NoCompl+ for Thunderbolt ports.

Fixes: cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org # v4.12+
Cc: Sinan Kaya <okaya@codeaurora.org>
Cc: Yehezkel Bernat <yehezkel.bernat@intel.com>
Cc: Michael Jamet <michael.jamet@intel.com>
Cc: Andreas Noever <andreas.noever@gmail.com>


# c7abb235 28-Dec-2017 Markus Elfring <elfring@users.sourceforge.net>

PCI: Remove unnecessary messages for memory allocation failures

Per ebfdc40969f2 ("checkpatch: attempt to find unnecessary 'out of memory'
messages"), when a memory allocation fails, the memory subsystem emits
generic "out of memory" messages (see slab_out_of_memory() for some of this
logging). Therefore, additional error messages in the caller don't add
much value.

Remove messages that merely report "out of memory".

This preserves some messages that report additional information, e.g.,
allocation failures that mean we drop hotplug events.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[bhelgaas: changelog, squash patches, make similar changes to acpiphp,
cpqphp, ibmphp, keep warning when dropping hotplug event]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# db63d400 13-Oct-2017 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Do not clear Presence Detect Changed during initialization

It is possible that the hotplug event has already happened before the
driver is attached to a PCIe hotplug downstream port. If we just clear the
status we never get the hotplug interrupt and thus the event will be
missed.

To make sure that does not happen, we leave Presence Detect Changed bit
untouched during initialization. Then once the event is unmasked we get an
interrupt and handle the hotplug event properly.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 49902239 13-Oct-2017 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI: pciehp: Fix race condition handling surprise link down

A surprise link down may retrain very quickly causing the same slot
generate a link up event before handling the link down event completes.

Since the link is active, the power off work queued from the first link
down will cause a second down event when power is disabled. However, the
link up event sets the slot state to POWERON_STATE before the event to
handle this is enqueued, making the second down event believe it needs to
do something.

This creates constant link up and down event cycle.

To prevent this it is better to handle each event at the time in order it
occurred, so change the driver to use ordered workqueue instead.

A normal device hotplug triggers two events (presense detect and link up)
that are already handled properly in the driver but we currently log an
error if we find an existing device in the slot. Since this is not an error
change the log level to be debug instead to avoid scaring users.

This is based on the original work by Ashok Raj.

Link: https://patchwork.kernel.org/patch/9469023
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# c4459a08 04-Oct-2017 Kees Cook <keescook@chromium.org>

PCI: pciehp: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. This fixes what appears to be a bug
in passing the wrong pointer to the timer handler (address of ctrl pointer
instead of ctrl pointer).

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>


# 7612b3b2 01-Aug-2017 Keith Busch <kbusch@kernel.org>

PCI: pciehp: Report power fault only once until we clear it

When a power fault occurs, the power controller sets Power Fault Detected
in the Slot Status register, and pciehp_isr() queues an INT_POWER_FAULT
event to handle it.

It also clears Power Fault Detected, but since nothing has yet changed to
correct the power fault, the power controller will likely set it again
immediately, which may cause an infinite loop when pcie_isr() rechecks
Slot Status.

Fix that by masking off Power Fault Detected from new events if the driver
hasn't seen the power fault clear from the previous handling attempt.

Fixes: fad214b0aa72 ("PCI: pciehp: Process all hotplug events before looking for new ones")
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog, pull test out and add comment]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: stable@vger.kernel.org # 4.9+


# 385895fe 19-Nov-2016 Ashok Raj <ashok.raj@intel.com>

PCI: pciehp: Prioritize data-link event over presence detect

If Slot Status indicates changes in both Data Link Layer Status and
Presence Detect, prioritize the Link status change.

When both events are observed, pciehp currently relies on the Slot Status
Presence Detect State (PDS) to agree with the Link Status Data Link Layer
Active status. The Presence Detect State, however, may be set to 1 through
out-of-band presence detect even if the link is down, which creates
conflicting events.

Since the Link Status accurately reflects the reachability of the
downstream bus, the Link Status event should take precedence over a
Presence Detect event. Skip checking the PDC status if we handled a link
event in the same handler.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>


# 576243b3 13-Sep-2016 Keith Busch <kbusch@kernel.org>

PCI: pciehp: Allow exclusive userspace control of indicators

PCIe hotplug supports optional Attention and Power Indicators, which are
used internally by pciehp. Users can't control the Power Indicator, but
they can control the Attention Indicator by writing to a sysfs "attention"
file.

The Slot Control register has two bits for each indicator, and the PCIe
spec defines the encodings for each as (Reserved/On/Blinking/Off). For
sysfs "attention" writes, pciehp_set_attention_status() maps into these
encodings, so the only useful write values are 0 (Off), 1 (On), and 2
(Blinking).

However, some platforms use all four bits for platform-specific indicators,
and they need to allow direct user control of them while preventing pciehp
from using them at all.

Add a "hotplug_user_indicators" flag to the pci_dev structure. When set,
pciehp does not use either the Attention Indicator or the Power Indicator,
and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
are written directly to the Attention Indicator Control and Power Indicator
Control fields.

[bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 6e49b304 08-Sep-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Clean up dmesg "Slot(%s)" messages

Print slot name consistently as "Slot(%s)". I don't know whether that's
ideal, but we can at least do it the same way all the time. No functional
change intended.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# 49477939 08-Sep-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Remove unnecessary guard

In pcie_isr(), we return early if no status bits other than
PCI_EXP_SLTSTA_CC are set. This was introduced by dbd79aed1aea ("pciehp:
fix NULL dereference in interrupt handler"), but it is no longer necessary
because all the subsequent pcie_isr() code is already predicated on a
status bit being set.

Remove the unnecessary test for ~PCI_EXP_SLTSTA_CC. No functional change
intended.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# 0c923d1d 09-Sep-2016 Mayurkumar Patel <mayurkumar.patel@intel.com>

PCI: pciehp: Don't re-read Slot Status when queuing hotplug event

Previously we read Slot Status to learn about hotplug events, then cleared
the events, then re-read Slot Status to find out what happened. But Slot
Status might have changed before the second read.

Capture the Slot Status once before clearing the events. Also capture the
Link Status if we had a link status change.

[bhelgaas: changelog, split to separate patch]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# fad214b0 08-Sep-2016 Mayurkumar Patel <mayurkumar.patel@intel.com>

PCI: pciehp: Process all hotplug events before looking for new ones

Previously we accumulated hotplug events, then processed them, essentially
like this:

events = 0
do {
status = read(Slot Status)
status &= EVENT_MASK # only look at events
events |= status # accumulate events
write(Slot Status, events) # clear events
} while (status)
process events

The problem is that as soon as we clear events in Slot Status, the hardware
may send notifications for new events, and we lose information about the
first events. For example, we might see two Presence Detect Changed
events, but lose the fact that the slot was temporarily empty:

read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS clear # slot empty
write PCI_EXP_SLTSTA_PDC # clear PDC event
read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS set # slot occupied

The current code does not process a removal; it only processes the
insertion, which fails because we didn't remove the original device.

To avoid this problem, read Slot Status once and process all the events
before reading it again, like this:

do {
read events
clear events
process events
} while (events)

[bhelgaas: changelog, add external loop around pciehp_isr()]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# 70e8b401 08-Sep-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Return IRQ_NONE when we can't read interrupt status

After 1469d17dd341 ("PCI: pciehp: Handle invalid data when reading from
non-existent devices"), we returned IRQ_HANDLED when we failed to read
interrupt status from the bridge. I think it's better to return IRQ_NONE,
as we do in other cases where there's no interrupt pending. This will
facilitate refactoring the loop in pcie_isr(): we'll be able to call the
ISR in a loop as long as it returns IRQ_HANDLED.

Return IRQ_NONE if we couldn't read interrupt status.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# a8499f20 08-Sep-2016 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Rename pcie_isr() locals for clarity

Rename "detected" and "intr_loc" to "status" and "events" for clarity.
"status" is the value we read from the Slot Status register; "events" is
the set of hot-plug events we need to process. No functional change
intended.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


# ed91de7e 13-May-2016 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Ignore interrupts during D3cold

If a hotplug port is suspended to D3cold, its slot status register cannot
be read. If that hotplug port happens to share its IRQ with other devices,
whenever an interrupt occurs for one of these devices, pciehp logs a
"no response from device" message and tries to read the PCI_EXP_SLTSTA
register, even though we know that will fail.

Ignore interrupts while we're in D3cold.

[bhelgaas: changelog]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 2db0f71f 01-Jul-2015 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Remove ignored MRL sensor interrupt events

We queued interrupt events for the MRL being opened or closed, but the code
in interrupt_event_handler() that handles these events ignored them.

Stop enabling MRL interrupts and remove the ignored events.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 1469d17d 20-Jul-2015 Jarod Wilson <jarod@redhat.com>

PCI: pciehp: Handle invalid data when reading from non-existent devices

It's platform-dependent, but an MMIO read to a non-existent PCI device
generally returns data with all bits set. This happens when the host
bridge or Root Complex times out waiting for a response from the device and
fabricates return data to complete the CPU's read.

One example, reported in the bugzilla below, involved this hierarchy:

pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
pci 0000:06:00.0: PCI bridge to [bus 07] Thunderbolt Downstream Port
pci 0000:07:00.0: BCM57762 NIC

Unplugging the Thunderbolt switch and the NIC below it resulted in this:

pciehp 0000:03:03.0: Surprise Removal
tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
pciehp 0000:06:00.0: unloading service driver pciehp
pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
pciehp 0000:06:00.0: Switch interrupt received
pciehp 0000:06:00.0: Latch open on Slot
pciehp 0000:06:00.0: Attention button interrupt received
pciehp 0000:06:00.0: Button pressed on Slot
pciehp 0000:06:00.0: Presence/Notify input change
pciehp 0000:06:00.0: Card present on Slot
pciehp 0000:06:00.0: Power fault interrupt received
pciehp 0000:06:00.0: Data Link Layer State change
pciehp 0000:06:00.0: Link Up event

The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
remove methods.

Since the NIC was already gone, tg3 received 0xffffffff when it tried to
read from the device. The resulting timeout is a tg3 issue and not of
interest here.

Similarly, since the 06:00.0 Thunderbolt switch was already gone,
pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
thought that was valid status showing that many events had happened: the
latch had been opened, the attention button had been pressed, a card was
now present, and the link was now up. These are all wrong, of course, but
pciehp went on to try to power up and enumerate devices below the
non-existent bridge:

pciehp 0000:06:00.0: PCI slot - powering on due to button press
pciehp 0000:06:00.0: Surprise Insertion
pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff

[bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# ac10836b 19-Jun-2015 Yijing Wang <wangyijing@huawei.com>

PCI: pciehp: Simplify pcie_poll_cmd()

Move first slot status read into while to simplify code.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 4f092fec 14-Jun-2015 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Inline the "handle event" functions into the ISR

The pciehp_handle_*() functions (pciehp_handle_attention_button(), etc.)
only contain a line or two of useful code, so it's clumsy to put
them in separate functions. All they so is add an event to a work queue,
and it's clearer to see that directly in the ISR.

Inline them directly into pcie_isr(). No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rajat Jain <rajatja@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>


# 3784e0c6 15-Jun-2015 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Clean up debug logging

The pciehp debug logging is overly verbose and often redundant. Almost all
of the information printed by dbg_ctrl() is also printed by the normal PCI
core enumeration code and by pcie_init().

Remove the redundant debug info.

When claiming a pciehp bridge, we print the slot characteristics, e.g.,

Slot #6 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl+ LLActRep+

Add the Hot-Plug Capable and Hot-Plug Surprise bits to this information,
and print it all in the same order as lspci does.

No functional change except the message text changes.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rajat Jain <rajatja@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>


# a5dd4b4b 08-Jun-2015 Alex Williamson <alex.williamson@redhat.com>

PCI: pciehp: Wait for hotplug command completion where necessary

The commit referenced below deferred waiting for command completion until
the start of the next command, allowing hardware to do the latching
asynchronously. Unfortunately, being ready to accept a new command is the
only indication we have that the previous command is completed. In cases
where we need that state change to be enabled, we must still wait for
completion. For instance, pciehp_reset_slot() attempts to disable anything
that might generate a surprise hotplug on slots that support presence
detection. If we don't wait for those settings to latch before the
secondary bus reset, we negate any value in attempting to prevent the
spurious hotplug.

Create a base function with optional wait and helper functions so that
pcie_write_cmd() turns back into the "safe" interface which waits before
and after issuing a command and add pcie_write_cmd_nowait(), which
eliminates the trailing wait for asynchronous completion. The following
functions are returned to their previous behavior:

pciehp_power_on_slot
pciehp_power_off_slot
pcie_disable_notification
pciehp_reset_slot

The rationale is that pciehp_power_on_slot() enables the link and therefore
relies on completion of power-on. pciehp_power_off_slot() and
pcie_disable_notification() need a wait because data structures may be
freed after these calls and continued signaling from the device would be
unexpected. And, of course, pciehp_reset_slot() needs to wait for the
scenario outlined above.

Fixes: 3461a068661c ("PCI: pciehp: Wait for hotplug command completion lazily")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.17+


# 31ff2a5e 22-Aug-2014 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Stop disabling notifications during init

During pciehp initialization, we previously wrote two hotplug commands:

pciehp_probe
pcie_init
pcie_disable_notification
pcie_write_cmd # command 1
pcie_init_notification
pcie_enable_notification
pcie_write_cmd # command 2

For controllers with errata like Intel CF118, we previously waited for a
timeout before issuing the second hotplug command because the first command
only updates interrupt enable bits and is not a "real" hotplug command, so
the controller doesn't report Command Completed for it.

But there's no need to disable notifications in the first place. If BIOS
left them enabled, we could easily take an interrupt before disabling them,
so there's no benefit in disabling them for the tiny window before we
enable them.

Drop the unnecessary pcie_disable_notification() call.

[bhelgaas: changelog]
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# cf8d7b58 22-Sep-2014 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Add more Slot Control debug output

Add more Slot Control debug output and move one print after
pcie_write_cmd() to be consistent with other debug output.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# d433889c 22-Sep-2014 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Fix wait time in timeout message

When we warned about a timeout on a hotplug command, we previously printed
the time between calls to pcie_write_cmd(), without accounting for any time
spent actually waiting. Consider this sequence:

pcie_write_cmd
write SLTCTL
cmd_started = jiffies # T1

pcie_write_cmd
pcie_wait_cmd
now = jiffies # T2
wait_event_timeout # we may wait here
if (timeout)
ctrl_info("Timeout on command issued %u msec ago",
jiffies_to_msecs(now - cmd_started))

We previously printed (T2 - T1), but that doesn't include the time spent in
wait_event_timeout().

Fix this by using the current jiffies value, not the one cached before
calling wait_event_timeout().

[bhelgaas: changelog, use current jiffies instead of adding timeout]
Fixes: 40b960831cfa ("PCI: pciehp: Compute timeout from hotplug command start time")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 7cbeb9f9 22-Sep-2014 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Fix pcie_wait_cmd() timeout

pcie_poll_cmd() take msecs instead of jiffies, so convert timeout to msecs.

Fixes: 40b960831cfa ("PCI: pciehp: Compute timeout from hotplug command start time")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# d537a3ab 15-Aug-2014 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Reduce PCIe slot_ctrl to 16 bits

4283c70e91dc ("PCI: pciehp: Make pcie_wait_cmd() self-contained") added
a cache of the most recent command written to the Slot Control register.
This register is only 16 bits wide, but the cache ("slot_ctrl") is 32 bits.

Reduce slot_ctrl to a u16 so it matches the register size. No functional
change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# b440bde7 10-Sep-2014 Bjorn Helgaas <bhelgaas@google.com>

PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device

Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
normally generates a hot-remove event that unbinds the driver.

Some drivers expect to remain bound to a device even while they power it
off and back on again. This can be dangerous, because if the device is
removed or replaced while it is powered off, the driver doesn't know that
anything changed. But some drivers accept that risk.

Add pci_ignore_hotplug() for use by drivers that know their device cannot
be removed. Using pci_ignore_hotplug() tells the PCI core that hot-plug
events for the device should be ignored.

The radeon and nouveau drivers use this to switch between a low-power,
integrated GPU and a higher-power, higher-performance discrete GPU. They
power off the unused GPU, but they want to remain bound to it.

This is a reimplementation of f244d8b623da ("ACPIPHP / radeon / nouveau:
Fix VGA switcheroo problem related to hotplug") but extends it to work with
both acpiphp and pciehp.

This fixes a problem where systems with dual GPUs using the radeon drivers
become unusable, freezing every few seconds (see bugzillas below). The
resume of the radeon device may also fail, e.g.,

This fixes problems on dual GPU systems where the radeon driver becomes
unusable because of problems while suspending the device, as in bug 79701:

[drm] radeon: finishing device.
radeon 0000:01:00.0: Userspace still has active objects !
radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
...
WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
trying to unbind memory from uninitialized GART !

or while resuming it, as in bug 77261:

radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
radeon 0000:01:00.0: GPU lockup ...
radeon 0000:01:00.0: GPU pci config reset
pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
radeon 0000:01:00.0: GPU reset succeeded, trying to resume
*ERROR* radeon: dpm resume failed
radeon 0000:01:00.0: Wait for MC idle timedout !

Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
CC: stable@vger.kernel.org # v3.15+


# 0d25d35c 17-Jun-2014 Myron Stowe <myron.stowe@redhat.com>

PCI: pciehp: Clear Data Link Layer State Changed during init

During PCIe hot-plug initialization - pciehp_probe() - data structures
related to slot capabilities are set up. As part of this set up, ISRs are
put in place to handle slot events and all event bits are cleared out.

This patch adds the Data Link Layer State Changed (PCI_EXP_SLTSTA_DLLSC)
Slot Status bit to the event bits that are cleared out during
initialization.

If the BIOS doesn't clear DLLSC before handoff to the OS, pciehp notices
that it's set and interprets it as a new Link Up event, which results in
spurious messages:

pciehp 0000:82:04.0:pcie24: slot(4): Link Up event
pciehp 0000:82:04.0:pcie24: Device 0000:83:00.0 already exists at 0000:83:00, cannot hot-add
pciehp 0000:82:04.0:pcie24: Cannot add device at 0000:83:00

Prior to e48f1b67f668 ("PCI: pciehp: Use link change notifications for
hot-plug and removal"), pciehp ignored DLLSC.

Reference:
PCI-SIG. PCI Express Base Specification Revision 4.0 Version 0.3
(PCI-SIG, 2014): 7.8.11. Slot Status Register (Offset 1Ah).

[bhelgaas: add e48f1b67f668 ref and stable tag]
Fixes: e48f1b67f668 ("PCI: pciehp: Use link change notifications for hot-plug and removal")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79611
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.15+


# 6c1a32e0 26-Jun-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Remove struct controller.no_cmd_complete

"no_cmd_complete" is only used once, and it duplicates read-only
information we already have in the cached Slot Capabilities value.

Remove the field and use the existing macro NO_CMD_CMPL() instead.

[bhelgaas: changelog]
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 2cc56f30 14-Jun-2014 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Remove assumptions about which commands cause completion events

We use incorrect logic to decide whether a PCIe hotplug controller
generates command completion events.

5808639bfa98 ("pciehp: fix slow probing") assumed that the Slot Status
"Command Completed" bit was set only for commands affecting slot power,
indicators, or electromechanical interlock. That assumption is false: per
sec. 6.7.3.2 of PCIe spec r3.0, a write targeting any portion of the Slot
Control register is a command, and (if command completed events are
supported) software must wait for a command to complete before issuing the
next command.

5808639bfa98 was to fix boot-time timeouts (see bugzilla below) on a Lenovo
Thinkpad R61 with an Intel hotplug controller. The controller probably has
the Intel CF118 erratum, which means it doesn't report Command Completed
unless the Slot Control power, indicator, or interlock bits are changed.
This causes a timeout because pciehp always waits for Command Complete (if
supported), regardless of which bits are changed.

Remove the incorrect logic because the timeouts have been addressed
differently by these changes:

PCI: pciehp: Wait for hotplug command completion lazily
PCI: pciehp: Compute timeout from hotplug command start time

Link: https://bugzilla.kernel.org/show_bug.cgi?id=10751
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>


# 40b96083 14-Jun-2014 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Compute timeout from hotplug command start time

If we issue a hotplug command, go do something else, then come back and
wait for the command to complete, we don't have to wait the whole timeout
period, because some of it elapsed while we were doing something else.

Keep track of the time we issued the command, and wait only until the
timeout period from that point has elapsed.

For controllers with errata like Intel CF118, we previously timed out
before issuing the second hotplug command:

At time T1 (during boot):
- Write DLLSCE, ABPE, PDCE, etc. to Slot Control
At time T2 (hotplug event):
- Wait for command completion (CC) in Slot Status
- Timeout at T2 + 1 second because CC is never set in Slot Status
- Write PCC, PIC, etc. to Slot Control

With this change, we wait until T1 + 1 second instead of T2 + 1 second.
If the hotplug event is more than 1 second after the boot-time
initialization, we won't wait for the timeout at all.

We still emit a "Timeout on hotplug command" message if it timed out; we
should see this on the first hotplug event on every controller with this
erratum, as well as on real errors on controllers without the erratum.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>


# 3461a068 13-Jun-2014 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Wait for hotplug command completion lazily

Previously we issued a hotplug command and waited for it to complete. But
there's no need to wait until we're ready to issue the *next* command. The
next command will probably be much later, so the first one may have already
completed and we may not have to actually wait at all.

Because of hardware errata, some controllers generate command completion
events for some commands but not others. In the case of Intel CF118 (see
spec update reference), the controller indicates command completion only
for Slot Control writes that change the value of the following bits:

Power Controller Control
Power Indicator Control
Attention Indicator Control
Electromechanical Interlock Control

Changes to other bits, e.g., the interrupt enable bits, do not cause the
Command Completed bit to be set. Controllers from AMD and Nvidia are
reported to have similar errata.

These errata cause timeouts when pcie_enable_notification() enables
interrupts. Previously that timeout occurred at boot-time. With this
change, the timeout occurs later, when we change the state of the slot
power, indicators, or interlock. This speeds up boot but causes a timeout
at the first hotplug event on the slot. Subsequent events don't timeout
because only the first (boot-time) hotplug command updates Slot Control
without touching the power/indicator/interlock controls.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>


# 4283c70e 13-Jun-2014 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Make pcie_wait_cmd() self-contained

pcie_wait_cmd() waits for the controller to finish a hotplug command. Move
the associated logic (to determine whether waiting is required and whether
we're using interrupts or polling) from pcie_write_cmd() to
pcie_wait_cmd().

No functional change.

Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>


# 227f0647 18-Apr-2014 Ryan Desfosses <ryan@desfo.org>

PCI: Merge multi-line quoted strings

Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.

No functional change.

[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 3c78bc61 18-Apr-2014 Ryan Desfosses <ryan@desfo.org>

PCI: Whitespace cleanup

Fix various whitespace errors.

No functional change.

[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 476a357f 20-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Acknowledge spurious "cmd completed" event

In case of a spurious "cmd completed", pcie_write_cmd() does not clear it,
but yet expects more "cmd completed" events to be generated. This does not
happen because the previous (spurious) event has not been acknowledged.
Fix that.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 2b3940b6 18-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Remove a non-existent card, regardless of "surprise" capability

In case a card is physically yanked out, it should immediately be removed,
regardless of the "surprise" capability bit. Thus:

- Always handle the physical removal - regardless of the "surprise" bit.
- Don't use "surprise" capability when making decisions about enabling
presence detect notifications.
- Reword the comments to indicate the intent.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 50b52fde 04-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Add hotplug_lock to serialize hotplug events

Today it is there is no protection around pciehp_enable_slot() and
pciehp_disable_slot() to ensure that they complete before another
hot-plug operation can be done on that particular slot.

This patch introduces the slot->hotplug_lock to ensure that any hotplug
operations (add / remove) complete before another hotplug event can begin
processing on that particular slot.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 06a8d89a 04-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Disable link notification across slot reset

Disable the link notification (in addition to presence detect
notifications) across the slot reset since the reset could flap the link,
and we don't want to treat it as hot unplug followed by a hotplug.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# b1811d24 04-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Don't disable the link permanently during removal

We need future link up events for hot-add, thus don't disable the link
permanently during device removal. Also, remove the static functions that
are now left unused.

This reverts part of 2debd9289997 ("PCI: pciehp: Disable/enable link during
slot power off/on"). This was discussed at the URL below, where it was
revealed that it was done for a bug in a PCIe repeater chip on that
particular platform.

Link: https://lkml.kernel.org/r/CAErSpo72KZ-a2OSQLWoK71GCgwBt676XZdGt4tEYm-6UYnLmPw@mail.gmail.com
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 4f854f2a 04-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Enable link state change notifications

Enable the Link state notifications unconditionally. Enable the
presence detection notification only if attention button is absent.
This was discussed at this thread:
https://lkml.kernel.org/r/529E5C0E.80903@gmail.com

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# e48f1b67 04-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Use link change notifications for hot-plug and removal

A lot of systems do not have the fancy buttons and LEDs, and instead
want to rely only on the Link state change events to drive the hotplug
and removal state machinery.
(http://www.spinics.net/lists/hotplug/msg05802.html)

This patch adds support for that functionality. Here are the details
about the patch itself:

* Define and use interrupt events for linkup / linkdown.

* Make the pcie_isr() also look at link events, and direct control to
corresponding (new) link state change handler function.

* Introduce the functions to handle link-up and link-down events and
queue the add / removal work in the slot->wq to be processed by
pciehp_power_thread()

As a side note, this patch also fixes the bug
https://bugzilla.kernel.org/show_bug.cgi?id=65521 "pciehp ignores Data Link
Layer State Changed bit."

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 4703389f 04-Feb-2014 Rajat Jain <rajatxjain@gmail.com>

PCI: pciehp: Make check_link_active() non-static

check_link_active() functionality needs to be used by subsequent patches
(that introduce link state change based hotplug). Thus make the function
non-static, and rename it to pciehp_check_link_active() so as to be
consistent with other non-static functions.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# af9ab791 15-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Move Attention & Power Indicator support tests to accessors

Previously, the caller checked ATTN_LED() or PWR_LED() to see whether the
slot has indicators before setting the indicator state. That clutters the
caller unnecessarily, so this moves the test inside the callees. The test
may not even be necessary; per spec it should be harmless to try to turn on
a non-existent LED. But checking first does avoid unnecessary hotplug
commands.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# e7b4f0d7 14-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Use symbolic constants for Slot Control fields

Add symbolic constants for the PCIe Slot Control indicator and power
control fields defined by spec and use them instead of open-coded hex
constants.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# df72648c 14-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Use symbolic constants, not hard-coded bitmask

Use the PCI_EXP_SLTSTA definitions, not 0x1f, when clearing Slot Status
bits.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 2f2ed41c 14-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Simplify "Power Fault Detected" checking/clearing

It's simpler to test the PCI_EXP_SLTSTA_PFD bit directly and to write the
constant back to PCI_EXP_SLTSTA.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# afe2478f 14-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Announce slot capabilities (slot #, button, LEDs, etc)

We already have the vendor/device IDs from pci_setup_device(), so drop that
info and print things that will be more useful for debugging: the slot
number and presence of button/indicators/link active reporting/etc.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 6dae6202 14-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Make various functions void since they can't fail

These functions:

pcie_enable_notification()
pciehp_power_off_slot()
pciehp_get_power_status()
pciehp_get_attention_status()
pciehp_set_attention_status()
pciehp_get_latch_status()
pciehp_get_adapter_status()
pcie_write_cmd()

now always return success, so this patch makes them void and drops the
error-checking code in their callers.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# 1a84b99c 14-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Remove error checks when accessing PCIe Capability

There's not much point in checking the return value from every config space
access because the only likely errors are design-time things like unaligned
accesses or invalid register numbers. The checking clutters the code
significantly, so this patch removes it.

No functional change.

Reference: http://lkml.kernel.org/r/CA+55aFzP4xEbcNmZ+MS0SQ3LrULzSq+dBiT_X9U-bPpR-Ukgrw@mail.gmail.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# cd84d340 09-May-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Drop pciehp_readw()/pciehp_writew() wrappers

The pciehp_readw() and pciehp_writew() wrappers only look up the pci_dev
and call the PCIe Capability accessors, so we can make things a little
more straightforward by just using the PCIe Capability accessors directly.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# f7625980 14-Nov-2013 Bjorn Helgaas <bhelgaas@google.com>

PCI: Fix whitespace, capitalization, and spelling errors

Fix whitespace, capitalization, and spelling errors. No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.

Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


# 2e35afae 08-Aug-2013 Alex Williamson <alex.williamson@redhat.com>

PCI: pciehp: Add reset_slot() method

PCIe hotplug has a bus per slot, so we can just use a normal
secondary bus reset. However, if a slot supports surprise removal,
a bus reset can be seen as a presence detection change triggering
a hot-remove followed by a hot-add. Disable presence detection from
triggering an interrupt or being polled around the bus reset.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


# d8537548 03-Jul-2013 Kees Cook <keescook@chromium.org>

drivers: avoid format strings in names passed to alloc_workqueue()

For the workqueue creation interfaces that do not expect format strings,
make sure they cannot accidently be parsed that way. Additionally, clean
up calls made with a single parameter that would be handled as a format
string. Many callers are passing potentially dynamic string content, so
use "%s" in those cases to avoid any potential accidents.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c2be6f93 10-Jan-2013 Yijing Wang <wangyijing@huawei.com>

PCI: pciehp: Use per-slot workqueues to avoid deadlock

When we have a hotplug-capable PCIe port with a second hotplug-capable
PCIe port below it, removing the device below the upstream port causes
a deadlock.

The deadlock happens because we use the pciehp_wq workqueue to run
pciehp_power_thread(), which uses pciehp_disable_slot() to remove devices
below the upstream port. When we remove the downstream PCIe port, we call
pciehp_remove(), the pciehp driver's .remove() method. That calls
flush_workqueue(pciehp_wq), which deadlocks because the
pciehp_power_thread() work item is still running.

This patch avoids the deadlock by creating a workqueue for every PCIe port
and removing the single shared workqueue.

Here's the call path that leads to the deadlock:

pciehp_queue_pushbutton_work
queue_work(pciehp_wq) # queue pciehp_power_thread
...

pciehp_power_thread
pciehp_disable_slot
remove_board
pciehp_unconfigure_device
pci_stop_and_remove_bus_device
...
pciehp_remove # pciehp driver .remove method
pciehp_release_ctrl
pcie_cleanup_slot
flush_workqueue(pciehp_wq)

This is fairly urgent because it can be caused by simply unplugging a
Thunderbolt adapter, as reported by Daniel below.

[bhelgaas: changelog]
Reference: http://lkml.kernel.org/r/CAMVG2ssiRgcTD1bej2tkUUfsWmpL5eNtPcNif9va2-Gzb2u8nQ@mail.gmail.com
Reported-and-tested-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org


# 537a77e6 24-Jul-2012 Jiang Liu <jiang.liu@huawei.com>

PCI/pciehp: Use PCI Express Capability accessors

Use PCI Express Capability access functions to simplify pciehp.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>


# e73cfecd 11-Jul-2012 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: remove unused pciehp_get_max_lnk_width(), pciehp_get_cur_lnk_width()

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>


# 2debd928 27-Jan-2012 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Disable/enable link during slot power off/on

On a system with a repeater on the system board to support gen2 hotplug,
we found that when an ExpressModule is removed from some slots,
/var/log/messages will be full of "card present/not present" warnings.

It turns out the root complex is continually trying to train the link to
the repeater because the repeater has not been reset.

This patch will disable the link at removal time to allow the repeater
to be reset properly. This also prevents a potential AER message at
removal time.

Also, when testing hotplug on a system under development, we found if we
boot the system without an EM installed, and later hot-add an EM, it
does not work with Linux, but another OS is ok. The root cause is that
BIOS left link disabled when slot was empty at boot time, and other OS
is modifying the link disable bit in link ctrl during power on/off.

So we should do the same thing to disable/enable link during power off/on.

-v2: check link DLLA bit instead of 100ms waiting.
Separate link disable/enable functions to another patch.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 7f822999 27-Jan-2012 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Add Disable/enable link functions

Will use it during power off/on of slots

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# bffe4f72 27-Jan-2012 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Add pcie_wait_link_not_active()

Will use it for link disable status checking.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 4e2ce405 27-Jan-2012 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: make check_link_active more helpful

A few changes:
- remove the 'inline' and let the complier decide
- return a bool to indicate whether the link was active
- add a debug message to indicate link state when it beocmes active

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 2f5d8e4f 27-Jan-2012 Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: replace unconditional sleep with config space access check

During reviewing
| PCI: pciehp: wait 1000 ms before Link Training check
Linus said:
>...
> That's a *long* time, and it's irritating to the user. It makes the
> user think "the machine is slow".
>...
> And quite frankly, an unconditional one-second delay here seems bad.
>Two seconds was unacceptable, one second is just bad.

Try to access the pci conf of a pci device that is supposed to show up
in 1s. If we can read back a valid vendor/device id, we can return
early.

Related discussion could be found:
https://lkml.org/lkml/2011/12/6/339

-v2: seperate code to pci_bus_read_dev_vendor_id() from pci_scan_device()
and reuse it from pciehp code. Suggested by Matthew Wilcox.
-v3: According to Kenj, don't use array in stack, and don't wait too long
for crs, also return fail status if not found.
Also separate pci_bus_dev_read_vendor_id() change to another patch.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# ab4ca782 07-Nov-2011 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: Handle push button event asynchronously

Use non-ordered workqueue for attention button events.

Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 486b10b9 07-Nov-2011 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: Handle push button event asynchronously

Use non-ordered workqueue for attention button events.

Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# b3c00454 10-Nov-2011 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: wait 100 ms after Link Training check

If the port supports Link speeds greater than 5.0 GT/s, we must wait
for 100 ms after Link training completes before sending configuration
request.

Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 0027cb3e 10-Nov-2011 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: wait 1000 ms before Link Training check

We need to wait for 1000 ms after Data Link Layer Link Active (DLLLA)
bit reads 1b before sending configuration request. Currently pciehp
does this wait after checking Link Training (LT) bit. But we need it
before checking LT bit because LT is still set even after DLLLA bit is
set on some platforms.

Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# fdbd3ce9 07-Nov-2011 Yinghai Lu <yinghai.lu@oracle.com>

PCI: pciehp: Retrieve link speed after link is trained

During hot plug, board_added will call pciehp_power_on_slot().
But link speed is updated in pciehp_power_on_slot().

We should not update link speed there, because that is too early.

So move the link speed update to pciehp_check_link_status() after making
sure the link has been trained.

-v2: fix compile warning that Kenji found.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 0cab0841 10-Jul-2011 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: change wait time for valid configuration access

Naoki Yanagimoto reported that configuration read on some hot-added
PCIe device returns invalid value. This patch fixes this problem.

According to the PCIe spec, software must wait for at least 1 second
to judge if the hot-added device is broken after Data Link Layer State
Changed Event. This patch changes pciehp driver to wait for 1 second
after the Data Link Layer State Changed Event is detected before
initiating a configuration access instead of 100 ms.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Naoki Yanagimoto <yanagimoto@np.css.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# a827ea30 18-Oct-2010 Tejun Heo <tj@kernel.org>

pciehp: update workqueue usage

* Rename pciehp_wq to pciehp_ordered_wq and add non-ordered pciehp_wq
which is used instead of the system workqueue. This is to remove
the use of flush_scheduled_work() which is deprecated and scheduled
for removal.

* With cmwq in place, there's no point in creating workqueues lazily.
Create both pciehp_wq and pciehp_ordered_wq upfront.

* Include workqueue.h from pciehp.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# e1944c6b 16-Mar-2010 Bjorn Helgaas <bjorn.helgaas@hp.com>

PCI: print resources consistently with %pR

No functional change; just print resources in the conventional style.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 3749c51a 13-Dec-2009 Matthew Wilcox <willy@infradead.org>

PCI: Make current and maximum bus speeds part of the PCI core

Move the max_bus_speed and cur_bus_speed into the pci_bus. Expose the
values through the PCI slot driver instead of the hotplug slot driver.
Update all the hotplug drivers to use the pci_bus instead of their own
data structures.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 536c8cb4 13-Dec-2009 Matthew Wilcox <willy@infradead.org>

PCI: Unify pcie_link_speed and pci_bus_speed

These enums must not overlap anyway, since we only have a single
pci_bus_speed_strings array. Use a single enum, and move it to
pci.h. Add 'SPEED' to the pcie names to make it clear what they are.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 5651c48c 12-Nov-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI pciehp: fix power fault interrupt storm problem

Enabling power fault detected event notification in current pciehp
might cause power fault interrupt storm on some machines. On those
machines. On those machines, power fault detected bit in the slot
status register was set again immediately when it is cleared in the
interrupt service routine, and next power fault detected interrupt was
notified again. Therefore, disable power fault detected event
notification for now.

This patch also removes unnecessary handling for power fault cleared
event because this event is not supported by PCIe spec.

Tested-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 1518c17a 10-Nov-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: use pci_pcie_cap()

Use pci_pcie_cap() instead of pci_find_capability() to get PCIe capability
offset in pciehp driver. This avoids unnecessary search in PCI
configuration space. This patch also removes 'cap_base' field in
struct controller, that was used to hold PCIe capability offset by
pciehp itself.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 445f7985 05-Oct-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: return error on read/write failure

Current pciehp returns successfully on read/write failure with dummy
state values. It should return error instead.

With this patch, pciehp no longer uses hotplug_slot_info data
structure. So this also removes hotplug_slot_info related code. But
note that it still allocates hotplug_slot_info because it is required
by pci hotplug core.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 3c3a1b17 05-Oct-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove wrong workaround for bad DLLP

Remove wrong workaround for BAD DLLP error, which confused surprise
down error with DLL errors.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# f22daf1f 05-Oct-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: disable DLL state changed event notification

Current pciehp doesn't handle Data Link Layer State Changed Event
notification. So it needs to be disabled at initialization time,
otherwise other event notifications are not generated.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 07a09694 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove number field

Since slot_cap field in struct controller contains physical slot
number informationq, we don't need number field in struct slot.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 82a9e79e 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove hpc_ops

The struct hpc_ops seems a set of hooks to controller specific
routines. But, it is meaningless because no hotplug controller driver
follows this framework.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 385e2491 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove pci_dev field

Since we have a pointer to pcie_device in struct controller, we don't
need a pointer to pci_dev.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 6aaa6d06 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove crit_sect mutex

The crit_sect mutex defined in struct controller is to serialize
hot-plug operations against multiple slots under the same bus. But,
since PCIe doesnstream port has only one slot at most, it is
meaningless and we don't need it.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# d54798f0 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove first_slot field

The slot number can be calculated only by physical slot number field
in the slot capabilities register. So the first_slot field in struct
controller is meaningless and we don't need it.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# a2359a334 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove slot_device_offset field

Since the device number of the hot-slot under the PCIe downstream port
is always 0, the slot_device_offset field in the slot is meaningless
and we don't need it.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 0e363159 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove hp_slot field

The hp_slot field is to identify the slot under the same
controller. But, since PCIe downstream port has only one slot at most,
it is meaningless and we don't need it.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# d689f7eb 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove device field

The device field in the struct slot is not necessary because it is
always 0 in pciehp driver.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# ab9c6c86 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove bus field

The bus field in struct slot is not necessary.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# e23727da 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove num_slots field

Since PCIe downstream port has only one slot at most, we don't need
num_slots field in struct controller. Note that struct controller
itself doesn't exist if PCIe downstream port has no slot.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 8720d27d 15-Sep-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: remove slot_list field

Since PCIe downstream port has only one slot at most, we don't need
'slot_list' linked list to manage multiple slots under the port.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 825c423a 28-Jul-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI hotplug: add support for 5.0G link speed

Add support for PCI-E 5.0 GT/s in max_bus_speed and cur_bus_speed.

Reviewed-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# bd3d99c1 01-Jun-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: Remove untested Electromechanical Interlock (EMI) support in pciehp.

The EMI support in pciehp is obviously broken. It is implemented using
struct hotplug_slot_attribute, but sysfs_ops for pci_slot_ktype is NOT
for struct hotplug_slot_attribute, but for struct pci_slot_attribute.
This bug had been there for a long time, maybe it was introduced when
PCI slot framework was introduced. The reason why this bug didn't
cause any problem is maybe the EMI support is not tested at all
because of lack of test environment.

As described above, the EMI support in pciehp seems not to be tested
at all. So this patch removes EMI support from pciehp, instead of
fixing the bug.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 99f0169c 02-Feb-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: enable software notification on empty slots

Current pciehp disables software notification of adapter presence
changed event and MRL changed event when slot is turned off. Because
of this, there is no way to detect those events on empty slots in the
current pciehp implementation.

According to the past discussion(*), this behavior was introduced to
prevent endless loop that could happen if pcie_isr() runs after power
fault is detected on a certain platform whose stickey power-fault bit
remains on till the slot is powered on again.

(*) http://sourceforge.net/mailarchive/message.php?msg_id=20051130135409.A14918%40unix-os.sc.intel.com

I think this endless loop can be avoided using one bit flag that
indicates power fault had been detected, instead of disabling software
notification of adapter present changed event and MRL changed event.

With this patch, we can enable software notification mechanism of
presence changed and MRL changed event on the empty slots again.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 81b840cd 02-Feb-2009 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: fix possible endless loop in pcie_isr

Fix possible endless loop in pcie_isr.

Currently, pcie_isr() (interrupt service routine of pciehp) can end up in an
endless loop if the Slot Status register is set again immediately after being
cleared. According to the past discussion (see below URL) this case can happen
if the power fault detected bit is set during handling.

http://sourceforge.net/mailarchive/message.php?msg_id=20051130135409.A14918%40unix-os.sc.intel.com

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# dbc7e1e5 28-Jan-2009 Eric W. Biederman <ebiederm@xmission.com>

PCI: pciehp: Handle interrupts that happen during initialization.

Move the enabling of interrupts after all of the data structures
are setup so that we can safely run the interrupt handler as
soon as it is registered.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>


# 322162a7 18-Dec-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: cleanup register and field definitions

Clean up register definitions related to PCI Express Hot plug.

- Add register definitions into include/linux/pci_regs.h, and use
them instead of pciehp's locally definied register definitions.
- Remove pciehp's locally defined register definitions
- Remove unused register definitions in pciehp.
- Some minor cleanups.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 67f65338 18-Dec-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: ignore undefined bit in link status register

Bit 10 in Link Status register used to be defined as Training Error in
the PCI Express 1.0a specification. But it was removed by Training Error
ECN and is no longer defined. So pciehp must ignore the value read from
it.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 18b341b7 22-Oct-2008 Taku Izumi <izumi.taku@jp.fujitsu.com>

PCI hotplug: pciehp: message refinement

This patch refines messages in pciehp module. The main changes are as
follows:

- remove the trailing "."
- remove __func__ as much as possible
- capitalize the first letter of messages
- show PCI device address including its domain

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# f18e9625 21-Oct-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI hotplug: pciehp: poll data link layer link active

This patch adds polling mechanism for Data Link Layer Link Active bit
after turning power on, instead of waiting for 1000 msec. This reduces
reduce the unnecessary long wait.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# b84346ef 21-Oct-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI hotplug: pciehp: fix possible memory leak in pcie_init

Fix the error paths in pcie_init to avoid leaking memory.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# e1acb24f 20-Oct-2008 Alex Chiang <achiang@hp.com>

PCI: pciehp: remove 'name' parameter

We do not need to manage our own name parameter, especially since
the PCI core can change it on our behalf, in the case of duplicate
slot names.

Remove 'name' from pciehp's version of struct slot, and remove
unused 'task_list' as well.

Cc: kristen.c.accardi@intel.com
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 7f2feec1 04-Sep-2008 Taku Izumi <izumi.taku@jp.fujitsu.com>

PCI: pciehp: replace printk with dev_printk

This patch replaces printks within pciehp module with dev_printks.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# f7a10e32 22-Aug-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: pciehp: fix irq initialization

Current pciehp driver gets irq number from pci_dev->irq. But because
pciehp driver is a pci express port service driver, it should get irq
number from pcie_device->irq.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# a5827f40 27-Aug-2008 Adrian Bunk <bunk@kernel.org>

PCI: fix pciehp_free_irq()

This patch fixes an obvious bug (loop was never entered) caused by
commit 820943b6fc4781621dee52ba026106758a727dd3
(pciehp: cleanup pcie_poll_cmd).

Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 167e782e 21-Aug-2008 Alex Chiang <achiang@hp.com>

PCI: pciehp: Rename duplicate slot name N as N-1, N-2, N-M...

Commit 3800345f723fd130d50434d4717b99d4a9f383c8 (pciehp: fix slot name)
introduces the pciehp_slot_with_bus module parameter, which was intended
to help work around broken firmware that assigns the same name to multiple
slots.

Commit 9e4f2e8d4ddb04ad16a3828cd9a369a5a5287009 (pciehp: add message about
pciehp_slot_with_bus option) tells the user to use the above parameter
in the event of a name collision.

This approach is sub-optimal because it requires too much work from
the user.

Instead, let's rename the slot on behalf of the user. If firmware
assigns the name N to multiple slots, then:

The first registered slot is assigned N
The second registered slot is assigned N-1
The third registered slot is assigned N-2
The Mth registered slot becomes N-M

In the event we overflow the slot->name parameter, we report an
error to the user.

This is a temporary fix until the entire PCI core can be reworked
such that individual drivers no longer have to manage their own
slot names.

Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 56adc59d 24-Jul-2008 Jesse Barnes <jbarnes@hobbes.lan>

PCI hotplug: fix typo in pcie hotplug output

Comamnd->Command

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 3aa50c44 26-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: remove needless command completed interrupt setting

Currently, pciehp driver enables command completed interrupt as follows.

(1) Don't enable at initialization.
(2) Enable command completed interrupt whenever pciehp issues a
command, if the command doesn't attempt to disable the interrupt.
(3) Disable command completed interrupt at driver unloading.

Once we enable command completed interrupt, we don't need to re-enable
it for every command. So we can simplify above steps as follows:

(1) Enable command completed interrupt at initialization.
(2) No special sequence for command completed interrupt.
(3) Disable command completed interrupt at driver unloading.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# c4635eb0 19-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: fix interrupt initialization

Current pciehp driver's intialization sequence is as follows:

(1) initialize controller data structure
(2) install interrupt handler
(3) enable software notification
(4) initialize controller specific slot data structure
(5) initialize generic slot data structure and register it to pci hotplug core

The interrupt handler of pciehp assumes that controller specific slot
data structure is already initialized. However, it is installed at (2)
before initializing controller specific slot data structure at
(4). Because of this, pciehp driver cannot handle the following cases
properly.

- If devices that shares IRQ with pciehp raise interrupts between (2) and (4).
- If hotplug events (e.g. MRL open) happen between (3) and (4).

We already have a workaround for this problem ("pciehp: fix NULL
dereference in interrupt handler: dbd79aed1aea2bece0bf43cc2ff3b2f9baf48a08).
But we still need fundamental fix.

This patch fix the problem by changing the initilization sequence as follows:

(1) initialize controller data structure
(2) initialize controller specific slot data structure
(3) install interrupt handler
(4) enable software notification
(5) initialize generic slot data structure and register it to pci hotplug core

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 563f1190 19-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: remove inline from command related functions

The pcie_poll_cmd() and pcie_wait_cmd() are too large to be
inlined.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 66618bad 19-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: change command polling frequency

Change command polling frequency to 100Hz from 10Hz in order to reduce
the delay in the common case of a command completing quickly.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 820943b6 19-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: cleanup pcie_poll_cmd

Cleanup pcie_poll_cmd(): check the slot status once before entering our
completion test loop and convert the loop to a simpler while() block.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# b30dd56d 19-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: fix typo in hpc_release_ctlr

Fix the typo in hpc_release_ctlr().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 6a3f0849 02-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: removes redundant NULL write to slot status register

Cleanup to remove a redundant NULL write to SLOTSTATUS.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# d8b23e8f 02-Jun-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: fixes typo in dbg_ctrl() in pciehp_hpc.c

Fixup a typo in dbg_ctrl(); it was fetching SLOTSTATUS twice.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# ac9c052d 28-May-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

shpchp: check firmware before taking control

Fix the following problems of shpchp driver about getting hotplug
control from firmware.

- The shpchp driver must not control the hotplug controller if it
fails to get control from the firmware. But current shpchp
controls the hotplug controller regardless the result, because it
doesn't check the return value of get_hp_hw_control_from_firmware().

- Current shpchp driver doesn't support _OSC.

The pciehp driver already have the code for evaluating _OSC and OSHP
and shpchp and pciehp can share it. So this patch move that code from
pciehp to acpi_pcihp.c.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# d737bdc1 27-May-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: block signals while waiting for command completion

Since we need to wait for command completion for muximum 1sec, waiting
command should not be interrupted by a signal.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 125c39f7 27-May-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: evaluate _OSC/OSHP before controller init

Current pciehp evaluates _OSC/OSHP method after some controller
initialization is done. So if evaluating _OSC/OSHP is failed, we need
to cleanup already initialized data structures or hardware. This
clearly is not robust way. With this patch, _OSC/OSHP evaluation is
done first.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 0711c70e 27-May-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: move msleep after power off

According to the PCI Express specification, we must wait for at least
1 second after turning power off before taking any action that relies
on power having been removed from the slot/adapter. For this, current
pciehp wait for 1 second after issuing the power off command in
hpc_power_off_slot() function. But waiting for 1 second in
hpc_power_off_slot() can make pciehp probing slow-down because pciehp
probe code calls hpc_power_off_slot() if the slot is not occupied just
in case. We don't need to wait for 1 second at the pciehp probe time
because there is no action on that empty slot. So move 1 second wait
from hpc_power_off_slot() to the caller of hpc_power_off_slot().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 6592e02a 27-May-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: poll cmd completion if hotplug interrupt is disabled

Fix improper long wait for command completion in pciehp probing.

As described in PCI Express specification, software notification is
not generated if the command that occurs as a result of a write to the
Slot Control register that disables software notification of command
completed events. Since pciehp driver doesn't take it into account,
such command is issued in pciehp probing, and it causes improper long
wait for command completion.

This patch changes the pciehp driver to take such command into
account.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 5808639b 27-May-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: fix slow probing

Fix the "pciehp probing slow" problem reported from Jan C. Nordholz in
http://bugzilla.kernel.org/show_bug.cgi?id=10751.

The command completed bit in Slot Status register applies only to
commands issued to control the attention indicator, power indicator,
power controller, or electromechanical interlock. However, writes to
other parts of the Slot Control register would end up writing to the
control fields. Hence, any write to Slot Control register is
considered as a command. However, if the controller doesn't support
any of attention indicator, power indicator, power controller and
electromechanical interlock, command completed bit would not set in
writing to Slot Control register. In this case, we should not wait for
command completed bit set, otherwise all commands would be considered
not completed in timeout seconds (1 sec.).

The cause of the problem is pciehp driver didn't take this situation
into account. This patch changes pciehp to take it into account. This
patch also add the check for "No Command Completed Support" bit in
Slot Capability register. If it is set, we should not wait for command
completed bit set as well.

This problem seems to be revealed by the commit
c27fb883dffe11aa4cb35ecea1fa1832ba45d4da that fixed the bug that
pciehp did not wait for command completed properly (pciehp just
ignored the command completion event).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# dbd79aed 27-May-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: fix NULL dereference in interrupt handler

Fix the following NULL dereference problem reported from Pierre Ossman
and Ingo Molnar.

pciehp: HPC vendor_id 8086 device_id 27d0 ss_vid 0 ss_did 0
pciehp: pciehp_find_slot: slot (device=0x0) not found
BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
IP: [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
PGD 0
Oops: 0000 [1]
CPU 0
Modules linked in:
Pid: 1, comm: swapper Tainted: G W 2.6.26-rc3-sched-devel.git-00001-g2b99b26-dirty #170
RIP: 0010:[<ffffffff80494a8b>] [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
RSP: 0000:ffff81003f83fbb0 EFLAGS: 00010046
RAX: 0000000000000039 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000046
RBP: ffff81003f83fbd0 R08: 0000000000000001 R09: ffffffff80245103
R10: 0000000000000020 R11: 0000000000000000 R12: ffff81003ea53a30
R13: 0000000000000000 R14: 0000000000000011 R15: ffffffff80495926
FS: 0000000000000000(0000) GS:ffffffff80be7400(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000070 CR3: 0000000000201000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff81003f83e000, task ffff81003f840000)
Stack: 0000000000000008 ffff81003f83fbf6 ffff81003ea53a30 0000000000000008
ffff81003f83fc10 ffffffff80495ab4 0000000000000011 0000000000000002
0000000000000202 0000000000000202 00000000fffffff4 ffff81003ea53a30
Call Trace:
[<ffffffff80495ab4>] pcie_isr+0x18e/0x1bc
[<ffffffff80260831>] request_irq+0x106/0x12f
[<ffffffff80495fb6>] pcie_init+0x15e/0x6cc
[<ffffffff804933a3>] pciehp_probe+0x64/0x541
[<ffffffff8048f4e7>] pcie_port_probe_service+0x4c/0x76
[<ffffffff8054af70>] driver_probe_device+0xd4/0x1f0
[<ffffffff8054b108>] __driver_attach+0x7c/0x7e
[<ffffffff8054b08c>] ? __driver_attach+0x0/0x7e
[<ffffffff8054a4b6>] bus_for_each_dev+0x53/0x7d
[<ffffffff8054ad3c>] driver_attach+0x1c/0x1e
[<ffffffff8054a9c2>] bus_add_driver+0xdd/0x25b
[<ffffffff80c09d3d>] ? pcied_init+0x0/0x8b
[<ffffffff8054b288>] driver_register+0x5f/0x13e
[<ffffffff80c09d3d>] ? pcied_init+0x0/0x8b
[<ffffffff8048f441>] pcie_port_service_register+0x47/0x49
[<ffffffff80c09d52>] pcied_init+0x15/0x8b
[<ffffffff80bf3938>] kernel_init+0x75/0x243
[<ffffffff808639d2>] ? _spin_unlock_irq+0x2b/0x3a
[<ffffffff80228d1f>] ? finish_task_switch+0x57/0x9a
[<ffffffff8020c258>] child_rip+0xa/0x12
[<ffffffff8020bcec>] ? restore_args+0x0/0x30
[<ffffffff80bf38c3>] ? kernel_init+0x0/0x243
[<ffffffff8020c24e>] ? child_rip+0x0/0x12

Code: 83 80 00 00 00 48 39 f0 75 e1 0f b6 c9 48 c7 c2 00 0e 8d 80 48 c7 c6 8a 60 a6 80 48 c7 c7 10 db a8 80 31 c0 e8 3f 8d d9 ff 31 db <48> 8b 43 70 48 8d 75 ef 48 89 df ff 50 30 80 7d ef 00 74 37 48
RIP [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
RSP <ffff81003f83fbb0>
CR2: 0000000000000070
Kernel panic - not syncing: Fatal exception

The situation under which it occurs is hw and timing related: it appears
to happen on a system that has PCI hotplug hardware but with no active
hotplug cards, and another interrupt in the same (shared) IRQ line
arrives too early, before the hotplug-slot entry has been set up - as
triggered by CONFIG_DEBUG_SHIRQ=y:

This patch contains the following two fixes.

(1) Clear all events bits in Slot Status register to prevent the pciehp
driver from detecting the spurious events that would have been occur
before pciehp loading.

(2) Add check whether slot initialization had been already done.

This is short term fix. We need more structural fixes to install
interrupt handler after slot initialization is done.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# a53edac1 29-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: fix error message about getting hotplug control

People are confused by the following error message that actually is
not for indicating a error.

Cannot get control of hotplug hardware for pci %s

This patch changes this message to debug message.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>


# b7aa1f16 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Fix command write

Current implementation of pciehp_write_cmd() always enables command
completed interrupt. But pciehp_write_cmd() is also used for clearing
command completed interrupt enable bit. In this case, we must not set
the command completed interrupt enable bit. To fix this bug, this
patch add the check to see if caller wants to change command complete
interrupt enable bit.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 4ea3e58b 25-Apr-2008 Adrian Bunk <bunk@kernel.org>

make pciehp_acpi_get_hp_hw_control_from_firmware()

this_patch_makes_the_needlessly_global_pciehp_acpi_get_hp_hw_control_from_firmware_static

;)

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 2aeeef11 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Clean up pcie_init()

Clean up pciehp_ini(). This patch is trying to

- Remove redundant capablity checks that were already done in PCIe
port bus driver.
- Separate the code only for debugging and make debug information
easier to read.
- Make the entire code easier to read and understand what it is doing.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# d84be093 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Mask hotplug interrupt at controller release

We must disable hotplug interrupt at controller relase time, otherwise
spurious interrupts might happen if any slot events occured (e.g. MRL
change) after unloading pciehp driver.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# cff00654 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Remove useless hotplug interrupt enabling

Hotplug interrupt is enabled at initialization and nobody clears it.
So we need to setup it in each command. This patch removes redundant
codes about this.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# ae416e6b 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Fix wrong slot capability check

Current pciehp saves only 8bits of Slot Capability registers in
ctrl->ctrlcap. But it refers more than 8bit for checking EMI capability.
It is clearly a bug and EMI would never work. To fix this problem,
this patch saves full Slot Capability contens in ctrl->slot_cap. It also
reduce the redundant reads of Slot Capability register. And this pach
also cleans up the macros to check the slot capabilitys (e.g. MRL_SENS(),
and so on).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# c27fb883 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Fix wrong slot control register access

Current pciehp implementaion clears hotplug events without waiting for
command completion. Because of this, events might not be cleared properly.
To prevent this problem, we must use pciehp_write_cmd() to write to
Slot Control register.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 2d32a9ae 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Add missing memory barrier

Fix the possible race condition between pcie_isr() and pciehp_write_cmd()
because of the lack of memory barrier.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# c6b069e9 25-Apr-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Fix interrupt event handlig

Current pciehp implementation disables and re-enables hotplug interrupts
in its interrupt handler. This operation might be intend to guarantee
that interrupts for the events newly occured during previous events are
being handled will be successfully generated. But current implementaion
has the following prolems.

- Current interrupt service routin clears status changes without
waiting command completion. Because of this, events might not be
cleared properly.
- Current interrupt service routine clears status changes caused by
disabling or enabling hotplug interrupts itself. This will lose new
events that occurs during previous interrupts are being handled.
- Current implementation doesn't have any serialization mechanism
between the code to wait for command completion and the interrupt
handler that clears the command completion events caused by itself.
There is clearly race conditions between them, and it may cause
the problem that waiting for command completion doesn't work for
example.

To fix those problems, this patch stops disabling/re-enabling hotplug
interrupts in interrupt service routine. Instead of this, this patch
re-inspects Slot Status register after clearing what is presumed to
be the last bending interrupt in order to guarantee that all interrupt
events are serviced.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>


# 66bef8c0 03-Mar-2008 Harvey Harrison <harvey.harrison@gmail.com>

PCI: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# c1ef5cbd 04-Mar-2008 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pci: hotplug: pciehp: fix error code path in hpc_power_off_slot

Fix the error code path in hpc_power_off_slot().

The Bad DLLP Mask bit must be restored before return.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# f1050a35 20-Dec-2007 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: workaround against Bad DLLP during power off

Set Bad DLLP Mask bit in Correctable Error Mask Register during
turning power off the slot.

This is the workaround against Bad DLLP error that sometimes happen
during turning power off on the slot which conforms to PCI Express
1.0a spec. The cause of this error seems that PCI Express 1.0a spec
doesn't have the following consideration that was added to PCI Express
1.1 spec.

"If the port is associated with a hot-pluggable slot (Hot-Plug
Capable bit in the Slot Capabilities register set to 1b), and
Power Controller Control bit in Slot Control register is 1b(Off),
then any transition to DL Inactive must not be considered an
error."

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 8bb7c7af 20-Dec-2007 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: wait for 1000ms before LED operation after power off

After turning power off, we must wait for at least 1 second *before*
LED operation.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# ecdde939 21-Nov-2007 Mark Lord <lkml@rtr.ca>

PCIe: fix double initialization bug

Earlier patches to split out the hardware init for PCIe hotplug resulted in
some one-time initializations being redone on every resume cycle. Eg.
irq/polling initialization.

This patch splits the hardware init into two parts, and separates the
one-time initializations from those so that they only ever get done once,
as intended.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 08e7a7d2 28-Nov-2007 Mark Lord <lkml@rtr.ca>

PCI: more fixes for PCIe Hotplug so that it works with ExpressCard slots on Dell notebooks (and others?) in conjunction with modparam of pciehp_force=1

Split out the hotplug hardware initialization code from pcie_init()
into pcie_init_enable_events(), without changing any functionality.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 40730d10 09-Aug-2007 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: minor cleanups for pciehp_hpc.c

Minor cleanups for pciehp_hpc.c. The 80 column rules, removing
unnecessary lines, and so on.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 71ad556d 09-Aug-2007 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: remove trailing whitespace from pciehp_hpc.c

Remove trailing whitespaces from pciehp_hpc.c.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# c8426483 09-Aug-2007 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: remove DBG_XXX_ROUTINE

This patch removes DBG_ENTER_ROUTIN, DBG_LEAVE_ROUTINE and related
code, which seem no longer needed.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 57d90c02 09-Aug-2007 Kristen Carlson Accardi <kristen.c.accardi@intel.com>

PCI Hotplug: pciehp: Request control over PCI Express Capability as well as Native hotplug

According to the PCI firmware spec (3.0), the OS must claim control
over the PCI Express Capability bits in addition to the PCI Express
Native Hot Plug feature when executing _OSC.

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# f4778364 31-May-2007 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

PCI: hotplug: pciehp: Fix possible race condition in writing slot

The slot control register is modified as follows:

(1) Read the register value
(2) Change the value
(3) Write the value to the register

Those must be done atomically, otherwise writing to control register
would cause an unexpected result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 5d386e1a 06-Mar-2007 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: Event handling rework

The event handler of PCIEHP driver is unnecessarily very complex. In
addition, current event handler can only a fixed number of events at
the same time, and some of events would be lost if several number of
events happened at the same time.

This patch simplify the event handler using 'work queue', and it also
fix the above-mentioned issue.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 34d03419 09-Jan-2007 Kristen Carlson Accardi <kristen.c.accardi@intel.com>

PCIEHP: Add Electro Mechanical Interlock (EMI) support to the PCIE hotplug driver.

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 262303fe 21-Dec-2006 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: fix wait command completion

This patch fixes this problem that pciehp driver will sleep
unnecessarily long when waiting for command completion. With this
patch, modprobe pciehp driver becomes very faster as follows for
instance.

o Without this patch
# time /sbin/modprobe pciehp

real 0m4.976s
user 0m0.000s
sys 0m0.004s

o With this patch
# time /sbin/modprobe pciehp

real 0m0.640s
user 0m0.000s
sys 0m0.004s

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 44ef4cef 21-Dec-2006 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: cleanup wait command completion

This patch cleans up the code to wait for command completion.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 75e13178 21-Dec-2006 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: remove unused pcie_cap_base

This patch removes unused pcie_cap_base variable.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# a0f018da 21-Dec-2006 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: cleanup register access

This patch cleans up register access functions. This has no functional
change.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 48fe3915 21-Dec-2006 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: remove unnecessary php_ctlr

The struct php_ctlr seems to be only for complicating codes. This
patch removes struct php_ctlr and related codes.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 5cbded58 13-Dec-2006 Robert P. J. Day <rpjday@mindspring.com>

[PATCH] getting rid of all casts of k[cmz]alloc() calls

Run this:

#!/bin/sh
for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
echo "De-casting $f..."
perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
done

And then go through and reinstate those cases where code is casting pointers
to non-pointers.

And then drop a few hunks which conflicted with outstanding work.

Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 9d167dc3 13-Nov-2006 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp: remove unnecessary pci_disable_msi

This patch fixes the problem that "irq XX: nobody cared" kernel oops
is reported when pciehp is once rmmoded and insmoded again. The cause
of this problem is pciehp driver calls pci_disable_msi() at controller
release time, even though it must be done by PCI Express Port Bus
driver. This patch removes unnecessary pci_disable_msi() call from
pciehp driver.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# dd5619cb 22-Sep-2006 Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

pciehp - add missing locking

This patch fixes the problem that system will panic if multiple power
on/off operations are issued to the same slot in parallel. This
problem can be easily reproduced by commands below.

# while true; do echo 1 > power; echo 0 > power; done &
# while true; do echo 1 > power; echo 0 > power; done &

The cause is lack of locking for enable/disable operations. This patch
fixes this problem.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 7d12e780 05-Oct-2006 David Howells <dhowells@redhat.com>

IRQ: Maintain regs pointer globally rather than passing to IRQ handlers

Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.

The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).

Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.

Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.

I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.

This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:

struct pt_regs *old_regs = set_irq_regs(regs);

And put the old one back at the end:

set_irq_regs(old_regs);

Don't pass regs through to generic_handle_irq() or __do_IRQ().

In timer_interrupt(), this sort of change will be necessary:

- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);

I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().

Some notes on the interrupt handling in the drivers:

(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.

(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.

(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.

Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)


# e50d1088 08-Aug-2006 Kristen Carlson Accardi <kristen.c.accardi@intel.com>

pciehp: make pciehp build for powerpc

Make pciehp build on powerpc

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 6b4486e2 01-Jul-2006 Thomas Gleixner <tglx@linutronix.de>

[PATCH] irq-flags: pci: Use the new IRQF_ constants

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 1396a8c3 12-Jun-2006 Greg Kroah-Hartman <gregkh@suse.de>

[PATCH] 64bit resource: fix up printks for resources in pci core and hotplug drivers

This is needed if we wish to change the size of the resource structures.

Based on an original patch from Vivek Goyal <vgoyal@in.ibm.com>

Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 9c64f977 09-May-2006 Jan Beulich <jbeulich@novell.com>

[PATCH] PCI Hotplug: Fix recovery path from errors during pcie_init()

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Kristen Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 2433ee26 24-Apr-2006 Kristen Accardi <kristen.c.accardi@intel.com>

[PATCH] pciehp: dont call pci_enable_dev

Don't call pci_enable_device from pciehp because the pcie port service driver
already does this.

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 81b26bca 18-Apr-2006 Kristen Accardi <kristen.c.accardi@intel.com>

[PATCH] PCI Hotplug: don't use acpi_os_free

acpi_os_free should not be used by drivers outside
of acpi/*/*.c. Replace with kfree().

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# b2e6e3ba 16-Mar-2006 MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>

[PATCH] acpiphp: fix acpi_path_name

I encountered the problem that the insmod of the acpiphp
fails because of the mis-freeing of the memory.

I tested this patch on my tiger4 box.

Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 783c49fc 03-Mar-2006 Kristen Accardi <kristen.c.accardi@intel.com>

[PATCH] PCI Hotplug: add common acpi functions to core

shpchprm_acpi.c and pciehprm_acpi.c are nearly identical. In addition,
there are functions in both these files that are also in acpiphp_glue.c.
This patch will remove duplicate functions from shpchp, pciehp, and
acpiphp and move this functionality to pci_hotplug, as it is not
hardware specific. Get rid of shpchprm* and pciehprm* files since they
are no longer needed. shpchprm_nonacpi.c and pciehprm_nonacpi.c are
identical, as well as shpchprm_legacy.c and can be replaced with a
macro.

This patch also changes acpiphp to use the common hpp code.

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 6aa4cdd0 13-Jan-2006 Ingo Molnar <mingo@elte.hu>

[PATCH] PCI hotplug: convert semaphores to mutex

semaphore to mutex conversion.

the conversion was generated via scripts, and the result was validated
automatically via a script as well.

build tested with allyesconfig.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# c7ab337f 08-Dec-2005 Thomas Schaefer <thomas.schaefer@kontron.com>

[PATCH] pciehp: handle sticky power-fault status

This patch disables power fault, MRL sensor and presence detection
interrupts when a PCIe slot is powered-off and enables those
interrupts when it is powered-on again. This is necessary to prevent
the associated events from causing an endless cycle of interrupts
due to the power-fault bit, which stays set till power is restored
to the slot.

Signed-off-by: Thomas Schaefer <thomas.schaefer@kontron.com>
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# de25968c 08-Jan-2006 Tim Schmielau <tim@physik3.uni-rostock.de>

[PATCH] fix more missing includes

Include fixes for 2.6.14-git11. Should allow to remove sched.h from
module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390. Probably more
to come since I haven't yet checked the other archs.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 5a49f203 23-Nov-2005 Rajesh Shah <rajesh.shah@intel.com>

[PATCH] PCI Express Hotplug: clear sticky power-fault bit

Per the PCI Express spec, the power-fault-detected bit in the
slot status register can be set anytime hardware detects a power
fault, regardless of whether the slot has a device populated in
it or not. This bit is sticky and must be explicitly cleared.
This patch is needed to allow hot-add after such a power fault
has been detected.

Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 5d1b8c9e 13-Nov-2005 Andrew Morton <akpm@osdl.org>

[PATCH] pciehp_hpc build fix

drivers/pci/hotplug/pciehp_hpc.c:221: parse error before "pcie_isr"
drivers/pci/hotplug/pciehp_hpc.c:221: warning: type defaults to `int' in declaration of `pcie_isr'
drivers/pci/hotplug/pciehp_hpc.c:221: warning: data definition has no type or storage class
drivers/pci/hotplug/pciehp_hpc.c: In function `hpc_release_ctlr':
drivers/pci/hotplug/pciehp_hpc.c:715: implicit declaration of function `free_irq'
drivers/pci/hotplug/pciehp_hpc.c: At top level:
drivers/pci/hotplug/pciehp_hpc.c:839: parse error before "pcie_isr"
drivers/pci/hotplug/pciehp_hpc.c:840: warning: return type defaults to `int'
drivers/pci/hotplug/pciehp_hpc.c: In function `pcie_isr':
drivers/pci/hotplug/pciehp_hpc.c:850: `IRQ_NONE' undeclared (first use in this function)
drivers/pci/hotplug/pciehp_hpc.c:850: (Each undeclared identifier is reported only once
drivers/pci/hotplug/pciehp_hpc.c:850: for each function it appears in.)
drivers/pci/hotplug/pciehp_hpc.c:979: `IRQ_HANDLED' undeclared (first use in this function)
drivers/pci/hotplug/pciehp_hpc.c: In function `pcie_init':
drivers/pci/hotplug/pciehp_hpc.c:1362: implicit declaration of function `request_irq'

Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 6560aa5c 07-Nov-2005 Rajesh Shah <rajesh.shah@intel.com>

[PATCH] PCI: fix namespace clashes

Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 8239def1 31-Oct-2005 Rajesh Shah <rajesh.shah@intel.com>

[PATCH] pciehp: fix handling of power faults during hotplug

The current pciehp implementation reports a power-fail error
even if the condition has cleared by the time the corresponding
interrupt handling code gets a chance to run. This patch
fixes this problem.

Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# a3a45ec8 31-Oct-2005 Rajesh Shah <rajesh.shah@intel.com>

[PATCH] pciehp: clean-up how we request control of hotplug hardware

This patch further tweaks how we request control of hotplug
controller hardware from BIOS. We first search the ACPI namespace
corresponding to a specific hotplug controller looking for an
_OSC or OSHP method. On failure, we successively move to the
ACPI parent object, till we hit the highest level host bridge
in the hierarchy. This allows for different types of BIOS's
which place the _OSC/OSHP methods at various places in the acpi
namespace, while still not encroaching on the namespace of
some other root level host bridge.

This patch also introduces a new load time option (pciehp_force)
that allows us to bypass all _OSC/OSHP checking. Not supporting
these methods seems to be be the most common ACPI firmware problem
we've run into. This will still _not_ allow the pciehp driver to
work correctly if the BIOS really doesn't support pciehp (i.e. if
it doesn't generate a hotplug interrupt). Use this option with
caution. Some BIOS's may deliberately not build any _OSC/OSHP
methods to make sure it retains control the hotplug hardware.
Using the pciehp_force parameter for such systems can lead to
two separate entities trying to control the same hardware.

Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 1a9ed1bf 31-Oct-2005 Rajesh Shah <rajesh.shah@intel.com>

[PATCH] pciehp: reduce debug message verbosity

Reduce the number of debug messages generated if pciehp debug is
enabled. I tried to restrict this to removing debug messages that
are either early-driver-debug type messages, or print information
that can be inferred through other debug prints.

Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# ed6cbcf2 31-Oct-2005 Rajesh Shah <rajesh.shah@intel.com>

[PATCH] pciehp: miscellaneous cleanups

Remove un-necessary header includes, remove dead code, remove
some hardcoded constants...

Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# a8a2be94 31-Oct-2005 Rajesh Shah <rajesh.shah@intel.com>

[PATCH] pciehp: reduce dependence on ACPI

Reduce the PCI Express hotplug driver's dependence on ACPI.
We don't walk the acpi namespace anymore to build a list of
bridges and devices. We go to ACPI only to run the _OSC or
_OSHP methods to transition control of hotplug hardware from
system BIOS to the hotplug driver, and to run the _HPP
method to get hotplug device parameters like cache line size,
latency timer and SERR/PERR enable from BIOS.

Note that one of the side effects of this patch is that pciehp
does not automatically enable the hot-added device or its DMA
bus mastering capability now. It expects the device driver to
do that. This may break some drivers and we will have to fix
them as they are reported.

Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 8cf4c195 16-Aug-2005 Kristen Accardi <kristen.c.accardi@intel.com>

[PATCH] PCI Hotplug: new contact info

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 8b245e45 06-May-2005 Dely Sy <dlsy@snoqualmie.dp.intel.com>

[PATCH] PCI Hotplug: get pciehp to work on the downstream port of a switch

Here is the updated patch to get pciehp driver to work for downstream
port of a switch and handle the difference in the offset value of PCI
Express capability list item of different ports.

Signed-off-by: Dely Sy <dely.l.sy@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!