#
b5ff74c1 |
|
16-Feb-2024 |
Michael Kelley <mhklinux@outlook.com> |
PCI: hv: Fix ring buffer size calculation For a physical PCI device that is passed through to a Hyper-V guest VM, current code specifies the VMBus ring buffer size as 4 pages. But this is an inappropriate dependency, since the amount of ring buffer space needed is unrelated to PAGE_SIZE. For example, on x86 the ring buffer size ends up as 16 Kbytes, while on ARM64 with 64 Kbyte pages, the ring size bloats to 256 Kbytes. The ring buffer for PCI pass-thru devices is used for only a few messages during device setup and removal, so any space above a few Kbytes is wasted. Fix this by declaring the ring buffer size to be a fixed 16 Kbytes. Furthermore, use the VMBUS_RING_SIZE() macro so that the ring buffer header is properly accounted for, and so the size is rounded up to a page boundary, using the page size for which the kernel is built. While w/64 Kbyte pages this results in a 64 Kbyte ring buffer header plus a 64 Kbyte ring buffer, that's the smallest possible with that page size. It's still 128 Kbytes better than the current code. Link: https://lore.kernel.org/linux-pci/20240216202240.251818-1-mhklinux@outlook.com Signed-off-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Jarvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Long Li <longli@microsoft.com> Cc: <stable@vger.kernel.org> # 5.15.x
|
#
07e8f885 |
|
01-Nov-2023 |
Andrew Cooper <andrew.cooper3@citrix.com> |
x86/apic: Drop apic::delivery_mode This field is set to APIC_DELIVERY_MODE_FIXED in all cases, and is read exactly once. Fold the constant in uv_program_mmr() and drop the field. Searching for the origin of the stale HyperV comment reveals commit a31e58e129f7 ("x86/apic: Switch all APICs to Fixed delivery mode") which notes: As a consequence of this change, the apic::irq_delivery_mode field is now pointless, but this needs to be cleaned up in a separate patch. 6 years is long enough for this technical debt to have survived. [ bp: Fold in https://lore.kernel.org/r/20231121123034.1442059-1-andrew.cooper3@citrix.com ] Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20231102-x86-apic-v1-1-bf049a2a0ed6@citrix.com
|
#
f741bcad |
|
22-Sep-2023 |
Kees Cook <keescook@chromium.org> |
PCI: hv: Annotate struct hv_dr_state with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct hv_dr_state. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Link: https://lore.kernel.org/linux-pci/20230922175257.work.900-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Acked-by: Wei Liu <wei.liu@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Krzysztof Wilczyński <kw@linux.com> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: linux-hyperv@vger.kernel.org Cc: linux-pci@vger.kernel.org
|
#
04bbe863 |
|
16-Aug-2023 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Fix a crash in hv_pci_restore_msi_msg() during hibernation When a Linux VM with an assigned PCI device runs on Hyper-V, if the PCI device driver is not loaded yet (i.e. MSI-X/MSI is not enabled on the device yet), doing a VM hibernation triggers a panic in hv_pci_restore_msi_msg() -> msi_lock_descs(&pdev->dev), because pdev->dev.msi.data is still NULL. Avoid the panic by checking if MSI-X/MSI is enabled. Link: https://lore.kernel.org/r/20230816175939.21566-1-decui@microsoft.com Fixes: dc2b453290c4 ("PCI: hv: Rework MSI handling") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: sathyanarayanan.kuppuswamy@linux.intel.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org
|
#
067d6ec7 |
|
14-Jun-2023 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Add a per-bus mutex state_lock In the case of fast device addition/removal, it's possible that hv_eject_device_work() can start to run before create_root_hv_pci_bus() starts to run; as a result, the pci_get_domain_bus_and_slot() in hv_eject_device_work() can return a 'pdev' of NULL, and hv_eject_device_work() can remove the 'hpdev', and immediately send a message PCI_EJECTION_COMPLETE to the host, and the host immediately unassigns the PCI device from the guest; meanwhile, create_root_hv_pci_bus() and the PCI device driver can be probing the dead PCI device and reporting timeout errors. Fix the issue by adding a per-bus mutex 'state_lock' and grabbing the mutex before powering on the PCI bus in hv_pci_enter_d0(): when hv_eject_device_work() starts to run, it's able to find the 'pdev' and call pci_stop_and_remove_bus_device(pdev): if the PCI device driver has loaded, the PCI device driver's probe() function is already called in create_root_hv_pci_bus() -> pci_bus_add_devices(), and now hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able to call the PCI device driver's remove() function and remove the device reliably; if the PCI device driver hasn't loaded yet, the function call hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able to remove the PCI device reliably and the PCI device driver's probe() function won't be called; if the PCI device driver's probe() is already running (e.g., systemd-udev is loading the PCI device driver), it must be holding the per-device lock, and after the probe() finishes and releases the lock, hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able to proceed to remove the device reliably. Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-6-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
a847234e |
|
14-Jun-2023 |
Dexuan Cui <decui@microsoft.com> |
Revert "PCI: hv: Fix a timing issue which causes kdump to fail occasionally" This reverts commit d6af2ed29c7c1c311b96dac989dcb991e90ee195. The statement "the hv_pci_bus_exit() call releases structures of all its child devices" in commit d6af2ed29c7c is not true: in the path hv_pci_probe() -> hv_pci_enter_d0() -> hv_pci_bus_exit(hdev, true): the parameter "keep_devs" is true, so hv_pci_bus_exit() does *not* release the child "struct hv_pci_dev *hpdev" that is created earlier in pci_devices_present_work() -> new_pcichild_device(). The commit d6af2ed29c7c was originally made in July 2020 for RHEL 7.7, where the old version of hv_pci_bus_exit() was used; when the commit was rebased and merged into the upstream, people didn't notice that it's not really necessary. The commit itself doesn't cause any issue, but it makes hv_pci_probe() more complicated. Revert it to facilitate some upcoming changes to hv_pci_probe(). Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Wei Hu <weh@microsoft.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-5-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
add9195e |
|
14-Jun-2023 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Remove the useless hv_pcichild_state from struct hv_pci_dev The hpdev->state is never really useful. The only use in hv_pci_eject_device() and hv_eject_device_work() is not really necessary. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-4-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
2738d5ab |
|
14-Jun-2023 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic When the host tries to remove a PCI device, the host first sends a PCI_EJECT message to the guest, and the guest is supposed to gracefully remove the PCI device and send a PCI_EJECTION_COMPLETE message to the host; the host then sends a VMBus message CHANNELMSG_RESCIND_CHANNELOFFER to the guest (when the guest receives this message, the device is already unassigned from the guest) and the guest can do some final cleanup work; if the guest fails to respond to the PCI_EJECT message within one minute, the host sends the VMBus message CHANNELMSG_RESCIND_CHANNELOFFER and removes the PCI device forcibly. In the case of fast device addition/removal, it's possible that the PCI device driver is still configuring MSI-X interrupts when the guest receives the PCI_EJECT message; the channel callback calls hv_pci_eject_device(), which sets hpdev->state to hv_pcichild_ejecting, and schedules a work hv_eject_device_work(); if the PCI device driver is calling pci_alloc_irq_vectors() -> ... -> hv_compose_msi_msg(), we can break the while loop in hv_compose_msi_msg() due to the updated hpdev->state, and leave data->chip_data with its default value of NULL; later, when the PCI device driver calls request_irq() -> ... -> hv_irq_unmask(), the guest crashes in hv_arch_irq_unmask() due to data->chip_data being NULL. Fix the issue by not testing hpdev->state in the while loop: when the guest receives PCI_EJECT, the device is still assigned to the guest, and the guest has one minute to finish the device removal gracefully. We don't really need to (and we should not) test hpdev->state in the loop. Fixes: de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-3-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
440b5e36 |
|
14-Jun-2023 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Fix a race condition bug in hv_pci_query_relations() Since day 1 of the driver, there has been a race between hv_pci_query_relations() and survey_child_resources(): during fast device hotplug, hv_pci_query_relations() may error out due to device-remove and the stack variable 'comp' is no longer valid; however, pci_devices_present_work() -> survey_child_resources() -> complete() may be running on another CPU and accessing the no-longer-valid 'comp'. Fix the race by flushing the workqueue before we exit from hv_pci_query_relations(). Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-2-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
a494aef2 |
|
20-Apr-2023 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg 4 commits are involved here: A (2016): commit 0de8ce3ee8e3 ("PCI: hv: Allocate physically contiguous hypercall params buffer") B (2017): commit be66b6736591 ("PCI: hv: Use page allocation for hbus structure") C (2019): commit 877b911a5ba0 ("PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer") D (2018): commit 68bb7bfb7985 ("X86/Hyper-V: Enable IPI enlightenments") Patch D introduced the per-CPU hypercall input page "hyperv_pcpu_input_arg" in 2018. With patch D, we no longer need the per-Hyper-V-PCI-bus hypercall input page "hbus->retarget_msi_interrupt_params" that was added in patch A, and the issue addressed by patch B is no longer an issue, and we can also get rid of patch C. The change here is required for PCI device assignment to work for Confidential VMs (CVMs) running without a paravisor, because otherwise we would have to call set_memory_decrypted() for "hbus->retarget_msi_interrupt_params" before calling the hypercall HVCALL_RETARGET_INTERRUPT. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/r/20230421013025.17152-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
2c6ba421 |
|
26-Mar-2023 |
Michael Kelley <mikelley@microsoft.com> |
PCI: hv: Enable PCI pass-thru devices in Confidential VMs For PCI pass-thru devices in a Confidential VM, Hyper-V requires that PCI config space be accessed via hypercalls. In normal VMs, config space accesses are trapped to the Hyper-V host and emulated. But in a confidential VM, the host can't access guest memory to decode the instruction for emulation, so an explicit hypercall must be used. Add functions to make the new MMIO read and MMIO write hypercalls. Update the PCI config space access functions to use the hypercalls when such use is indicated by Hyper-V flags. Also, set the flag to allow the Hyper-V PCI driver to be loaded and used in a Confidential VM (a.k.a., "Isolation VM"). The driver has previously been hardened against a malicious Hyper-V host[1]. [1] https://lore.kernel.org/all/20220511223207.3386-2-parri.andrea@gmail.com/ Co-developed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1679838727-87310-13-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
96ec2939 |
|
05-Jan-2023 |
Dawei Li <set_pte_at@outlook.com> |
Drivers: hv: Make remove callback of hyperv driver void returned Since commit fc7a6209d571 ("bus: Make remove callback return void") forces bus_type::remove be void-returned, it doesn't make much sense for any bus based driver implementing remove callbalk to return non-void to its caller. As such, change the remove function for Hyper-V VMBus based drivers to return void. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Link: https://lore.kernel.org/r/TYCP286MB2323A93C55526E4DF239D3ACCAFA9@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
503112f4 |
|
07-Nov-2022 |
Olaf Hering <olaf@aepfle.de> |
PCI: hv: update comment in x86 specific hv_arch_irq_unmask The function hv_set_affinity was removed in commit 831c1ae7 ("PCI: hv: Make the code arch neutral by adding arch specific interfaces"). Signed-off-by: Olaf Hering <olaf@aepfle.de> Link: https://lore.kernel.org/r/20221107171831.25283-1-olaf@aepfle.de Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
d474d92d |
|
11-Nov-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/apic: Remove X86_IRQ_ALLOC_CONTIGUOUS_VECTORS Now that the PCI/MSI core code does early checking for multi-MSI support X86_IRQ_ALLOC_CONTIGUOUS_VECTORS is not required anymore. Remove the flag and rely on MSI_FLAG_MULTI_PCI_MSI. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221111122015.865042356@linutronix.de
|
#
c234ba80 |
|
04-Nov-2022 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Only reuse existing IRTE allocation for Multi-MSI Jeffrey added Multi-MSI support to the pci-hyperv driver by the 4 patches: 08e61e861a0e ("PCI: hv: Fix multi-MSI to allow more than one MSI vector") 455880dfe292 ("PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI") b4b77778ecc5 ("PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()") a2bad844a67b ("PCI: hv: Fix interrupt mapping for multi-MSI") It turns out that the third patch (b4b77778ecc5) causes a performance regression because all the interrupts now happen on 1 physical CPU (or two pCPUs, if one pCPU doesn't have enough vectors). When a guest has many PCI devices, it may suffer from soft lockups if the workload is heavy, e.g., see https://lwn.net/ml/linux-kernel/20220804025104.15673-1-decui@microsoft.com/ Commit b4b77778ecc5 itself is good. The real issue is that the hypercall in hv_irq_unmask() -> hv_arch_irq_unmask() -> hv_do_hypercall(HVCALL_RETARGET_INTERRUPT...) only changes the target virtual CPU rather than physical CPU; with b4b77778ecc5, the pCPU is determined only once in hv_compose_msi_msg() where only vCPU0 is specified; consequently the hypervisor only uses 1 target pCPU for all the interrupts. Note: before b4b77778ecc5, the pCPU is determined twice, and when the pCPU is determined the second time, the vCPU in the effective affinity mask is used (i.e., it isn't always vCPU0), so the hypervisor chooses different pCPU for each interrupt. The hypercall will be fixed in future to update the pCPU as well, but that will take quite a while, so let's restore the old behavior in hv_compose_msi_msg(), i.e., don't reuse the existing IRTE allocation for single-MSI and MSI-X; for multi-MSI, we choose the vCPU in a round-robin manner for each PCI device, so the interrupts of different devices can happen on different pCPUs, though the interrupts of each device happen on some single pCPU. The hypercall fix may not be backported to all old versions of Hyper-V, so we want to have this guest side change forever (or at least till we're sure the old affected versions of Hyper-V are no longer supported). Fixes: b4b77778ecc5 ("PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()") Co-developed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Co-developed-by: Carl Vanderlip <quic_carlv@quicinc.com> Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20221104222953.11356-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
e70af8d0 |
|
27-Oct-2022 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Fix the definition of vector in hv_compose_msi_msg() The local variable 'vector' must be u32 rather than u8: see the struct hv_msi_desc3. 'vector_count' should be u16 rather than u8: see struct hv_msi_desc, hv_msi_desc2 and hv_msi_desc3. Fixes: a2bad844a67b ("PCI: hv: Fix interrupt mapping for multi-MSI") Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: Jeffrey Hugo <quic_jhugo@quicinc.com> Cc: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://lore.kernel.org/r/20221027205256.17678-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
5182fecc |
|
07-Jul-2022 |
Samuel Holland <samuel@sholland.org> |
PCI: hv: Take a const cpumask in hv_compose_msi_req_get_cpu() The cpumask that is passed to this function ultimately comes from irq_data_get_effective_affinity_mask(), which was recently changed to return a const cpumask pointer. The first level of functions handling the affinity mask were updated, but not this helper function. Fixes: 4d0b8298818b ("genirq: Return a const cpumask from irq_data_get_affinity_mask") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220708004931.1672-1-samuel@sholland.org Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
9167fd5d |
|
07-Jul-2022 |
Samuel Holland <samuel@sholland.org> |
PCI: hv: Take a const cpumask in hv_compose_msi_req_get_cpu() The cpumask that is passed to this function ultimately comes from irq_data_get_effective_affinity_mask(), which was recently changed to return a const cpumask pointer. The first level of functions handling the affinity mask were updated, but not this helper function. Fixes: 4d0b8298818b ("genirq: Return a const cpumask from irq_data_get_affinity_mask") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220708004931.1672-1-samuel@sholland.org
|
#
4d0b8298 |
|
01-Jul-2022 |
Samuel Holland <samuel@sholland.org> |
genirq: Return a const cpumask from irq_data_get_affinity_mask Now that the irq_data_update_affinity helper exists, enforce its use by returning a a const cpumask from irq_data_get_affinity_mask. Since the previous commit already updated places that needed to call irq_data_update_affinity, this commit updates the remaining code that either did not modify the cpumask or immediately passed the modified mask to irq_set_affinity. Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220701200056.46555-8-samuel@sholland.org
|
#
b4927bd2 |
|
11-May-2022 |
Andrea Parri (Microsoft) <parri.andrea@gmail.com> |
PCI: hv: Fix synchronization between channel callback and hv_pci_bus_exit() [ Similarly to commit a765ed47e4516 ("PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg()"): ] The (on-stack) teardown packet becomes invalid once the completion timeout in hv_pci_bus_exit() has expired and hv_pci_bus_exit() has returned. Prevent the channel callback from accessing the invalid packet by removing the ID associated to such packet from the VMbus requestor in hv_pci_bus_exit(). Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/20220511223207.3386-3-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
9937fa6d |
|
11-May-2022 |
Andrea Parri (Microsoft) <parri.andrea@gmail.com> |
PCI: hv: Add validation for untrusted Hyper-V values For additional robustness in the face of Hyper-V errors or malicious behavior, validate all values that originate from packets that Hyper-V has sent to the guest in the host-to-guest ring buffer. Ensure that invalid values cannot cause data being copied out of the bounds of the source buffer in hv_pci_onchannelcallback(). While at it, remove a redundant validation in hv_pci_generic_compl(): hv_pci_onchannelcallback() already ensures that all processed incoming packets are "at least as large as [in fact larger than] a response". Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/20220511223207.3386-2-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
a2bad844 |
|
11-May-2022 |
Jeffrey Hugo <quic_jhugo@quicinc.com> |
PCI: hv: Fix interrupt mapping for multi-MSI According to Dexuan, the hypervisor folks beleive that multi-msi allocations are not correct. compose_msi_msg() will allocate multi-msi one by one. However, multi-msi is a block of related MSIs, with alignment requirements. In order for the hypervisor to allocate properly aligned and consecutive entries in the IOMMU Interrupt Remapping Table, there should be a single mapping request that requests all of the multi-msi vectors in one shot. Dexuan suggests detecting the multi-msi case and composing a single request related to the first MSI. Then for the other MSIs in the same block, use the cached information. This appears to be viable, so do it. Suggested-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1652282599-21643-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
b4b77778 |
|
11-May-2022 |
Jeffrey Hugo <quic_jhugo@quicinc.com> |
PCI: hv: Reuse existing IRTE allocation in compose_msi_msg() Currently if compose_msi_msg() is called multiple times, it will free any previous IRTE allocation, and generate a new allocation. While nothing prevents this from occurring, it is extraneous when Linux could just reuse the existing allocation and avoid a bunch of overhead. However, when future IRTE allocations operate on blocks of MSIs instead of a single line, freeing the allocation will impact all of the lines. This could cause an issue where an allocation of N MSIs occurs, then some of the lines are retargeted, and finally the allocation is freed/reallocated. The freeing of the allocation removes all of the configuration for the entire block, which requires all the lines to be retargeted, which might not happen since some lines might already be unmasked/active. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Tested-by: Dexuan Cui <decui@microsoft.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1652282582-21595-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
23e118a4 |
|
02-May-2022 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time Currently when the pci-hyperv driver finishes probing and initializing the PCI device, it sets the PCI_COMMAND_MEMORY bit; later when the PCI device is registered to the core PCI subsystem, the core PCI driver's BAR detection and initialization code toggles the bit multiple times, and each toggling of the bit causes the hypervisor to unmap/map the virtual BARs from/to the physical BARs, which can be slow if the BAR sizes are huge, e.g., a Linux VM with 14 GPU devices has to spend more than 3 minutes on BAR detection and initialization, causing a long boot time. Reduce the boot time by not setting the PCI_COMMAND_MEMORY bit when we register the PCI device (there is no need to have it set in the first place). The bit stays off till the PCI device driver calls pci_enable_device(). With this change, the boot time of such a 14-GPU VM is reduced by almost 3 minutes. Link: https://lore.kernel.org/lkml/20220419220007.26550-1-decui@microsoft.com/ Tested-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Jake Oshins <jakeo@microsoft.com> Link: https://lore.kernel.org/r/20220502074255.16901-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
455880df |
|
27-Apr-2022 |
Jeffrey Hugo <quic_jhugo@quicinc.com> |
PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI In the multi-MSI case, hv_arch_irq_unmask() will only operate on the first MSI of the N allocated. This is because only the first msi_desc is cached and it is shared by all the MSIs of the multi-MSI block. This means that hv_arch_irq_unmask() gets the correct address, but the wrong data (always 0). This can break MSIs. Lets assume MSI0 is vector 34 on CPU0, and MSI1 is vector 33 on CPU0. hv_arch_irq_unmask() is called on MSI0. It uses a hypercall to configure the MSI address and data (0) to vector 34 of CPU0. This is correct. Then hv_arch_irq_unmask is called on MSI1. It uses another hypercall to configure the MSI address and data (0) to vector 33 of CPU0. This is wrong, and results in both MSI0 and MSI1 being routed to vector 33. Linux will observe extra instances of MSI1 and no instances of MSI0 despite the endpoint device behaving correctly. For the multi-MSI case, we need unique address and data info for each MSI, but the cached msi_desc does not provide that. However, that information can be gotten from the int_desc cached in the chip_data by compose_msi_msg(). Fix the multi-MSI case to use that cached information instead. Since hv_set_msi_entry_from_desc() is no longer applicable, remove it. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1651068453-29588-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
a765ed47 |
|
19-Apr-2022 |
Andrea Parri (Microsoft) <parri.andrea@gmail.com> |
PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg() Dexuan wrote: "[...] when we disable AccelNet, the host PCI VSP driver sends a PCI_EJECT message first, and the channel callback may set hpdev->state to hv_pcichild_ejecting on a different CPU. This can cause hv_compose_msi_msg() to exit from the loop and 'return', and the on-stack variable 'ctxt' is invalid. Now, if the response message from the host arrives, the channel callback will try to access the invalid 'ctxt' variable, and this may cause a crash." Schematically: Hyper-V sends PCI_EJECT msg hv_pci_onchannelcallback() state = hv_pcichild_ejecting hv_compose_msi_msg() alloc and init comp_pkt state == hv_pcichild_ejecting Hyper-V sends VM_PKT_COMP msg hv_pci_onchannelcallback() retrieve address of comp_pkt 'free' comp_pkt and return comp_pkt->completion_func() Dexuan also showed how the crash can be triggered after introducing suitable delays in the driver code, thus validating the 'assumption' that the host can still normally respond to the guest's compose_msi request after the host has started to eject the PCI device. Fix the synchronization by leveraging the requestor lock as follows: - Before 'return'-ing in hv_compose_msi_msg(), remove the ID (while holding the requestor lock) associated to the completion packet. - Retrieve the address *and call ->completion_func() within a same (requestor) critical section in hv_pci_onchannelcallback(). Reported-by: Wei Hu <weh@microsoft.com> Reported-by: Dexuan Cui <decui@microsoft.com> Suggested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220419122325.10078-7-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
de5ddb7d |
|
19-Apr-2022 |
Andrea Parri (Microsoft) <parri.andrea@gmail.com> |
PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening Currently, pointers to guest memory are passed to Hyper-V as transaction IDs in hv_pci. In the face of errors or malicious behavior in Hyper-V, hv_pci should not expose or trust the transaction IDs returned by Hyper-V to be valid guest memory addresses. Instead, use small integers generated by vmbus_requestor as request (transaction) IDs. Suggested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220419122325.10078-3-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
08e61e86 |
|
13-Apr-2022 |
Jeffrey Hugo <quic_jhugo@quicinc.com> |
PCI: hv: Fix multi-MSI to allow more than one MSI vector If the allocation of multiple MSI vectors for multi-MSI fails in the core PCI framework, the framework will retry the allocation as a single MSI vector, assuming that meets the min_vecs specified by the requesting driver. Hyper-V advertises that multi-MSI is supported, but reuses the VECTOR domain to implement that for x86. The VECTOR domain does not support multi-MSI, so the alloc will always fail and fallback to a single MSI allocation. In short, Hyper-V advertises a capability it does not implement. Hyper-V can support multi-MSI because it coordinates with the hypervisor to map the MSIs in the IOMMU's interrupt remapper, which is something the VECTOR domain does not have. Therefore the fix is simple - copy what the x86 IOMMU drivers (AMD/Intel-IR) do by removing X86_IRQ_ALLOC_CONTIGUOUS_VECTORS after calling the VECTOR domain's pci_msi_prepare(). Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Link: https://lore.kernel.org/r/1649856981-14649-1-git-send-email-quic_jhugo@quicinc.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
8d217324 |
|
24-Mar-2022 |
Michael Kelley <mikelley@microsoft.com> |
PCI: hv: Propagate coherence from VMbus device to PCI device PCI pass-thru devices in a Hyper-V VM are represented as a VMBus device and as a PCI device. The coherence of the VMbus device is set based on the VMbus node in ACPI, but the PCI device has no ACPI node and defaults to not hardware coherent. This results in extra software coherence management overhead on ARM64 when devices are hardware coherent. Fix this by setting up the PCI host bus so that normal PCI mechanisms will propagate the coherence of the VMbus device to the PCI device. There's no effect on x86/x64 where devices are always hardware coherent. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/1648138492-2191-3-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
22ef7ee3 |
|
17-Mar-2022 |
YueHaibing <yuehaibing@huawei.com> |
PCI: hv: Remove unused hv_set_msi_entry_from_desc() Fix the following build error: drivers/pci/controller/pci-hyperv.c:769:13: error: ‘hv_set_msi_entry_from_desc’ defined but not used [-Werror=unused-function] 769 | static void hv_set_msi_entry_from_desc(union hv_msi_entry *msi_entry, The arm64 implementation of hv_set_msi_entry_from_desc() is not used after d06957d7a692 ("PCI: hv: Avoid the retarget interrupt hypercall in irq_unmask() on ARM64"), so remove it. Fixes: d06957d7a692 ("PCI: hv: Avoid the retarget interrupt hypercall in irq_unmask() on ARM64") Link: https://lore.kernel.org/r/20220317085130.36388-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Boqun Feng <boqun.feng@gmail.com>
|
#
d06957d7 |
|
16-Feb-2022 |
Boqun Feng <boqun.feng@gmail.com> |
PCI: hv: Avoid the retarget interrupt hypercall in irq_unmask() on ARM64 On ARM64 Hyper-V guests, SPIs are used for the interrupts of virtual PCI devices, and SPIs can be managed directly via GICD registers. Therefore the retarget interrupt hypercall is not needed on ARM64. An arch-specific interface hv_arch_irq_unmask() is introduced to handle the architecture level differences on this. For x86, the behavior remains unchanged, while for ARM64 no hypercall is invoked when unmasking an irq for virtual PCI devices. Link: https://lore.kernel.org/r/20220217034525.1687678-1-boqun.feng@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
3149efcd |
|
26-Jan-2022 |
Long Li <longli@microsoft.com> |
PCI: hv: Fix NUMA node assignment when kernel boots with custom NUMA topology When kernel boots with a NUMA topology with some NUMA nodes offline, the PCI driver should only set an online NUMA node on the device. This can happen during KDUMP where some NUMA nodes are not made online by the KDUMP kernel. This patch also fixes the case where kernel is booting with "numa=off". Fixes: 999dd956d838 ("PCI: hv: Add support for protocol 1.3 and support PCI_BUS_RELATIONS2") Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@microsoft.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/1643247814-15184-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
d9932b46 |
|
05-Jan-2022 |
Sunil Muthuswamy <sunilmut@microsoft.com> |
PCI: hv: Add arm64 Hyper-V vPCI support Add arm64 Hyper-V vPCI support by implementing the arch specific interfaces. Introduce an IRQ domain and chip specific to Hyper-v vPCI that is based on SPIs. The IRQ domain parents itself to the arch GIC IRQ domain for basic vector management. [bhelgaas: squash in fix from Yang Li <yang.lee@linux.alibaba.com>: https://lore.kernel.org/r/20220112003324.62755-1-yang.lee@linux.alibaba.com] Link: https://lore.kernel.org/r/1641411156-31705-3-git-send-email-sunilmut@linux.microsoft.com Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
831c1ae7 |
|
05-Jan-2022 |
Sunil Muthuswamy <sunilmut@microsoft.com> |
PCI: hv: Make the code arch neutral by adding arch specific interfaces Encapsulate arch dependencies in Hyper-V vPCI through a set of arch-dependent interfaces. Adding these arch specific interfaces will allow for an implementation for other architectures, such as arm64. There are no functional changes expected from this patch. Link: https://lore.kernel.org/r/1641411156-31705-2-git-send-email-sunilmut@linux.microsoft.com Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
14e04d0d |
|
18-Nov-2021 |
Naveen Naidu <naveennaidu479@gmail.com> |
PCI: hv: Use PCI_ERROR_RESPONSE to identify config read errors Include PCI_ERROR_RESPONSE along with 0xFFFFFFFF in the comment about identifying config read errors. This makes checks for config read errors easier to find. Comment change only. Link: https://lore.kernel.org/r/12124f41cab7d8aa944de05f85d9567bfe157704.1637243717.git.naveennaidu479@gmail.com Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
#
dc2b4532 |
|
06-Dec-2021 |
Thomas Gleixner <tglx@linutronix.de> |
PCI: hv: Rework MSI handling Replace the about to vanish iterators and make use of the filtering. Take the descriptor lock around the iterators. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211206210748.629363944@linutronix.de
|
#
f1831208 |
|
08-Oct-2021 |
Krzysztof Wilczyński <kw@linux.com> |
PCI: hv: Remove unnecessary use of %hx "dom_req" is a u16 but varargs automatically promotes it to int, so there's no point in using the %h modifier. Drop it. See cbacb5ab0aa0 ("docs: printk-formats: Stop encouraging use of unnecessary %h[xudi] and %hh[xudi]") and 70eb2275ff8e ("checkpatch: add warning for unnecessary use of %h[xudi] and %hh[xudi]"). Link: https://lore.kernel.org/r/20211008222732.2868493-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
#
41608b64 |
|
30-Aug-2021 |
Long Li <longli@microsoft.com> |
PCI: hv: Fix sleep while in non-sleep context when removing child devices from the bus In hv_pci_bus_exit, the code is holding a spinlock while calling pci_destroy_slot(), which takes a mutex. This is not safe for spinlock. Fix this by moving the children to be deleted to a list on the stack, and removing them after spinlock is released. Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the device") Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: "Krzysztof Wilczyński" <kw@linux.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Michael Kelley <mikelley@microsoft.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/linux-hyperv/20210823152130.GA21501@kili/ Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/1630365207-20616-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
88f94c7f |
|
26-Jul-2021 |
Boqun Feng <boqun.feng@gmail.com> |
PCI: hv: Turn on the host bridge probing on ARM64 Now we have everything we need, just provide a proper sysdata type for the bus to use on ARM64 and everything else works. Link: https://lore.kernel.org/r/20210726180657.142727-9-boqun.feng@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
#
9e7f9178 |
|
26-Jul-2021 |
Boqun Feng <boqun.feng@gmail.com> |
PCI: hv: Set up MSI domain at bridge probing time Since PCI_HYPERV depends on PCI_MSI_IRQ_DOMAIN which selects GENERIC_MSI_IRQ_DOMAIN, we can use dev_set_msi_domain() to set up the MSI domain at probing time, and this works for both x86 and ARM64. Therefore use it as the preparation for ARM64 Hyper-V PCI support. As a result, no longer need to maintain ->fwnode in x86 specific pci_sysdata, and make hv_pcibus_device own it instead. Link: https://lore.kernel.org/r/20210726180657.142727-8-boqun.feng@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
#
38c0d266 |
|
26-Jul-2021 |
Boqun Feng <boqun.feng@gmail.com> |
PCI: hv: Set ->domain_nr of pci_host_bridge at probing time No functional change, just store and maintain the PCI domain number in the ->domain_nr of pci_host_bridge. Note that we still need to keep the copy of domain number in x86-specific pci_sysdata, because x86 is not a PCI_DOMAINS_GENERIC=y architecture, so the ->domain_nr of pci_host_bridge doesn't work for it yet. Link: https://lore.kernel.org/r/20210726180657.142727-7-boqun.feng@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
#
418cb6c8 |
|
26-Jul-2021 |
Arnd Bergmann <arnd@arndb.de> |
PCI: hv: Generify PCI probing In order to support ARM64 Hyper-V PCI, we need to set up the bridge at probing time because ARM64 is a PCI_DOMAIN_GENERIC=y arch and we don't have pci_config_window (ARM64 sysdata) for a PCI root bus on Hyper-V, so it's impossible to retrieve the information (e.g. PCI domains, MSI domains) from bus sysdata on ARM64 after creation. Originally in create_root_hv_pci_bus(), pci_create_root_bus() is used to create the root bus and the corresponding bridge based on x86 sysdata. Now we create a bridge first and then call pci_scan_root_bus_bridge(), which allows us to do the necessary set-ups for the bridge. Link: https://lore.kernel.org/r/20210726180657.142727-6-boqun.feng@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
#
8f6a6b3c |
|
12-Jul-2021 |
Sunil Muthuswamy <sunilmut@microsoft.com> |
PCI: hv: Support for create interrupt v3 Hyper-V vPCI protocol version 1_4 adds support for create interrupt v3. Create interrupt v3 essentially makes the size of the vector field bigger in the message, thereby allowing bigger vector values. For example, that will come into play for supporting LPI vectors on ARM, which start at 8192. Link: https://lore.kernel.org/r/MW4PR21MB20026A6EA554A0B9EC696AA8C0159@MW4PR21MB2002.namprd21.prod.outlook.com Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Wei Liu <wei.liu@kernel.org>
|
#
326dc2e1 |
|
12-May-2021 |
Long Li <longli@microsoft.com> |
PCI: hv: Remove bus device removal unused refcount/functions With the new method of flushing/stopping the workqueue before doing bus removal, the old mechanism of using refcount and wait for completion is no longer needed. Remove those dead code. Link: https://lore.kernel.org/r/1620806809-31055-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Long Li <longli@microsoft.com> [lorenzo.pieralisi@arm.com: Reworded subject] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
94d22763 |
|
12-May-2021 |
Long Li <longli@microsoft.com> |
PCI: hv: Fix a race condition when removing the device On removing the device, any work item (hv_pci_devices_present() or hv_pci_eject_device()) scheduled on workqueue hbus->wq may still be running and race with hv_pci_remove(). This can happen because the host may send PCI_EJECT or PCI_BUS_RELATIONS(2) and decide to rescind the channel immediately after that. Fix this by flushing/destroying the workqueue of hbus before doing hbus remove. Link: https://lore.kernel.org/r/1620806800-30983-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
7d815f4a |
|
25-May-2021 |
Haiyang Zhang <haiyangz@microsoft.com> |
PCI: hv: Add check for hyperv_initialized in init_hv_pci_drv() Add check for hv_is_hyperv_initialized() at the top of init_hv_pci_drv(), so if the pci-hyperv driver is force-loaded on non Hyper-V platforms, the init_hv_pci_drv() will exit immediately, without any side effects, like assignments to hvpci_block_ops, etc. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reported-and-tested-by: Mohammad Alqayeem <mohammad.alqyeem@nutanix.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/1621984653-1210-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
e0fad163 |
|
30-Mar-2021 |
Marc Zyngier <maz@kernel.org> |
PCI: hv: Drop msi_controller structure The Hyper-V PCI driver still makes use of a msi_controller structure, but it looks more like a distant leftover than anything actually useful, since it is initialised to 0 and never used for anything. Just remove it. Link: https://lore.kernel.org/r/20210330151145.997953-7-maz@kernel.org Tested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
#
753ed9c9 |
|
16-Apr-2021 |
Joseph Salisbury <joseph.salisbury@microsoft.com> |
drivers: hv: Create a consistent pattern for checking Hyper-V hypercall status There is not a consistent pattern for checking Hyper-V hypercall status. Existing code uses a number of variants. The variants work, but a consistent pattern would improve the readability of the code, and be more conformant to what the Hyper-V TLFS says about hypercall status. Implemented new helper functions hv_result(), hv_result_success(), and hv_repcomp(). Changed the places where hv_do_hypercall() and related variants are used to use the helper functions. Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1618620183-9967-2-git-send-email-joseph.salisbury@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
be4017ce |
|
09-Mar-2021 |
Sebastian Andrzej Siewior <bigeasy@linutronix.de> |
PCI: hv: Use tasklet_disable_in_atomic() The hv_compose_msi_msg() callback in irq_chip::irq_compose_msi_msg is invoked via irq_chip_compose_msi_msg(), which itself is always invoked from atomic contexts from the guts of the interrupt core code. There is no way to change this w/o rewriting the whole driver, so use tasklet_disable_in_atomic() which allows to make tasklet_disable() sleepable once the remaining atomic users are addressed. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084242.516519290@linutronix.de
|
#
c77bfb54 |
|
26-Jan-2021 |
Bjorn Helgaas <bhelgaas@google.com> |
PCI: hv: Fix typo Fix misspelling of "silently". Link: https://lore.kernel.org/r/20210126213855.2923461-1-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
b59fb7b6 |
|
03-Feb-2021 |
Wei Liu <wei.liu@kernel.org> |
asm-generic/hyperv: update hv_interrupt_entry We will soon use the same structure to handle IO-APIC interrupts as well. Introduce an enum to identify the source and a data structure for IO-APIC RTE. While at it, update pci-hyperv.c to use the enum. No functional change. Signed-off-by: Wei Liu <wei.liu@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210203150435.27941-13-wei.liu@kernel.org
|
#
72161299 |
|
24-Oct-2020 |
Thomas Gleixner <tglx@linutronix.de> |
x86/apic: Cleanup delivery mode defines The enum ioapic_irq_destination_types and the enumerated constants starting with 'dest_' are gross misnomers because they describe the delivery mode. Rename then enum and the constants so they actually make sense. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-6-dwmw2@infradead.org
|
#
915cff7f |
|
02-Oct-2020 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Fix hibernation in case interrupts are not re-created pci_restore_msi_state() directly writes the MSI/MSI-X related registers via MMIO. On a physical machine, this works perfectly; for a Linux VM running on a hypervisor, which typically enables IOMMU interrupt remapping, the hypervisor usually should trap and emulate the MMIO accesses in order to re-create the necessary interrupt remapping table entries in the IOMMU, otherwise the interrupts can not work in the VM after hibernation. Hyper-V is different from other hypervisors in that it does not trap and emulate the MMIO accesses, and instead it uses a para-virtualized method, which requires the VM to call hv_compose_msi_msg() to notify the hypervisor of the info that would be passed to the hypervisor in the case of the trap-and-emulate method. This is not an issue to a lot of PCI device drivers, which destroy and re-create the interrupts across hibernation, so hv_compose_msi_msg() is called automatically. However, some PCI device drivers (e.g. the in-tree GPU driver nouveau and the out-of-tree Nvidia proprietary GPU driver) do not destroy and re-create MSI/MSI-X interrupts across hibernation, so hv_pci_resume() has to call hv_compose_msi_msg(), otherwise the PCI device drivers can no longer receive interrupts after the VM resumes from hibernation. Hyper-V is also different in that chip->irq_unmask() may fail in a Linux VM running on Hyper-V (on a physical machine, chip->irq_unmask() can not fail because unmasking an MSI/MSI-X register just means an MMIO write): during hibernation, when a CPU is offlined, the kernel tries to move the interrupt to the remaining CPUs that haven't been offlined yet. In this case, hv_irq_unmask() -> hv_do_hypercall() always fails because the vmbus channel has been closed: here the early "return" in hv_irq_unmask() means the pci_msi_unmask_irq() is not called, i.e. the desc->masked remains "true", so later after hibernation, the MSI interrupt always remains masked, which is incorrect. Refer to cpu_disable_common() -> fixup_irqs() -> irq_migrate_all_off_this_cpu() -> migrate_one_irq(): static bool migrate_one_irq(struct irq_desc *desc) { ... if (maskchip && chip->irq_mask) chip->irq_mask(d); ... err = irq_do_set_affinity(d, affinity, false); ... if (maskchip && chip->irq_unmask) chip->irq_unmask(d); Fix the issue by calling pci_msi_unmask_irq() unconditionally in hv_irq_unmask(). Also suppress the error message for hibernation because the hypercall failure during hibernation does not matter (at this time all the devices have been frozen). Note: the correct affinity info is still updated into the irqdata data structure in migrate_one_irq() -> irq_do_set_affinity() -> hv_set_affinity(), so later when the VM resumes, hv_pci_restore_msi_state() is able to correctly restore the interrupt with the correct affinity. Link: https://lore.kernel.org/r/20201002085158.9168-1-decui@microsoft.com Fixes: ac82fc832708 ("PCI: hv: Add hibernation support") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Jake Oshins <jakeo@microsoft.com>
|
#
6d2730cb |
|
25-Sep-2020 |
Krzysztof Wilczyński <kw@linux.com> |
PCI: hv: Document missing hv_pci_protocol_negotiation() parameter Add missing documentation for the parameter "version" and "num_version" of the hv_pci_protocol_negotiation() function and resolve build time kernel-doc warnings: drivers/pci/controller/pci-hyperv.c:2535: warning: Function parameter or member 'version' not described in 'hv_pci_protocol_negotiation' drivers/pci/controller/pci-hyperv.c:2535: warning: Function parameter or member 'num_version' not described in 'hv_pci_protocol_negotiation' No change to functionality intended. Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20200925234753.1767227-1-kw@linux.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
9006c133 |
|
26-Aug-2020 |
Thomas Gleixner <tglx@linutronix.de> |
x86/msi: Use generic MSI domain ops pci_msi_get_hwirq() and pci_msi_set_desc are not longer special. Enable the generic MSI domain ops in the core and PCI MSI code unconditionally and get rid of the x86 specific implementations in the X86 MSI code and in the hyperv PCI driver. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200826112332.564274859@linutronix.de
|
#
3b9c1d37 |
|
26-Aug-2020 |
Thomas Gleixner <tglx@linutronix.de> |
x86/msi: Consolidate MSI allocation Convert the interrupt remap drivers to retrieve the pci device from the msi descriptor and use info::hwirq. This is the first step to prepare x86 for using the generic MSI domain ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de
|
#
a459d9e1 |
|
06-Jul-2020 |
Wei Yongjun <weiyongjun1@huawei.com> |
PCI: hv: Make some functions static sparse report build warning as follows: drivers/pci/controller/pci-hyperv.c:941:5: warning: symbol 'hv_read_config_block' was not declared. Should it be static? drivers/pci/controller/pci-hyperv.c:1021:5: warning: symbol 'hv_write_config_block' was not declared. Should it be static? drivers/pci/controller/pci-hyperv.c:1090:5: warning: symbol 'hv_register_block_invalidate' was not declared. Should it be static? Those functions are not used outside of this file, so mark them static. Link: https://lore.kernel.org/r/20200706135234.80758-1-weiyongjun1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
#
d6af2ed2 |
|
27-Jul-2020 |
Wei Hu <weh@microsoft.com> |
PCI: hv: Fix a timing issue which causes kdump to fail occasionally Kdump could fail sometime on Hyper-V guest because the retry in hv_pci_enter_d0() releases child device structures in hv_pci_bus_exit(). Although there is a second asynchronous device relations message sending from the host, if this message arrives to the guest after hv_send_resource_allocated() is called, the retry would fail. Fix the problem by moving retry to hv_pci_probe() and start the retry from hv_pci_query_relations() call. This will cause a device relations message to arrive to the guest synchronously; the guest would then be able to rebuild the child device structures before calling hv_send_resource_allocated(). Link: https://lore.kernel.org/r/20200727071731.18516-1-weh@microsoft.com Fixes: c81992e7f4aa ("PCI: hv: Retry PCI bus D0 entry on invalid device state") Signed-off-by: Wei Hu <weh@microsoft.com> [lorenzo.pieralisi@arm.com: fixed a comment and commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
d0684fd0 |
|
25-May-2020 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
PCI: hv: Use struct_size() helper One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct hv_dr_state { ... struct hv_pcidev_description func[]; }; struct pci_bus_relations { ... struct pci_function_description func[]; } __packed; Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. So, replace the following forms: offsetof(struct hv_dr_state, func) + (sizeof(struct hv_pcidev_description) * (relations->device_count)) offsetof(struct pci_bus_relations, func) + (sizeof(struct pci_function_description) * (bus_rel->device_count)) with: struct_size(dr, func, relations->device_count) and struct_size(bus_rel, func, bus_rel->device_count) respectively. Link: https://lore.kernel.org/r/20200525164319.GA13596@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Wei Liu <wei.liu@kernel.org>
|
#
c81992e7 |
|
06-May-2020 |
Wei Hu <weh@microsoft.com> |
PCI: hv: Retry PCI bus D0 entry on invalid device state When kdump is triggered, some PCI devices may have not been shut down cleanly before the kdump kernel starts. This causes the initial attempt to enter D0 state in the kdump kernel to fail with invalid device state returned from Hyper-V host. When this happens, explicitly call hv_pci_bus_exit() and retry to enter the D0 state. Link: https://lore.kernel.org/r/20200507050300.10974-1-weh@microsoft.com Signed-off-by: Wei Hu <weh@microsoft.com> [lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
83cc3508 |
|
06-May-2020 |
Wei Hu <weh@microsoft.com> |
PCI: hv: Fix the PCI HyperV probe failure path to release resource properly In some error cases in hv_pci_probe(), allocated resources are not freed. Fix this by adding a field to keep track of the high water mark for slots that have resources allocated to them. In case of an error, this high water mark is used to know which slots have resources that must be released. Since slots are numbered starting with zero, a value of -1 indicates no slots have been allocated resources. There may be unused slots in the range between slot 0 and the high water mark slot, but these slots are already ignored by the existing code in the allocate and release loops with the call to get_pcichild_wslot(). Link: https://lore.kernel.org/r/20200507050211.10923-1-weh@microsoft.com Signed-off-by: Wei Hu <weh@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
240ad77c |
|
05-Apr-2020 |
Andrea Parri (Microsoft) <parri.andrea@gmail.com> |
PCI: hv: Prepare hv_compose_msi_msg() for the VMBus-channel-interrupt-to-vCPU reassignment functionality The current implementation of hv_compose_msi_msg() is incompatible with the new functionality that allows changing the vCPU a VMBus channel will interrupt: if this function always calls hv_pci_onchannelcallback() in the polling loop, the interrupt going to a different CPU could cause hv_pci_onchannelcallback() to be running simultaneously in a tasklet, which will break. The current code also has a problem in that it is not synchronized with vmbus_reset_channel_cb(): hv_compose_msi_msg() could be accessing the ring buffer via the call of hv_pci_onchannelcallback() well after the time that vmbus_reset_channel_cb() has finished. Fix these issues as follows. Disable the channel tasklet before entering the polling loop in hv_compose_msi_msg() and re-enable it when done. This will prevent hv_pci_onchannelcallback() from running in a tasklet on a different CPU. Moreover, poll by always calling hv_pci_onchannelcallback(), but check the channel callback function for NULL and invoke the callback within a sched_lock critical section. This will prevent hv_compose_msi_msg() from accessing the ring buffer after vmbus_reset_channel_cb() has acquired the sched_lock spinlock. Suggested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Andrew Murray <amurray@thegoodpenguin.co.uk> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: <linux-pci@vger.kernel.org> Link: https://lore.kernel.org/r/20200406001514.19876-8-parri.andrea@gmail.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
1cf106d9 |
|
09-Feb-2020 |
Boqun Feng <boqun.feng@gmail.com> |
PCI: hv: Introduce hv_msi_entry Add a new structure (hv_msi_entry), which is also defined in the TLFS, to describe the msi entry for HVCALL_RETARGET_INTERRUPT. The structure is needed because its layout may be different from architecture to architecture. Also add a new generic interface hv_set_msi_entry_from_desc() to allow different archs to set the msi entry from msi_desc. No functional change, only preparation for the future support of virtual PCI on non-x86 architectures. Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Dexuan Cui <decui@microsoft.com>
|
#
61bfd920 |
|
09-Feb-2020 |
Boqun Feng <boqun.feng@gmail.com> |
PCI: hv: Move retarget related structures into tlfs header Currently, retarget_msi_interrupt and other structures it relys on are defined in pci-hyperv.c. However, those structures are actually defined in Hypervisor Top-Level Functional Specification [1] and may be different in sizes of fields or layout from architecture to architecture. Let's move those definitions into x86's tlfs header file to support virtual PCI on non-x86 architectures in the future. Note that "__packed" attribute is added to these structures during the movement for the same reason as we use the attribute for other TLFS structures in the header file: make sure the structures meet the specification and avoid anything unexpected from the compilers. Additionally, rename struct retarget_msi_interrupt to hv_retarget_msi_interrupt for the consistent naming convention, also mirroring the name in TLFS. [1]: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Dexuan Cui <decui@microsoft.com>
|
#
b00f80fc |
|
09-Feb-2020 |
Boqun Feng <boqun.feng@gmail.com> |
PCI: hv: Move hypercall related definitions into tlfs header Currently HVCALL_RETARGET_INTERRUPT and HV_PARTITION_ID_SELF are defined in pci-hyperv.c. However, similar to other hypercall related definitions, it makes more sense to put them in the tlfs header file. Besides, these definitions are arch-dependent, so for the support of virtual PCI on non-x86 archs in the future, move them into arch-specific tlfs header file. Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <amurray@thegoodpenguin.co.uk> Reviewed-by: Dexuan Cui <decui@microsoft.com>
|
#
067fb6c9 |
|
12-Feb-2020 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
PCI: hv: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Dexuan Cui <decui@microsoft.com>
|
#
999dd956 |
|
25-Feb-2020 |
Long Li <longli@microsoft.com> |
PCI: hv: Add support for protocol 1.3 and support PCI_BUS_RELATIONS2 Starting with Hyper-V PCI protocol version 1.3, the host VSP can send PCI_BUS_RELATIONS2 and pass the vNUMA node information for devices on the bus. The vNUMA node tells which guest NUMA node this device is on based on guest VM configuration topology and physical device information. Add code to negotiate v1.3 and process PCI_BUS_RELATIONS2. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
f9ad0f36 |
|
25-Feb-2020 |
Long Li <longli@microsoft.com> |
PCI: hv: Decouple the func definition in hv_dr_state from VSP message hv_dr_state is used to find present PCI devices on the bus. The structure reuses struct pci_function_description from VSP message to describe a device. To prepare support for pci_function_description v2, decouple this dependence in hv_dr_state so it can work with both v1 and v2 VSP messages. There is no functionality change. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
42c3d418 |
|
21-Feb-2020 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Add missing kfree(hbus) in hv_pci_probe()'s error handling path Now that we use kzalloc() to allocate the hbus buffer, we must call kfree() in the error path as well to prevent memory leakage. Fixes: 877b911a5ba0 ("PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
e658a4fe |
|
21-Feb-2020 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Remove unnecessary type casting from kzalloc In C, there is no need to cast a void * to any other pointer type, remove an unnecessary cast. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
877b911a |
|
24-Nov-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer With the recent 59bb47985c1d ("mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)"), kzalloc() is able to allocate a 4KB buffer that is guaranteed to be 4KB-aligned. Here the size and alignment of hbus is important because hbus's field retarget_msi_interrupt_params must not cross a 4KB page boundary. Here we prefer kzalloc to get_zeroed_page(), because a buffer allocated by the latter is not tracked and scanned by kmemleak, and hence kmemleak reports the pointer contained in the hbus buffer (i.e. the hpdev struct, which is created in new_pcichild_device() and is tracked by hbus->children) as memory leak (false positive). If the kernel doesn't have 59bb47985c1d, get_zeroed_page() *must* be used to allocate the hbus buffer and we can avoid the kmemleak false positive by using kmemleak_alloc() and kmemleak_free() to ask kmemleak to track and scan the hbus buffer. Reported-by: Lili Deng <v-lide@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
14ef39fd |
|
24-Nov-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Change pci_protocol_version to per-hbus A VM can have multiple Hyper-V hbus. It's incorrect to set the global variable 'pci_protocol_version' when *every* hbus is initialized in hv_pci_protocol_negotiation(). This is not an issue in practice since every hbus should have the same value of hbus->protocol_version, but we should make the variable per-hbus, so in case we have busses with different protocol versions, the driver can still work correctly. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
ac82fc83 |
|
24-Nov-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Add hibernation support Add suspend() and resume() functions so that Hyper-V virtual PCI devices are handled properly when the VM hibernates and resumes from hibernation. Note that the suspend() function must make sure there are no pending work items before calling vmbus_close(), since it runs in a process context as a callback in dpm_suspend(). When it starts to run, the channel callback hv_pci_onchannelcallback(), which runs in a tasklet context, can be still running concurrently and scheduling new work items onto hbus->wq in hv_pci_devices_present() and hv_pci_eject_device(), and the work item handlers can access the vmbus channel, which can be being closed by hv_pci_suspend(), e.g. the work item handler pci_devices_present_work() -> new_pcichild_device() writes to the vmbus channel. To eliminate the race, hv_pci_suspend() disables the channel callback tasklet, sets hbus->state to hv_pcibus_removing, and re-enables the tasklet. This way, when hv_pci_suspend() proceeds, it knows that no new work item can be scheduled, and then it flushes hbus->wq and safely closes the vmbus channel. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
a8e37506 |
|
24-Nov-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Reorganize the code in preparation of hibernation There is no functional change. This is just preparatory for a later patch which adds the hibernation support for the pci-hyperv driver. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
c9c13ba4 |
|
27-Sep-2019 |
Denis Efremov <efremov@linux.com> |
PCI: Add PCI_STD_NUM_BARS for the number of standard BARs Code that iterates over all standard PCI BARs typically uses PCI_STD_RESOURCE_END. However, that requires the unusual test "i <= PCI_STD_RESOURCE_END" rather than something the typical "i < PCI_STD_NUM_BARS". Add a definition for PCI_STD_NUM_BARS and change loops to use the more idiomatic C style to help avoid fencepost errors. Link: https://lore.kernel.org/r/20190927234026.23342-1-efremov@linux.com Link: https://lore.kernel.org/r/20190927234308.23935-1-efremov@linux.com Link: https://lore.kernel.org/r/20190916204158.6889-3-efremov@linux.com Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> # arch/s390/ Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> # video/fbdev/ Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> # pci/controller/dwc/ Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> # scsi/pm8001/ Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi/pm8001/ Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # memstick/
|
#
f73f8a50 |
|
15-Aug-2019 |
Haiyang Zhang <haiyangz@microsoft.com> |
PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers As recommended by Azure host team, the bytes 4, 5 have more uniqueness (info entropy) than bytes 8, 9 so use them as the PCI domain numbers. On older hosts, bytes 4, 5 can also be used -- no backward compatibility issues are introduced and the chance of collision is greatly reduced. In the rare cases of collision, the driver code detects and finds another number that is not in use. Suggested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Sasha Levin <sashal@kernel.org>
|
#
348dd93e |
|
21-Aug-2019 |
Haiyang Zhang <haiyangz@microsoft.com> |
PCI: hv: Add a Hyper-V PCI interface driver for software backchannel interface This interface driver is a helper driver allows other drivers to have a common interface with the Hyper-V PCI frontend driver. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e5d2f910 |
|
21-Aug-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Add a paravirtual backchannel in software Windows SR-IOV provides a backchannel mechanism in software for communication between a VF driver and a PF driver. These "configuration blocks" are similar in concept to PCI configuration space, but instead of doing reads and writes in 32-bit chunks through a very slow path, packets of up to 128 bytes can be sent or received asynchronously. Nearly every SR-IOV device contains just such a communications channel in hardware, so using this one in software is usually optional. Using the software channel, however, allows driver implementers to leverage software tools that fuzz the communications channel looking for vulnerabilities. The usage model for these packets puts the responsibility for reading or writing on the VF driver. The VF driver sends a read or a write packet, indicating which "block" is being referred to by number. If the PF driver wishes to initiate communication, it can "invalidate" one or more of the first 64 blocks. This invalidation is delivered via a callback supplied by the VF driver by this driver. No protocol is implied, except that supplied by the PF and VF drivers. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
be700103 |
|
15-Aug-2019 |
Haiyang Zhang <haiyangz@microsoft.com> |
PCI: hv: Detect and fix Hyper-V PCI domain number collision Currently in Azure cloud, for passthrough devices, the host sets the device instance ID's bytes 8 - 15 to a value derived from the host HWID, which is the same on all devices in a VM. So, the device instance ID's bytes 8 and 9 provided by the host are no longer unique. This affects all Azure hosts since July 2018, and can cause device passthrough to VMs to fail because the bytes 8 and 9 are used as PCI domain number. Collision of domain numbers will cause the second device with the same domain number fail to load. In the cases of collision, we will detect and find another number that is not in use. Suggested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Sasha Levin <sashal@kernel.org>
|
#
533ca1fe |
|
02-Aug-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Avoid use of hv_pci_dev->pci_slot after freeing it The slot must be removed before the pci_dev is removed, otherwise a panic can happen due to use-after-free. Fixes: 15becc2b56c6 ("PCI: hv: Add hv_pci_remove_slots() when we unload the driver") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org
|
#
467a3bb9 |
|
06-Aug-2019 |
Marc Zyngier <maz@kernel.org> |
PCI: hv: Allocate a named fwnode instead of an address-based one To allocate its fwnode that is then used to allocate an irqdomain, the driver uses irq_domain_alloc_fwnode(), passing it a VA as an identifier. This is a rather bad idea, as this address ends up published in debugfs (and we want to move away from VAs there anyway). Instead, let's allocate a named fwnode by using the device GUID as an identifier. It is allegedly unique, and can be traced back to the original device. Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Marc Zyngier <maz@kernel.org>
|
#
4df591b2 |
|
21-Jun-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Fix a use-after-free bug in hv_eject_device_work() Fix a use-after-free in hv_eject_device_work(). Fixes: 05f151a73ec2 ("PCI: hv: Fix a memory leak in hv_eject_device_work()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org
|
#
340d4556 |
|
04-Mar-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary When we hot-remove a device, usually the host sends us a PCI_EJECT message, and a PCI_BUS_RELATIONS message with bus_rel->device_count == 0. When we execute the quick hot-add/hot-remove test, the host may not send us the PCI_EJECT message if the guest has not fully finished the initialization by sending the PCI_RESOURCES_ASSIGNED* message to the host, so it's potentially unsafe to only depend on the pci_destroy_slot() in hv_eject_device_work() because the code path create_root_hv_pci_bus() -> hv_pci_assign_slots() is not called in this case. Note: in this case, the host still sends the guest a PCI_BUS_RELATIONS message with bus_rel->device_count == 0. In the quick hot-add/hot-remove test, we can have such a race before the code path pci_devices_present_work() -> new_pcichild_device() adds the new device into the hbus->children list, we may have already received the PCI_EJECT message, and since the tasklet handler hv_pci_onchannelcallback() may fail to find the "hpdev" by calling get_pcichild_wslot(hbus, dev_message->wslot.slot) hv_pci_eject_device() is not called; Later, by continuing execution create_root_hv_pci_bus() -> hv_pci_assign_slots() creates the slot and the PCI_BUS_RELATIONS message with bus_rel->device_count == 0 removes the device from hbus->children, and we end up being unable to remove the slot in hv_pci_remove() -> hv_pci_remove_slots() Remove the slot in pci_devices_present_work() when the device is removed to address this race. pci_devices_present_work() and hv_eject_device_work() run in the singled-threaded hbus->wq, so there is not a double-remove issue for the slot. We cannot offload hv_pci_eject_device() from hv_pci_onchannelcallback() to the workqueue, because we need the hv_pci_onchannelcallback() synchronously call hv_pci_eject_device() to poll the channel ringbuffer to work around the "hangs in hv_compose_msi_msg()" issue fixed in commit de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information") Signed-off-by: Dexuan Cui <decui@microsoft.com> [lorenzo.pieralisi@arm.com: rewritten commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org
|
#
15becc2b |
|
04-Mar-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Add hv_pci_remove_slots() when we unload the driver When we unload the pci-hyperv host controller driver, the host does not send us a PCI_EJECT message. In this case we also need to make sure the sysfs PCI slot directory is removed, otherwise a command on a slot file eg: "cat /sys/bus/pci/slots/2/address" will trigger a "BUG: unable to handle kernel paging request" and, if we unload/reload the driver several times we would end up with stale slot entries in PCI slot directories in /sys/bus/pci/slots/ root@localhost:~# ls -rtl /sys/bus/pci/slots/ total 0 drwxr-xr-x 2 root root 0 Feb 7 10:49 2 drwxr-xr-x 2 root root 0 Feb 7 10:49 2-1 drwxr-xr-x 2 root root 0 Feb 7 10:51 2-2 Add the missing code to remove the PCI slot and fix the current behaviour. Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information") Signed-off-by: Dexuan Cui <decui@microsoft.com> [lorenzo.pieralisi@arm.com: reformatted the log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org
|
#
05f151a7 |
|
04-Mar-2019 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Fix a memory leak in hv_eject_device_work() When a device is created in new_pcichild_device(), hpdev->refs is set to 2 (i.e. the initial value of 1 plus the get_pcichild()). When we hot remove the device from the host, in a Linux VM we first call hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and then schedules a work of hv_eject_device_work(), so hpdev->refs becomes 3 (let's ignore the paired get/put_pcichild() in other places). But in hv_eject_device_work(), currently we only call put_pcichild() twice, meaning the 'hpdev' struct can't be freed in put_pcichild(). Add one put_pcichild() to fix the memory leak. The device can also be removed when we run "rmmod pci-hyperv". On this path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()), hpdev->refs is 2, and we do correctly call put_pcichild() twice in pci_devices_present_work(). Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Dexuan Cui <decui@microsoft.com> [lorenzo.pieralisi@arm.com: commit log rework] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org
|
#
c8ccf759 |
|
01-Mar-2019 |
Maya Nakamura <m.maya.nakamura@gmail.com> |
PCI: hv: Refactor hv_irq_unmask() to use cpumask_to_vpset() Remove the duplicate implementation of cpumask_to_vpset() and use the shared implementation. Export hv_max_vp_index, which is required by cpumask_to_vpset(). Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
|
#
9bc11742 |
|
28-Feb-2019 |
Maya Nakamura <m.maya.nakamura@gmail.com> |
PCI: hv: Replace hv_vp_set with hv_vpset Remove a duplicate definition of VP set (hv_vp_set) and use the common definition (hv_vpset) that is used in other places. Change the order of the members in struct hv_pcibus_device so that the declaration of retarget_msi_interrupt_params is the last member. Struct hv_vpset, which contains a flexible array, is nested two levels deep in struct hv_pcibus_device via retarget_msi_interrupt_params. Add a comment that retarget_msi_interrupt_params should be the last member of struct hv_pcibus_device. Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
|
#
6ae91579 |
|
28-Feb-2019 |
Maya Nakamura <m.maya.nakamura@gmail.com> |
PCI: hv: Add __aligned(8) to struct retarget_msi_interrupt Because Hyper-V requires that hypercall arguments be aligned on an 8 byte boundary, add __aligned(8) to struct retarget_msi_interrupt. Link: https://lore.kernel.org/lkml/87k1hlqlby.fsf@vitty.brq.redhat.com/ Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
#
54be5b8c |
|
20-Sep-2018 |
Wei Yongjun <weiyongjun1@huawei.com> |
PCI: hv: Fix return value check in hv_pci_assign_slots() In case of error, the function pci_create_slot() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a15f2c08 |
|
14-Sep-2018 |
Stephen Hemminger <stephen@networkplumber.org> |
PCI: hv: support reporting serial number as slot information The Hyper-V host API for PCI provides a unique "serial number" which can be used as basis for sysfs PCI slot table. This can be useful for cases where userspace wants to find the PCI device based on serial number. When an SR-IOV NIC is added, the host sends an attach message with serial number. The kernel doesn't use the serial number, but it is useful when doing the same thing in a userspace driver such as the DPDK. By having /sys/bus/pci/slots/N it provides a direct way to find the matching PCI device. There maybe some cases where serial number is not unique such as when using GPU's. But the PCI slot infrastructure will handle that. This has a side effect which may also be useful. The common udev network device naming policy uses the slot information (rather than PCI address). Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
447ae316 |
|
28-Jul-2018 |
Nicolai Stange <nstange@suse.de> |
x86: Don't include linux/irq.h from asm/hardirq.h The next patch in this series will have to make the definition of irq_cpustat_t available to entering_irq(). Inclusion of asm/hardirq.h into asm/apic.h would cause circular header dependencies like asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/topology.h linux/smp.h asm/smp.h or linux/gfp.h linux/mmzone.h asm/mmzone.h asm/mmzone_64.h asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/irqdesc.h linux/kobject.h linux/sysfs.h linux/kernfs.h linux/idr.h linux/gfp.h and others. This causes compilation errors because of the header guards becoming effective in the second inclusion: symbols/macros that had been defined before wouldn't be available to intermediate headers in the #include chain anymore. A possible workaround would be to move the definition of irq_cpustat_t into its own header and include that from both, asm/hardirq.h and asm/apic.h. However, this wouldn't solve the real problem, namely asm/harirq.h unnecessarily pulling in all the linux/irq.h cruft: nothing in asm/hardirq.h itself requires it. Also, note that there are some other archs, like e.g. arm64, which don't have that #include in their asm/hardirq.h. Remove the linux/irq.h #include from x86' asm/hardirq.h. Fix resulting compilation errors by adding appropriate #includes to *.c files as needed. Note that some of these *.c files could be cleaned up a bit wrt. to their set of #includes, but that should better be done from separate patches, if at all. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
#
35a88a18 |
|
09-Jul-2018 |
Dexuan Cui <decui@microsoft.com> |
PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg() Commit de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") uses local_bh_disable()/enable(), because hv_pci_onchannelcallback() can also run in tasklet context as the channel event callback, so bottom halves should be disabled to prevent a race condition. With CONFIG_PROVE_LOCKING=y in the recent mainline, or old kernels that don't have commit f71b74bca637 ("irq/softirqs: Use lockdep to assert IRQs are disabled/enabled"), when the upper layer IRQ code calls hv_compose_msi_msg() with local IRQs disabled, we'll see a warning at the beginning of __local_bh_enable_ip(): IRQs not enabled as expected WARNING: CPU: 0 PID: 408 at kernel/softirq.c:162 __local_bh_enable_ip The warning exposes an issue in de0aa7b2f97d: local_bh_enable() can potentially call do_softirq(), which is not supposed to run when local IRQs are disabled. Let's fix this by using local_irq_save()/restore() instead. Note: hv_pci_onchannelcallback() is not a hot path because it's only called when the PCI device is hot added and removed, which is infrequent. Fixes: de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: stable@vger.kernel.org Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com>
|
#
7403bd14 |
|
18-Mar-2018 |
Jia-Ju Bai <baijiaju1990@gmail.com> |
PCI: hv: Replace GFP_ATOMIC with GFP_KERNEL in new_pcichild_device() new_pcichild_device() is not called in atomic context. The call chain ending up at new_pcichild_device() is: [1] new_pcichild_device() <- pci_devices_present_work() pci_devices_present_work() is only set in INIT_WORK(). Despite never getting called from atomic context, new_pcichild_device() calls kzalloc with GFP_ATOMIC, which waits busily for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL to avoid busy waiting. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> [lorenzo.pieralisi@arm.com: reworked commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
|
#
6e0832fa |
|
30-May-2018 |
Shawn Lin <shawn.lin@rock-chips.com> |
PCI: Collect all native drivers under drivers/pci/controller/ Native PCI drivers for root complex devices were originally all in drivers/pci/host/. Some of these devices can also be operated in endpoint mode. Drivers for endpoint mode didn't seem to fit in the "host" directory, so we put both the root complex and endpoint drivers in per-device directories, e.g., drivers/pci/dwc/, drivers/pci/cadence/, etc. These per-device directories contain trivial Kconfig and Makefiles and clutter drivers/pci/. Make a new drivers/pci/controllers/ directory and collect all the device-specific drivers there. No functional change intended. Link: https://lkml.kernel.org/r/1520304202-232891-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|