#
3ec648c6 |
|
23-Aug-2023 |
Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> |
IB: Use capital "OR" for multiple licenses in SPDX Documentation/process/license-rules.rst and checkpatch expect the SPDX identifier syntax for multiple licenses to use capital "OR". Correct it to keep consistent format and avoid copy-paste issues. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230823092912.122674-1-krzysztof.kozlowski@linaro.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
d2c02346 |
|
22-Aug-2023 |
Brendan Cunningham <bcunningham@cornelisnetworks.com> |
RDMA/hfi1: Move user SDMA system memory pinning code to its own file Move user SDMA system memory page-pinning code from user_sdma.c to pin_system.c. Put declarations for non-static functions in pinning.h. System memory pinning is necessary for processing user SDMA requests but actual steps are invisible to user SDMA request-processing code. Moving system memory pinning code for user SDMA to its own file makes this distinction apparent. These changes have no effect on userspace. Signed-off-by: Patrick Kelsey <pat.kelsey@cornelisnetworks.com> Signed-off-by: Brendan Cunningham <bcunningham@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Link: https://lore.kernel.org/r/169271327311.1855761.4736714053318724062.stgit@awfm-02.cornelisnetworks.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
145eba1a |
|
22-Aug-2021 |
Cai Huoqing <caihuoqing@baidu.com> |
RDMA/hfi1: Convert to SPDX identifier use SPDX-License-Identifier instead of a verbose license text Link: https://lore.kernel.org/r/20210823042622.109-1-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
991c4274 |
|
29-Jul-2021 |
Cai Huoqing <caihuoqing@baidu.com> |
RDMA/hfi1: Fix typo in comments Remove the repeated word 'the' from comments Link: https://lore.kernel.org/r/20210729082346.1882-1-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
a0293eb2 |
|
19-Jul-2021 |
Xiyu Yang <xiyuyang19@fudan.edu.cn> |
RDMA/hfi1: Convert from atomic_t to refcount_t on hfi1_devdata->user_refcount refcount_t type and corresponding API can protect refcounters from accidental underflow and overflow and further use-after-free situations. Link: https://lore.kernel.org/r/1626674454-56075-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Tested-by: Josh Fisher <josh.fisher@cornelisnetworks.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
bf194997 |
|
16-Jun-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Fix kernel-doc warnings about wrong comment Compilation with W=1 produces warnings similar to the below. drivers/infiniband/ulp/ipoib/ipoib_main.c:320: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst All such occurrences were found with the following one line git grep -A 1 "\/\*\*" drivers/infiniband/ Link: https://lore.kernel.org/r/e57d5f4ddd08b7a19934635b44d6d632841b9ba7.1623823612.git.leonro@nvidia.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> #rtrs Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
d7407d16 |
|
11-Jun-2021 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Change ops->init_port to ops->port_groups init_port was only being used to register sysfs attributes against the port kobject. Now that all users are creating static attribute_group's we can simply set the attribute_group list in the ops and the core code can just handle it directly. This makes all the sysfs management quite straightforward and prevents any driver from abusing the naked port kobject in future because no driver code can access it. Link: https://lore.kernel.org/r/114f68f3d921460eafe14cea5a80ca65d81729c3.1623427137.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
8f1708f1 |
|
11-Jun-2021 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/hfi1: Use attributes for the port sysfs hfi1 should not be creating a mess of kobjects to attach to the port kobject - this is all attributes. The proper API is to create an attribute_group list and create it against the port's kobject. Link: https://lore.kernel.org/r/cbe0ccb6175dd22274359b6ad803a37435a70e91.1623427137.git.leonro@nvidia.com Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
780278c2 |
|
29-Mar-2021 |
Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> |
IB/hfi1: Rework AIP and VNIC dummy netdev usage All other users of the dummy netdevice embed the netdev in other structures: init_dummy_netdev(&mal->dummy_dev); init_dummy_netdev(ð->dummy_dev); init_dummy_netdev(&ar->napi_dev); init_dummy_netdev(&irq_grp->napi_ndev); init_dummy_netdev(&wil->napi_ndev); init_dummy_netdev(&trans_pcie->napi_dev); init_dummy_netdev(&dev->napi_dev); init_dummy_netdev(&bus->mux_dev); The AIP and VNIC implementation turns that model inside out and used a kfree() to free what appears to be a netdev struct when in reality, it is a struct that enbodies the rx state as well as the dummy netdev used to support napi_poll across disparate receive contexts. The relationship is infered by the odd allocation: const int netdev_size = sizeof(*dd->dummy_netdev) + sizeof(struct hfi1_netdev_priv); <snip> dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node); Correct the issue by: - Correctly naming the alloc and free functions - Renaming hfi1_netdev_priv to hfi1_netdev_rx - Replacing dd dummy_netdev with a netdev_rx pointer - Embedding the net_device in hfi1_netdev_rx - Moving the init_dummy_netdev to the alloc routine - Adjusting wrappers to fit the new model Fixes: 6991abcb993c ("IB/hfi1: Add functions to receive accelerated ipoib packets") Link: https://lore.kernel.org/r/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1fb7f897 |
|
01-Mar-2021 |
Mark Bloch <mbloch@nvidia.com> |
RDMA: Support more than 255 rdma ports Current code uses many different types when dealing with a port of a RDMA device: u8, unsigned int and u32. Switch to u32 to clean up the logic. This allows us to make (at least) the core view consistent and use the same type. Unfortunately not all places can be converted. Many uverbs functions expect port to be u8 so keep those places in order not to break UAPIs. HW/Spec defined values must also not be changed. With the switch to u32 we now can support devices with more than 255 ports. U32_MAX is reserved to make control logic a bit easier to deal with. As a device with U32_MAX ports probably isn't going to happen any time soon this seems like a non issue. When a device with more than 255 ports is created uverbs will report the RDMA device as having 255 ports as this is the max currently supported. The verbs interface is not changed yet because the IBTA spec limits the port size in too many places to be u8 and all applications that relies in verbs won't be able to cope with this change. At this stage, we are extending the interfaces that are using vendor channel solely Once the limitation is lifted mlx5 in switchdev mode will be able to have thousands of SFs created by the device. As the only instance of an RDMA device that reports more than 255 ports will be a representor device and it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other ULPs aren't effected by this change and their sysfs/interfaces that are exposes to userspace can remain unchanged. While here cleanup some alignment issues and remove unneeded sanity checks (mainly in rdmavt), Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
fdb68dd3 |
|
14-Mar-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Delete not-used static inline functions Perform mass deletion of static inline functions that are not used. Link: https://lore.kernel.org/r/20210314133908.291945-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
5de61a47 |
|
29-Mar-2021 |
Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> |
IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS A panic can result when AIP is enabled: BUG: unable to handle kernel NULL pointer dereference at 000000000000000 PGD 0 P4D 0 Oops: 0000 1 SMP PTI CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014 RIP: 0010:__bitmap_and+0x1b/0x70 RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048 RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750 RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0 R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003 FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0 Call Trace: hfi1_num_netdev_contexts+0x7c/0x110 [hfi1] hfi1_init_dd+0xd7f/0x1a90 [hfi1] ? pci_bus_read_config_dword+0x49/0x70 ? pci_mmcfg_read+0x3e/0xe0 do_init_one.isra.18+0x336/0x640 [hfi1] local_pci_probe+0x41/0x90 pci_device_probe+0x105/0x1c0 really_probe+0x212/0x440 driver_probe_device+0x49/0xc0 device_driver_attach+0x50/0x60 __driver_attach+0x61/0x130 ? device_driver_attach+0x60/0x60 bus_for_each_dev+0x77/0xc0 ? klist_add_tail+0x3b/0x70 bus_add_driver+0x14d/0x1e0 ? dev_init+0x10b/0x10b [hfi1] driver_register+0x6b/0xb0 ? dev_init+0x10b/0x10b [hfi1] hfi1_mod_init+0x1e6/0x20a [hfi1] do_one_initcall+0x46/0x1c3 ? free_unref_page_commit+0x91/0x100 ? _cond_resched+0x15/0x30 ? kmem_cache_alloc_trace+0x140/0x1c0 do_init_module+0x5a/0x220 load_module+0x14b4/0x17e0 ? __do_sys_finit_module+0xa8/0x110 __do_sys_finit_module+0xa8/0x110 do_syscall_64+0x5b/0x1a0 The issue happens when pcibus_to_node() returns NO_NUMA_NODE. Fix this issue by moving the initialization of dd->node to hfi1_devdata allocation and remove the other pcibus_to_node() calls in the probe path and use dd->node instead. Affinity logic is adjusted to use a new field dd->affinity_entry as a guard instead of dd->node. Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev") Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com Cc: stable@vger.kernel.org Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
3d2a9d64 |
|
25-Nov-2020 |
Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> |
IB/hfi1: Ensure correct mm is used at all times Two earlier bug fixes have created a security problem in the hfi1 driver. One fix aimed to solve an issue where current->mm was not valid when closing the hfi1 cdev. It attempted to do this by saving a cached value of the current->mm pointer at file open time. This is a problem if another process with access to the FD calls in via write() or ioctl() to pin pages via the hfi driver. The other fix tried to solve a use after free by taking a reference on the mm. To fix this correctly we use the existing cached value of the mm in the mmu notifier. Now we can check in the insert, evict, etc. routines that current->mm matched what the notifier was registered for. If not, then don't allow access. The register of the mmu notifier will save the mm pointer. Since in do_exit() the exit_mm() is called before exit_files(), which would call our close routine a reference is needed on the mm. We rely on the mmgrab done by the registration of the notifier, whereas before it was explicit. The mmu notifier deregistration happens when the user context is torn down, the creation of which triggered the registration. Also of note is we do not do any explicit work to protect the interval tree notifier. It doesn't seem that this is going to be needed since we aren't actually doing anything with current->mm. The interval tree notifier stuff still has a FIXME noted from a previous commit that will be addressed in a follow on patch. Cc: <stable@vger.kernel.org> Fixes: e0cf75deab81 ("IB/hfi1: Fix mm_struct use after free") Fixes: 3faa3d9a308e ("IB/hfi1: Make use of mm consistent") Link: https://lore.kernel.org/r/20201125210112.104301.51331.stgit@awfm-01.aw.intel.com Suggested-by: Jann Horn <jannh@google.com> Reported-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
4730f4a6 |
|
10-May-2020 |
Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> |
IB/hfi1: Activate the dummy netdev As described in earlier patches, ipoib netdev will share receive contexts with existing VNIC netdev through a dummy netdev. The following changes are made to achieve that: - Set up netdev receive contexts after user contexts. A function is added to count the available netdev receive contexts. - Add functions to set/get receive map table free index. - Rename NUM_VNIC_MAP_ENTRIES as NUM_NETDEV_MAP_ENTRIES. - Let the dummy netdev own the receive contexts instead of VNIC. - Allocate the dummy netdev when the hfi1 device is added and free it when the device is removed. - Initialize AIP RSM rules when the IpoIb rxq is initialized and remove the rules when it is de-initialized. - Convert VNIC to use the dummy netdev. Link: https://lore.kernel.org/r/20200511160649.173205.4626.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
0bae02d5 |
|
10-May-2020 |
Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> |
IB/hfi1: Add interrupt handler functions for accelerated ipoib This patch adds the interrupt handler function, the NAPI poll function, and its associated helper functions for receiving accelerated ipoib packets. While we are here, fix the formats of two error printouts. Link: https://lore.kernel.org/r/20200511160637.173205.64890.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6991abcb |
|
10-May-2020 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Add functions to receive accelerated ipoib packets Ipoib netdev will share receive contexts with existing VNIC netdev. To achieve that, a dummy netdev is allocated with hfi1_devdata to own the receive contexts, and ipoib and VNIC netdevs will be put on top of it. Each receive context is associated with a single NAPI object. This patch adds the functions to receive incoming packets for accelerated ipoib. Link: https://lore.kernel.org/r/20200511160631.173205.54184.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
89dcaa36 |
|
10-May-2020 |
Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> |
IB/hfi1: Rename num_vnic_contexts as num_netdev_contexts Rename num_vnic_contexts as num_ndetdev_contexts since VNIC and ipoib will share the same set of receive contexts. Link: https://lore.kernel.org/r/20200511160625.173205.53306.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
19d8b90a |
|
10-May-2020 |
Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> |
IB/hfi1: RSM rules for AIP This is implementation of RSM rule for AIP packets. AIP rule will use rule RSM2 and will match standard Infiniband packet containg BTH (LNH==BTH) and having Dest QPN prefixed with value 0x81. Spread between receive contexts will be done using source QPN bits. VNIC and AIP will share receive contexts, so their rules will point to the same RMT entries and their shared code is moved to separate functions. If any of the rules is active RMT mapping will be skipped for latter. Changed function hfi1_vnic_is_rsm_full to be more general and moved it from main header to chip.c. Changed the order of RSM rules because AIP rule as more specific one is needed to be placed before more general QOS rule. Rules are occupying two last RSM registers. Link: https://lore.kernel.org/r/20200511160612.173205.73002.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
84e3b19a |
|
10-May-2020 |
Gary Leshner <Gary.S.Leshner@intel.com> |
IB/hfi1: Remove module parameter for KDETH qpns The module parameter for KDETH qpns is being removed in favor of always using the default value of 0x80 as the qpn prefix. Defines have been added for various KDETH values including the prefix of 0x80. The reserved range now starts at the base value for KDETH qpns (0x80) and extends up to and including the last qpn for other reserved QP prefixed types. Adjust other QP prefixed define names to match KDETH defined names. Link: https://lore.kernel.org/r/20200511160600.173205.27508.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5ab17a24 |
|
16-Mar-2020 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Remove kobj from hfi1_devdata The field kobj was added to hfi1_devdata structure to manage the life time of the hfi1_devdata structure for PSM accesses: commit e11ffbd57520 ("IB/hfi1: Do not free hfi1 cdev parent structure early") Later another mechanism user_refcount/user_comp was introduced to provide the same functionality: commit acd7c8fe1493 ("IB/hfi1: Fix an Oops on pci device force remove") This patch will remove this kobj field, as it is no longer needed. Link: https://lore.kernel.org/r/20200316210500.7753.4145.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
be863834 |
|
10-Feb-2020 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Close window for pq and request coliding Cleaning up a pq can result in the following warning and panic: WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0 list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100) Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 Call Trace: [<ffffffff90365ac0>] dump_stack+0x19/0x1b [<ffffffff8fc98b78>] __warn+0xd8/0x100 [<ffffffff8fc98bff>] warn_slowpath_fmt+0x5f/0x80 [<ffffffff8ff970c3>] __list_del_entry+0x63/0xd0 [<ffffffff8ff9713d>] list_del+0xd/0x30 [<ffffffff8fddda70>] kmem_cache_destroy+0x50/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 PGD 2cdab19067 PUD 2f7bfdb067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000 RIP: 0010:[<ffffffff8fe1f93e>] [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP: 0018:ffff88b5393abd60 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003 RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800 RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19 R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000 R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000 FS: 00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: [<ffffffff8fe20d44>] __kmem_cache_shutdown+0x14/0x80 [<ffffffff8fddda78>] kmem_cache_destroy+0x58/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39 RIP [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP <ffff88b5393abd60> CR2: 0000000000000010 The panic is the result of slab entries being freed during the destruction of the pq slab. The code attempts to quiesce the pq, but looking for n_req == 0 doesn't account for new requests. Fix the issue by using SRCU to get a pq pointer and adjust the pq free logic to NULL the fd pq pointer prior to the quiesce. Fixes: e87473bc1b6c ("IB/hfi1: Only set fd pointer when base context is completely initialized") Link: https://lore.kernel.org/r/20200210131033.87408.81174.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5ffd0486 |
|
06-Jan-2020 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Add software counter for ctxt0 seq drop All other code paths increment some form of drop counter. This was missed in the original implementation. Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0") Link: https://lore.kernel.org/r/20200106134228.119356.96828.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d791d294 |
|
06-Jan-2020 |
Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> |
IB/hfi1: Return void in packet receiving functions Packet receiving functions returns int value, and yet the return values are not used at all. This patch converts the functions to return void. Link: https://lore.kernel.org/r/20200106134222.119356.84098.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
cd47b594 |
|
06-Jan-2020 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: IB/hfi1: Add an API to handle special case drop This patch pushes special case drop logic into an API to be shared by all interrupt handlers. Additionally, convert do_drop to a bool. Link: https://lore.kernel.org/r/20200106134203.119356.36962.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
01c7fc50 |
|
06-Jan-2020 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Add fast and slow handlers for receive context This patch eliminate special cases by adding a fast_handler member to the receive context and changes to the fast handler as specified in the new variable. Initialize the variable as soon as the setting for dma tail is known when the context is created. Setting fast path is called every time when any context has entered slow path. Add function to check if contexts is using fast path and do not set fast path when it is already done to improve RCD fastpath setting. Link: https://lore.kernel.org/r/20200106134150.119356.87558.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
2fb3b5ae |
|
19-Dec-2019 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Add accessor API routines to access context members This patch adds a set of accessor routines to access context members. Link: https://lore.kernel.org/r/20191219211922.58387.26548.stgit@awfm-01.aw.intel.com Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
3889551d |
|
12-Nov-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv This converts one of the two users of mmu_notifiers to use the new API. The conversion is fairly straightforward, however the existing use of notifiers here seems to be racey. Link: https://lore.kernel.org/r/20191112202231.3856-7-jgg@ziepe.ca Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
9755f724 |
|
13-Jun-2019 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Create inline to get extended headers This paves the way for another patch that reacts to a flush sdma completion for RC. Fixes: 81cd3891f021 ("IB/hfi1: Add support for 16B Management Packets") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
03b92789 |
|
08-Feb-2019 |
Matthew Wilcox <willy@infradead.org> |
hfi1: Convert hfi1_unit_table to XArray Also remove hfi1_devs_list. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
0ee3b915 |
|
20-Feb-2019 |
Matthew Wilcox <willy@infradead.org> |
hfi1: Convert vesw_idr to XArray Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
bc5add09 |
|
26-Feb-2019 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Close race condition on user context disable and close When disabling and removing a receive context, it is possible for an asynchronous event (i.e IRQ) to occur. Because of this, there is a race between cleaning up the context, and the context being used by the asynchronous event. cpu 0 (context cleanup) rc->ref_count-- (ref_count == 0) hfi1_rcd_free() cpu 1 (IRQ (with rcd index)) rcd_get_by_index() lock ref_count+++ <-- reference count race (WARNING) return rcd unlock cpu 0 hfi1_free_ctxtdata() <-- incorrect free location lock remove rcd from array unlock free rcd This race will cause the following WARNING trace: WARNING: CPU: 0 PID: 175027 at include/linux/kref.h:52 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1] CPU: 0 PID: 175027 Comm: IMB-MPI1 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015 Call Trace: dump_stack+0x19/0x1b __warn+0xd8/0x100 warn_slowpath_null+0x1d/0x20 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1] is_rcv_urgent_int+0x24/0x90 [hfi1] general_interrupt+0x1b6/0x210 [hfi1] __handle_irq_event_percpu+0x44/0x1c0 handle_irq_event_percpu+0x32/0x80 handle_irq_event+0x3c/0x60 handle_edge_irq+0x7f/0x150 handle_irq+0xe4/0x1a0 do_IRQ+0x4d/0xf0 common_interrupt+0x162/0x162 The race can also lead to a use after free which could be similar to: general protection fault: 0000 1 SMP CPU: 71 PID: 177147 Comm: IMB-MPI1 Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015 task: ffff9962a8098000 ti: ffff99717a508000 task.ti: ffff99717a508000 __kmalloc+0x94/0x230 Call Trace: ? hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1] hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1] hfi1_aio_write+0xba/0x110 [hfi1] do_sync_readv_writev+0x7b/0xd0 do_readv_writev+0xce/0x260 ? handle_mm_fault+0x39d/0x9b0 ? pick_next_task_fair+0x5f/0x1b0 ? sched_clock_cpu+0x85/0xc0 ? __schedule+0x13a/0x890 vfs_writev+0x35/0x60 SyS_writev+0x7f/0x110 system_call_fastpath+0x22/0x27 Use the appropriate kref API to verify access. Reorder context cleanup to ensure context removal before cleanup occurs correctly. Cc: stable@vger.kernel.org # v4.14.0+ Fixes: f683c80ca68e ("IB/hfi1: Resolve kernel panics by reference counting receive contexts") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
22d136d7 |
|
24-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Add TID RDMA handlers This commit adds the TID RDMA READ pointers to the receiving opcode handlers. It also adds TID RDMA READ header sizes to header size table. A function to print the RHF EFLAGS errors is created so that it can be shared by both IB and TID RDMA receiving functions. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
838b6fd2 |
|
23-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: TID RDMA RcvArray programming and TID allocation TID entries are used by hfi1 hardware to receive data payload from incoming packets directly into a user buffer and thus avoid data copying by software. This patch implements the functions for TID allocation, freeing, and programming TID RcvArray entries in hardware for kernel clients. TID entries are managed via lists of TID groups similar to PSM. Furthermore, to track TID resource allocation for each request, software flows are also allocated and freed as needed. Since software flows consume large amount of memory for tracking TID allocation and freeing, it is generally desirable to allocate them dynamically in the send queue and only for TID RDMA requests, but pre-allocate them for receive queue because the send queue could have thousands of entries while the receive queue has only a limited number of entries. Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
37356e78 |
|
05-Feb-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: TID RDMA flow allocation The hfi1 hardware flow is a hardware flow-control mechanism for a KDETH data packet that is received on a hfi1 port. It validates the packet by checking both the generation and sequence. Each QP that uses the TID RDMA mechanism will allocate a hardware flow from its receiving context for any incoming KDETH data packets. This patch implements: (1) a function to allocate hardware flow (2) a function to free hardware flow (3) a function to initialize hardware flow generation for a receiving context (4) a wait mechanism if the hardware flow is not available (4) a function to remove the qp from the wait queue for hardware flow when the qp is reset or destroyed. Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f01b4d5a |
|
23-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: OPFN interface OPFN allows a pair of connected RC QPs to exchange a set of parameters in succession. The parameter exchange itself is done using the IB compare and swap request with a special virtual address. The request is triggered using a reserved IB work request opcode. This patch implements the OPFN interface to initialize, start, process, and reset the OPFN request. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
44e43d91 |
|
24-Jan-2019 |
Mitko Haralanov <mitko.haralanov@intel.com> |
IB/hfi1: OPFN support discovery OPFN (Omni Path Feature Negotiation) support discovery allows a RC QP to announce that it supports OPFN and also discover if OPFN is supported by the peer QP. OPFN parameter negotiation is skipped unless OPFN support is first discovered. OPFN support is announced by claiming what was the reserved bit in dword 1 of OmniPath modified base transport header in requests and responses. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
fe4dd423 |
|
28-Nov-2018 |
Mitko Haralanov <mitko.haralanov@intel.com> |
IB/hfi1: Correctly process FECN and BECN in packets A CA is supposed to ignore FECN bits in multicast, ACK, and CNP packets. This patch corrects the behavior of the HFI1 driver in this regard by ignoring FECNs in those packet types. While fixing the above behavior, fix the extraction of the FECN and BECN bits from the packet headers for both 9B and 16B packets. Furthermore, this patch corrects the driver's response to a FECN in RDMA READ RESPONSE packets. Instead of sending an "empty" ACK, the driver now sends a CNP packet. While editing that code path, add the missing trace for CNP packets. Fixes: 88733e3b8450 ("IB/hfi1: Add 16B UD support") Fixes: f59fb9e05109 ("IB/hfi1: Fix handling of FECN marked multicast packet") Reviewed-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
36d84219 |
|
28-Nov-2018 |
Piotr Stankiewicz <piotr.stankiewicz@intel.com> |
IB/hfi1: Fix an out-of-bounds access in get_hw_stats When running with KASAN, the following trace is produced: [ 62.535888] ================================================================== [ 62.544930] BUG: KASAN: slab-out-of-bounds in gut_hw_stats+0x122/0x230 [hfi1] [ 62.553856] Write of size 8 at addr ffff88080e8d6330 by task kworker/0:1/14 [ 62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted 4.19.0-test-build-kasan+ #8 [ 62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS SE5C610.86B.01.01.0019.101220160604 10/12/2016 [ 62.587951] Workqueue: events work_for_cpu_fn [ 62.594050] Call Trace: [ 62.598023] dump_stack+0xc6/0x14c [ 62.603089] ? dump_stack_print_info.cold.1+0x2f/0x2f [ 62.610041] ? kmsg_dump_rewind_nolock+0x59/0x59 [ 62.616615] ? get_hw_stats+0x122/0x230 [hfi1] [ 62.622985] print_address_description+0x6c/0x23c [ 62.629744] ? get_hw_stats+0x122/0x230 [hfi1] [ 62.636108] kasan_report.cold.6+0x241/0x308 [ 62.642365] get_hw_stats+0x122/0x230 [hfi1] [ 62.648703] ? hfi1_alloc_rn+0x40/0x40 [hfi1] [ 62.655088] ? __kmalloc+0x110/0x240 [ 62.660695] ? hfi1_alloc_rn+0x40/0x40 [hfi1] [ 62.667142] setup_hw_stats+0xd8/0x430 [ib_core] [ 62.673972] ? show_hfi+0x50/0x50 [hfi1] [ 62.680026] ib_device_register_sysfs+0x165/0x180 [ib_core] [ 62.687995] ib_register_device+0x5a2/0xa10 [ib_core] [ 62.695340] ? show_hfi+0x50/0x50 [hfi1] [ 62.701421] ? ib_unregister_device+0x2e0/0x2e0 [ib_core] [ 62.709222] ? __vmalloc_node_range+0x2d0/0x380 [ 62.716131] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt] [ 62.723735] ? vmalloc_node+0x5c/0x70 [ 62.729697] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt] [ 62.737347] ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt] [ 62.744998] ? __rvt_alloc_mr+0x110/0x110 [rdmavt] [ 62.752315] ? rvt_rc_error+0x140/0x140 [rdmavt] [ 62.759434] ? rvt_vma_open+0x30/0x30 [rdmavt] [ 62.766364] ? mutex_unlock+0x1d/0x40 [ 62.772445] ? kmem_cache_create_usercopy+0x15d/0x230 [ 62.780115] rvt_register_device+0x1f6/0x360 [rdmavt] [ 62.787823] ? rvt_get_port_immutable+0x180/0x180 [rdmavt] [ 62.796058] ? __get_txreq+0x400/0x400 [hfi1] [ 62.802969] ? memcpy+0x34/0x50 [ 62.808611] hfi1_register_ib_device+0xde6/0xeb0 [hfi1] [ 62.816601] ? hfi1_get_npkeys+0x10/0x10 [hfi1] [ 62.823760] ? hfi1_init+0x89f/0x9a0 [hfi1] [ 62.830469] ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1] [ 62.838204] ? pcie_capability_clear_and_set_word+0xcd/0xe0 [ 62.846429] ? pcie_capability_read_word+0xd0/0xd0 [ 62.853791] ? hfi1_pcie_init+0x187/0x4b0 [hfi1] [ 62.860958] init_one+0x67f/0xae0 [hfi1] [ 62.867301] ? hfi1_init+0x9a0/0x9a0 [hfi1] [ 62.873876] ? wait_woken+0x130/0x130 [ 62.879860] ? read_word_at_a_time+0xe/0x20 [ 62.886329] ? strscpy+0x14b/0x280 [ 62.891998] ? hfi1_init+0x9a0/0x9a0 [hfi1] [ 62.898405] local_pci_probe+0x70/0xd0 [ 62.904295] ? pci_device_shutdown+0x90/0x90 [ 62.910833] work_for_cpu_fn+0x29/0x40 [ 62.916750] process_one_work+0x584/0x960 [ 62.922974] ? rcu_work_rcufn+0x40/0x40 [ 62.928991] ? __schedule+0x396/0xdc0 [ 62.934806] ? __sched_text_start+0x8/0x8 [ 62.941020] ? pick_next_task_fair+0x68b/0xc60 [ 62.947674] ? run_rebalance_domains+0x260/0x260 [ 62.954471] ? __list_add_valid+0x29/0xa0 [ 62.960607] ? move_linked_works+0x1c7/0x230 [ 62.967077] ? trace_event_raw_event_workqueue_execute_start+0x140/0x140 [ 62.976248] ? mutex_lock+0xa6/0x100 [ 62.982029] ? __mutex_lock_slowpath+0x10/0x10 [ 62.988795] ? __switch_to+0x37a/0x710 [ 62.994731] worker_thread+0x62e/0x9d0 [ 63.000602] ? max_active_store+0xf0/0xf0 [ 63.006828] ? __switch_to_asm+0x40/0x70 [ 63.012932] ? __switch_to_asm+0x34/0x70 [ 63.019013] ? __switch_to_asm+0x40/0x70 [ 63.025042] ? __switch_to_asm+0x34/0x70 [ 63.031030] ? __switch_to_asm+0x40/0x70 [ 63.037006] ? __schedule+0x396/0xdc0 [ 63.042660] ? kmem_cache_alloc_trace+0xf3/0x1f0 [ 63.049323] ? kthread+0x59/0x1d0 [ 63.054594] ? ret_from_fork+0x35/0x40 [ 63.060257] ? __sched_text_start+0x8/0x8 [ 63.066212] ? schedule+0xcf/0x250 [ 63.071529] ? __wake_up_common+0x110/0x350 [ 63.077794] ? __schedule+0xdc0/0xdc0 [ 63.083348] ? wait_woken+0x130/0x130 [ 63.088963] ? finish_task_switch+0x1f1/0x520 [ 63.095258] ? kasan_unpoison_shadow+0x30/0x40 [ 63.101792] ? __init_waitqueue_head+0xa0/0xd0 [ 63.108183] ? replenish_dl_entity.cold.60+0x18/0x18 [ 63.115151] ? _raw_spin_lock_irqsave+0x25/0x50 [ 63.121754] ? max_active_store+0xf0/0xf0 [ 63.127753] kthread+0x1ae/0x1d0 [ 63.132894] ? kthread_bind+0x30/0x30 [ 63.138422] ret_from_fork+0x35/0x40 [ 63.146973] Allocated by task 14: [ 63.152077] kasan_kmalloc+0xbf/0xe0 [ 63.157471] __kmalloc+0x110/0x240 [ 63.162804] init_cntrs+0x34d/0xdf0 [hfi1] [ 63.168883] hfi1_init_dd+0x29a3/0x2f90 [hfi1] [ 63.175244] init_one+0x551/0xae0 [hfi1] [ 63.181065] local_pci_probe+0x70/0xd0 [ 63.186759] work_for_cpu_fn+0x29/0x40 [ 63.192310] process_one_work+0x584/0x960 [ 63.198163] worker_thread+0x62e/0x9d0 [ 63.203843] kthread+0x1ae/0x1d0 [ 63.208874] ret_from_fork+0x35/0x40 [ 63.217203] Freed by task 1: [ 63.221844] __kasan_slab_free+0x12e/0x180 [ 63.227844] kfree+0x92/0x1a0 [ 63.232570] single_release+0x3a/0x60 [ 63.238024] __fput+0x1d9/0x480 [ 63.242911] task_work_run+0x139/0x190 [ 63.248440] exit_to_usermode_loop+0x191/0x1a0 [ 63.254814] do_syscall_64+0x301/0x330 [ 63.260283] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 63.270199] The buggy address belongs to the object at ffff88080e8d5500 which belongs to the cache kmalloc-4096 of size 4096 [ 63.287247] The buggy address is located 3632 bytes inside of 4096-byte region [ffff88080e8d5500, ffff88080e8d6500) [ 63.303564] The buggy address belongs to the page: [ 63.310447] page:ffffea00203a3400 count:1 mapcount:0 mapping:ffff88081380e840 index:0x0 compound_mapcount: 0 [ 63.323102] flags: 0x2fffff80008100(slab|head) [ 63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001 ffff88081380e840 [ 63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000 [ 63.350564] page dumped because: kasan: bad access detected [ 63.361974] Memory state around the buggy address: [ 63.369137] ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 63.379082] ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 63.389032] >ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc [ 63.398944] ^ [ 63.406141] ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 63.416109] ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 63.426099] ================================================================== The trace happens because get_hw_stats() assumes there is room in the memory allocated in init_cntrs() to accommodate the driver counters. Unfortunately, that routine only allocated space for the device counters. Fix by insuring the allocation has room for the additional driver counters. Cc: <Stable@vger.kernel.org> # v4.14+ Fixes: b7481944b06e9 ("IB/hfi1: Show statistics counters under IB stats interface") Reviewed-by: Mike Marciniczyn <mike.marciniszyn@intel.com> Reviewed-by: Mike Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
508a523f |
|
11-Oct-2018 |
Parav Pandit <parav@mellanox.com> |
RDMA/drivers: Use core provided API for registering device attributes Use rdma_set_device_sysfs_group() to register device attributes and simplify the driver. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
dc9f5d0f |
|
16-Aug-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Move URGENT IRQ enable to hfi1_rcvctrl() User contexts use the receive URGENT interrupt. However, enabling the IRQ SRC in the file_ops module is not as clean as it could be. Augment the _rcvctl() function to be able to enable/disable the IRQ source. Use the new interface from file_ops to enable/disable the IRQ. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a2f7bbdc |
|
16-Aug-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Rework the IRQ API to be more flexible The current IRQ API is an all or nothing interface. This has two problems: 1. All IRQs are enabled regardless of use 2. Moving from general interrupt to MSIx handling is difficult Introduce a new API to enable/disable specific IRQs or a range of IRQs. Do not enable and disable all IRQs in one step. Rework various modules to enable/disable IRQs when needed. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6eb4eb10 |
|
16-Aug-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Make the MSIx resource allocation a bit more flexible The current method of allocating MSIx resources is a bit cumbersome, and not very easily added to. Refactor and re-order the code paths into a more consistent interface. Update the interface so that allocations are not order dependent. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
09e71899 |
|
16-Aug-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Prepare for new HFI1 MSIx API The current HFI1 MSIx API is difficult to follow, change, or add to. In anticipation of moving to an more flexible API, move the current MSIx functionality to the new msix.c module. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
57f97e96 |
|
16-Aug-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Get the hfi1_devdata structure as early as possible Currently several things occur before the hfi1_devdata structure is allocated. This leads to an inconsistent logging ability and makes it more difficult to restructure some code paths. Allocate (and do a minimal init) the structure as soon as possible. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6a516bc9 |
|
15-Aug-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: tune_pcie_caps is arbitrarily placed, poorly The tune_pcie_caps needs to occur sometime after PCI is enabled, but before the HFI is enabled. Currently it is placed in the MSIx allocation code which doesn't really fit. Moving it to just after the gen3 bump. Clean up the associated code (modules, etc.). Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
70324739 |
|
20-Jun-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Remove INTx support and simplify MSIx usage The INTx IRQ support does not work for all HF1 IRQ handlers (specifically the receive data IRQs). Remove all supporting code for the INTx IRQ. If the requested MSIx vector request is unsuccessful, do not allow the driver to continue. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
071e4fec |
|
20-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Reorg ctxtdata and rightsize fields Many fields in ctxtdata are incorrectly sized and the organization of the fields within the structure is a jumble. Fix by: - Correcting oversize fields. - Putting fields common to all contexts at the top with hot fields at the top. - Moving PSM fields to the bottom of the structure. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
06e81e3e |
|
20-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Remove caches of chip CSRs Remove the sizeable cache of the chip sizing CSRs and replace with CSR reads as needed. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
15d063d5 |
|
20-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Remove unused/writeonly devdata fields Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4b0b76bd |
|
20-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Rightsize ctxt_eager_bufs fields Fields in this structure are sized excessively based on hardware limitations and input values. Fix by reducing fields as appropriate and repositioning to close holes in the structure. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b67bbc59 |
|
20-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Remove rcvctrl from ctxtdata It is only ever written. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b2578431 |
|
20-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Remove rcvhdrq_size The usage of this ctxt data field is not hot path and the value can be computed on demand to cut down the ctxtdata bloat. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
32e3d970 |
|
04-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Remove rcvhdrsize The field is based on a constant that can never change. Use the define to assign the register instead. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
40442b30 |
|
04-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Move rhf_offset from devdata to ctxtdata This field should be in ctxtdata to allow for better locality of access by eliminating a dd dereference. The new field is now side-by-side with rcvhdrqentsize since the rhf_offset is a function of the rcvhdrqentsize. Both fields are now correctly sized as u8. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b0ba3c18 |
|
04-Jun-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Move normal functions from hfi1_devdata to const array The current implementation precludes having receive context specific packet type receive handlers. Fix this by adding adding c99 const array for the existing handlers and remove the current 72 bytes of pointers from devdata. A new pointer in hfi1_ctxtdata will point to the const array. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ed71e86a |
|
04-Jun-2018 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Rename exp_lock to exp_mutex The mutex exp_lock in struct hfi1_ctxtdata is used to protect all Expected TID data of a user context. This patch renames it to exp_mutex to better reflect its identity and prepare for upcoming patches. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c8314811 |
|
15-May-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Cleanup of exp_rcv The knowledge of the internal workings of the expect receive is too distributed. Fix by: - right size several rcd fields associated with expect receive - making an init entrance to init all the lists - consolidate all the allocations into an array anchored in the rcd Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
81cd3891 |
|
15-May-2018 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Add support for 16B Management Packets 16B Management Packets (L4=0x08) replace the BTH and DETH of normal MAD packet packets with a header containing the the source and destination queue pair numbers; fields that were originally retrieved from the BTH/DETH are now populated from this header as well as from the 16B LRH (e.g. pkey). 16B Management Packets are used as an optimized management format on 16B fabrics. These management packets have an opcode of IB_OPCODE_UD_SEND_ONLY, a fixed 3Byte pad, and a header length of 24Bytes. The decision as to when we send a management packet is based upon either the source or destination queue pair number being 0 or 1. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
4171a693 |
|
15-May-2018 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Define 16B Management Packets Add 16B Management Packet definition. This optimized packet format replaces the ib_other_headers and BTH with a source and destination QP number. To support these packets we introduce struct opa_16b_mgmt into the struct hfi1_16b_header. This packet format is only used for MAD packets using the IB_OPCODE_UD_SEND_ONLY opcode on QP0/1. The original 16B implementation failed to use 16B management packets so now we add their definition. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5d18ee67 |
|
02-May-2018 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support Currently the driver doesn't support completion vectors. These are used to indicate which sets of CQs should be grouped together into the same vector. A vector is a CQ processing thread that runs on a specific CPU. If an application has several CQs bound to different completion vectors, and each completion vector runs on different CPUs, then the completion queue workload is balanced. This helps scale as more nodes are used. Implement CQ completion vector support using a global workqueue where a CQ entry is queued to the CPU corresponding to the CQ's completion vector. Since the workqueue is global, it's guaranteed to always be there when queueing CQ entries; Therefore, the RCU locking for cq->rdi->worker in the hot path is superfluous. Each completion vector is assigned to a different CPU. The number of completion vectors available is computed by taking the number of online, physical CPUs from the local NUMA node and subtracting the CPUs used for kernel receive queues and the general interrupt. Special use cases: * If there are no CPUs left for completion vectors, the same CPU for the general interrupt is used; Therefore, there would only be one completion vector available. * For multi-HFI systems, the number of completion vectors available for each device is the total number of completion vectors in the local NUMA node divided by the number of devices in the same NUMA node. If there's a division remainder, the first device to get initialized gets an extra completion vector. Upon a CQ creation, an invalid completion vector could be specified. Handle it as follows: * If the completion vector is less than 0, set it to 0. * Set the completion vector to the result of the passed completion vector moded with the number of device completion vectors available. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a74d5307 |
|
02-May-2018 |
Mitko Haralanov <mitko.haralanov@intel.com> |
IB/hfi1: Rework fault injection machinery The packet fault injection code present in the HFI1 driver had some issues which not only fragment the code but also created user confusion. Furthermore, it suffered from the following issues: 1. The fault_packet method only worked for received packets. This meant that the only fault injection mode available for sent packets is fault_opcode, which did not allow for random packet drops on all egressing packets. 2. The mask available for the fault_opcode mode did not really work due to the fact that the opcode values are not bits in a bitmask but rather sequential integer values. Creating a opcode/mask pair that would successfully capture a set of packets was nearly impossible. 3. The code was fragmented and used too many debugfs entries to operate and control. This was confusing to users. 4. It did not allow filtering fault injection on a per direction basis - egress vs. ingress. In order to improve or fix the above issues, the following changes have been made: 1. The fault injection methods have been combined into a single fault injection facility. As such, the fault injection has been plugged into both the send and receive code paths. Regardless of method used the fault injection will operate on both egress and ingress packets. 2. The type of fault injection - by packet or by opcode - is now controlled by changing the boolean value of the file "opcode_mode". When the value is set to True, fault injection is done by opcode. Otherwise, by packet. 2. The masking ability has been removed in favor of a bitmap that holds opcodes of interest (one bit per opcode, a total of 256 bits). This works in tandem with the "opcode_mode" value. When the value of "opcode_mode" is False, this bitmap is ignored. When the value is True, the bitmap lists all opcodes to be considered for fault injection. By default, the bitmap is empty. When the user wants to filter by opcode, the user sets the corresponding bit in the bitmap by echo'ing the bit position into the 'opcodes' file. This gets around the issue that the set of opcodes does not lend itself to effective masks and allow for extremely fine-grained filtering by opcode. 4. fault_packet and fault_opcode methods have been combined. Hence, there is only one debugfs directory controlling the entire operation of the fault injection machinery. This reduces the number of debugfs entries and provides a more unified user experience. 5. A new control files - "direction" - is provided to allow the user to control the direction of packets, which are subject to fault injection. 6. A new control file - "skip_usec" - is added that would allow the user to specify a "timeout" during which no fault injection will occur. In addition, the following bug fixes have been applied: 1. The fault injection code has been split into its own header and source files. This was done to better organize the code and support conditional compilation without littering the code with #ifdef's. 2. The method by which the TX PIO packets were being marked for drop conflicted with the way send contexts were being setup. As a result, the send context was repeatedly being reset. 3. The fault injection only makes sense when the user can control it through the debugfs entries. However, a kernel configuration can enable fault injection but keep fault injection debugfs entries disabled. Therefore, it makes sense that the HFI fault injection code depends on both. 4. Error suppression did not take into account the method by which PIO packets were being dropped. Therefore, even with error suppression turned on, errors would still be displayed to the screen. A larger enough packet drop percentage would case the kernel to crash because the driver would be stuck printing errors. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8d3e7113 |
|
02-May-2018 |
Alex Estrin <alex.estrin@intel.com> |
IB/{hfi1, qib}: Add handling of kernel restart A warm restart will fail to unload the driver, leaving link state potentially flapping up to the point the BIOS resets the adapter. Correct the issue by hooking the shutdown pci method, which will bring port down. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f59fb9e0 |
|
01-May-2018 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Fix handling of FECN marked multicast packet The code for handling a marked UD packet unconditionally returns the dlid in the header of the FECN marked packet. This is not correct for multicast packets where the DLID is in the multicast range. The subsequent attempt to send the CNP with the multicast lid will cause the chip to halt the ack send context because the source lid doesn't match the chip programming. The send context will be halted and flush any other pending packets in the pio ring causing the CNP to not be sent. A part of investigating the fix, it was determined that the 16B work broke the FECN routine badly with inconsistent use of 16 bit and 32 bits types for lids and pkeys. Since the port's source lid was correctly 32 bits the type mixmatches need to be dealt with at the same time as fixing the CNP header issue. Fix these issues by: - Using the ports lid for as the SLID for responding to FECN marked UD packets - Insure pkey is always 16 bit in this and subordinate routines - Insure lids are 32 bits in this and subordinate routines Cc: <stable@vger.kernel.org> # 4.14.x Fixes: 88733e3b8450 ("IB/hfi1: Add 16B UD support") Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8a18e911 |
|
11-Mar-2018 |
Zhu Yanjun <yanjun.zhu@oracle.com> |
IB: remove duplicate header files In hfi.h, the header file opa_addr.h is included twice. In vt.h, the header file mmap.h is included twice. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
07190076 |
|
01-Feb-2018 |
Kamenee Arumugam <kamenee.arumugam@intel.com> |
IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times HFI's counters SendWaitCnt and SendWaitVlCnt are in units of TXE cycle time (at 805MHz). OPA counters PortXmitWait and PortVLXmtWait are in units of flit times. Convert the counter values to flit units using following conversion formula: PortXmitWait = SendWaitCnt * 2 * (4 /link_width) * (25 Gbps /link_speed) PortVLXmitWait = SendWaitVLCnt * 2 * (4 /link_width) * (25 Gbps /link_speed) At link up or downgrade events, the link width can change. To ensure accurate counter calculations, sample the counters after the events, during counter requests, and then aggregate the OPA counters. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ca85bb1c |
|
01-Feb-2018 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Remove unnecessary fecn and becn fields packet->fecn and packet->becn are calculated in the hot path and are never used. Remove these fields as they show to be costly in a profile. Also, remove initialization for becn and fecn in process_ecn() as they're unconditionally assigned in the function and ensure fecn and becn variables use a boolean type. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6d6b8848 |
|
01-Feb-2018 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Optimize packet type comparison using 9B and bypass code paths The packet type comparison used to find out if a packet is a bypass packet in the hot path is an expensive operation as seen in a profile. Determine packet's pkey and migration bit through the bypass and 9B code paths instead. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
82a97926 |
|
01-Feb-2018 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Re-order IRQ cleanup to address driver cleanup race The pci_request_irq() interfaces always adds the IRQF_SHARED bit to all IRQ requests. When the kernel is built with CONFIG_DEBUG_SHIRQ config flag, if the IRQF_SHARED bit is set, a call to the IRQ handler is made from the __free_irq() function. This is testing a race condition between the IRQ cleanup and an IRQ racing the cleanup. The HFI driver should be able to handle this race, but does not. This race can cause traces that start with this footprint: BUG: unable to handle kernel NULL pointer dereference at (null) Call Trace: <hfi1 irq handler> ... __free_irq+0x1b3/0x2d0 free_irq+0x35/0x70 pci_free_irq+0x1c/0x30 clean_up_interrupts+0x53/0xf0 [hfi1] hfi1_start_cleanup+0x122/0x190 [hfi1] postinit_cleanup+0x1d/0x280 [hfi1] remove_one+0x233/0x250 [hfi1] pci_device_remove+0x39/0xc0 Export IRQ cleanup function so it can be called from other modules. Using the exported cleanup function: Re-order the driver cleanup code to clean up IRQ resources before other resources, eliminating the race. Re-order error path for init so that the race does not occur. Reduce severity on spurious error message for SDMA IRQs to info. Reviewed-by: Alex Estrin <alex.estrin@intel.com> Reviewed-by: Patel Jay P <jay.p.patel@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
11f0e897 |
|
18-Dec-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/{hfi1, qib}: Fix a concurrency issue with device name in logging The get_unit_name() function crafts a string based on the device name and the device unit number. It then stores this in a static variable. This has concurrency issues as can be seen with this log: hfi1 0000:02:00.0: hfi1_1: read_idle_message: read idle message 0x203 hfi1 0000:01:00.0: hfi1_1: read_idle_message: read idle message 0x203 The PCI device ID (0000:02:00.0 vs. 0000:01:00.0) is correct for the message, but the device string hfi1_1 is incorrect (it should be hfi1_0 for the second log message). Remove get_unit_name() function. Instead, use the rvt accessor rvt_get_ibdev_name() to get the IB name string. Clean up any hfi1_early_xx calls that can now use the new path. QIB has the same (qib_get_unit_name()) issue. Updating as necessary. Remove qib_get_unit_name() function. Update log message that has redundant device name. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
06f2597f |
|
18-Dec-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/{rdmavt, hfi1, qib}: Remove get_card_name() downcall rdmavt has a down call to client drivers to retrieve a crafted card name. This name should be the IB defined name. Rather than craft the name each time it is needed, simply retrieve the IB allocated name from the IB device. Update the function name to reflect its application. Clean up driver code to match this change. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2e903b61 |
|
22-Dec-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Change slid arg in ingress_pkey_table_fail to 32bit Change the slid arg to ingress_pkey_table_fail() to a full 32Bits and do not convert to 16Bits in caller. This is so we can keep everything 32bit in the kernel and only change to 16bit at the uapi boundary. Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4c009af4 |
|
22-Dec-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi: Only read capability registers if the capability exists During driver init, various registers are saved to allow restoration after an FLR or gen3 bump. Some of these registers are not available in some circumstances (i.e. Virtual machines). This bug makes the driver unusable when the PCI device is passed into a VM, it fails during probe. Delete unnecessary register read/write, and only access register if the capability exists. Cc: <stable@vger.kernel.org> # 4.14.x Fixes: a618b7e40af2 ("IB/hfi1: Move saving PCI values to a separate function") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
641f348b |
|
06-Nov-2017 |
Jan Sokolowski <jan.sokolowski@intel.com> |
IB/hfi1: Allow MgmtAllowed on B2B setups HFI's are hard-wired to send Device Info frames with MgmtAllowed bit set to 0. This means in B2B setups, MgmtAllowed would never be allowed, which prevents remote opa management tools from working properly. Assume MgmtAllowed if a neighbor is also an HFI. Fixes: 98b9ee2002a8 ("IB/hfi1: Cache neighbor secure data after link up") Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
1b311f89 |
|
23-Oct-2017 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Add tx_opcode_stats like the opcode_stats This patch adds tx_opcode_stats to parallel the (rx)opcode_stats in the debugfs. Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e2fdbc23 |
|
11-Oct-2017 |
Bart Van Assche <bvanassche@acm.org> |
IB/hfi1: Define hfi1_handle_cnp_tbl[] once Move the hfi1_handle_cnp_tbl[] from a header file to a .c file such that only one copy ends up in the hfi1 kernel module. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d7d62617 |
|
02-Oct-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Fix incorrect available receive user context count The addition of the VNIC contexts to num_rcv_contexts changes the meaning of the sysfs value nctxts from available user contexts, to user contexts + reserved VNIC contexts. User applications that use nctxts are now broken. Update the calculation so that VNIC contexts are used only if there are hardware contexts available, and do not silently affect nctxts. Update code to use the calculated VNIC context number. Update the sysfs value nctxts to be available user contexts only. Fixes: 2280740f01ae ("IB/hfi1: Virtual Network Interface Controller (VNIC) HW support") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Niranjana Vishwanathapura <Niranjana.Vishwanathapura@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: <Stable@vger.kernel.org> #v4.12+ Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e08aa594 |
|
02-Oct-2017 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Fix output trace issues from 16B change The 16B changes to the output side of the header trace introduced two issues: 1. An uninitialized field "l4" for 9B packets This field needs to be given a value of 0 for 9B packets to insure a correct 9B trace. The fix adds a new define to insure that there is a dummy default for 9B packets to insure the correct string is decoded. 2. Use of entry vs. __entry in field references Fixes: Commit 863cf89d472f ("IB/hfi1: Add 16B trace support") Reported-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d59075ad |
|
26-Sep-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Add a safe wrapper for _rcd_get_by_index hfi1_rcd_get_by_index assumes that the given index is in the correct range. In most cases this is correct because the index is bounded by a loop. For these cases, adding a range check to the function is redundant. For the use case that is not bounded by the loop range, a _safe wrapper function is needed to validate the index before accessing the rcd array. Add a _safe wrapper to _get_by_index to validate the index range. Update appropriate call sites with the new _safe function. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6fab2a88 |
|
26-Sep-2017 |
Jan Sokolowski <jan.sokolowski@intel.com> |
IB/hfi1: Remove unused hfi1_cpulist variables Following variables: hfi1_cpulist and hfi1_cpulist_count are unused. Remove them. Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
21e5acc0 |
|
26-Sep-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Inline common calculation Calculating the offset to a context is done several times throughout the code. Create a common inlined function for doing this calculation. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
156d24d7 |
|
26-Sep-2017 |
Ira Weiny <ira.weiny@intel.com> |
IB/hfi1: Remove unused link_default variable devdata->link_default is no longer variable Maintain number of holes by moving dc_shutdown Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
05cb18fd |
|
26-Sep-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Update HFI to use the latest PCI API The HFI PCI IRQ code uses an obsolete PCI API. Update the code to use the new PCI IRQ API and any necessary changes because of the new API. Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
de42de80 |
|
21-Aug-2017 |
Grzegorz Morys <grzegorz.morys@intel.com> |
IB/hfi1: Ratelimit prints from sdma_interrupt Ratelimit error prints from sdma_interrupt function that could swarm dmesg otherwise. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Grzegorz Morys <grzegorz.morys@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
bf808b50 |
|
13-Aug-2017 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Add kernel receive context info to debugfs Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d392a673 |
|
13-Aug-2017 |
Jakub Byczkowski <jakub.byczkowski@intel.com> |
IB/hfi1: Remove pstate from hfi1_pportdata Do not track physical state separately from host_link_state. Deduce physical state from host_link_state when required. Change cache_physical_state to log_physical_state to make sure host_link_state reflects hardwares physical state properly. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
91618604 |
|
13-Aug-2017 |
Jakub Byczkowski <jakub.byczkowski@intel.com> |
IB/hfi1: Add flag for platform config scratch register read Add flag in pport data structure to determine when platform config was read from scratch registers. Change conditions in parse_platform_config and get_platform_config_field to use the new flag. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
566d53a8 |
|
04-Aug-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Enhance PIO/SDMA send for 16B PIO/SDMA send logic now uses the hdr_type field to determine the type of packet that has been constructed. Based on the hdr_type, certain things such as PBC flags, padding count and the LT extra trailing bytes are determined. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5b6cabb0 |
|
04-Aug-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Add 16B RC/UC support Add 16B bypass packet support for RC/UC traffic types. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
51e658f5 |
|
04-Aug-2017 |
Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> |
IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids Increase lid used in hfi1 driver to 32 bits. qib continues to use 16 bit lids. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
863cf89d |
|
04-Aug-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Add 16B trace support Add trace support to 16B bypass packets during send and receive. Sample input header trace: <idle>-0 [000] d.h. 271742.509477: input_ibhdr: [0000:05:00.0] (16B) len:24 sc:0 dlid:0xf0000b slid:0x10002 age:0 becn:0 fecn:0 l4:10 rc:0 sc:0 pkey:0x8001 entropy:0x0000 op:0x65,UD_SEND_ONLY_WITH_IMMEDIATE se:0 m:1 pad:3 tver:0 qpn:0xffffff a:0 psn:0x00000001 hlen:248 deth qkey 0x01234567 sqpn 0x000004 Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
88733e3b |
|
04-Aug-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Add 16B UD support Add 16B bypass packet support for UD traffic types. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d98bb7f7 |
|
04-Aug-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Determine 9B/16B L2 header type based on Address handle When address handle attributes are initialized, the LIDs are transformed to be in the 32 bit LID space. When constructing the header, hfi1 driver will look at the LID to determine the packet header to be created. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5786adf3 |
|
04-Aug-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Add support to process 16B header errors Enhance hdr_rcverr() to also handle errors during 16B bypass packet receive. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
72c07e2b |
|
04-Aug-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Add support to receive 16B bypass packets We introduce a struct hfi1_16b_header to support 16B headers. 16B bypass packets are received by the driver and processed similar to 9B packets. Add basic support to handle 16B packets. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d295dbeb |
|
04-Aug-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hf1: User context locking is inconsistent There is a mixture of mutex and spinlocks to protect receive context (rcd/uctxt) information. This is not used consistently. Use the mutex to protect device receive context information only. Use the spinlock to protect sub context information only. Protect access to items in the rcd array with a spinlock and reference count. Remove spinlock around dd->rcd array cleanup. Since interrupts are disabled and cleaned up before this point, this lock is not useful. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f2a3bc00 |
|
04-Aug-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Protect context array set/clear with spinlock The rcd array can be accessed from user context or during interrupts. Protecting this with a mutex isn't a good idea because the mutex should not be used from an IRQ. Protect the allocation and freeing of rcd array elements with a spinlock. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
64a296f5 |
|
04-Aug-2017 |
Bartlomiej Dudek <bartlomiej.dudek@intel.com> |
IB/hfi1: Use host_link_state to read state when DC is shut down When DC is shut down (by e.g. disconnecting the cable), the driver should use host_link_state to get port's current physical state. This is due to the fact that physical state is read from DC's CSRs and when DC is shut down and state is changed, its registers are not impacted. Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
02a222c7 |
|
04-Aug-2017 |
Byczkowski, Jakub <jakub.byczkowski@intel.com> |
IB/hfi1: Remove lstate from hfi1_pportdata Do not track logical state separately from host_link_state. Deduce logical state from host_link_state when required. Transitions in set_link_state and goto_offline already make sure host_link_state reflects hardware's logical state properly. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
626c077c |
|
29-Jul-2017 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Prevent link down request double queuing When link interrupts occur, multiple link down requests could be queued up when only one is needed. This could get the hfi1 out of sync with its link partner during LNI. Only allow one link down request to be queued at any one time. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
71d47008 |
|
29-Jul-2017 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Create workqueue for link events Currently, link down interrupts queue link entries on a workqueue intended for sending events only. Create a workqueue for queuing link events. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a618b7e4 |
|
24-Jul-2017 |
Bartlomiej Dudek <bartlomiej.dudek@intel.com> |
IB/hfi1: Move saving PCI values to a separate function During PCIe initialization some registers' values from PCI config space are saved in order to restore them later (i.e. after reset). Restoring those value is done by a function called restore_pci_variables, while saving them is put directly into function hfi1_pcie_ddinit. Move saving values to a separate function in the image of restoring functionality. Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e6f7622d |
|
24-Jul-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Size rcd array index correctly and consistently The array index for the rcd array is sized several different ways throughout the code. Use the user interface size (u16) as the standard size and update the necessary code to reflect this. u16 is large enough for the largest amount of supported contexts. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
91d970ab |
|
24-Jul-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Remove unused user context data members Several data members of the user context have become unused over time. Cleaning them up. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
cb51c5d2 |
|
24-Jul-2017 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/hfi1: Fix bar0 mapping to use write combining When the debugpat kernel boot flag is turned on the following traces are printed: [ 1884.793168] x86/PAT: Overlap at 0x90000000-0x92000000 [ 1884.803510] x86/PAT: reserve_memtype added [mem 0x91200000-0x9127ffff], track uncached-minus, req write-combining, ret uncached-minus [ 1884.818167] hfi1 0000:05:00.0: hfi1_0: WC Remapped RcvArray: ffffc9000a980000 The ioremap_wc() clearly is not returning a write combining mapping due to an overlap where the RcvArray is mapped in a uncached mapping prior to creating the proposed write combining mapping. The patch replaces the single base register for uncached CSRs that used to overlap the RcvArray with two mappings. One, kregbase1, from the bar0 up to the RcvArray and another, kregbase2, from the end of the RcvArray to the pio send buffer space. A new dd field, base2_start, is used to convert the zero-based offset in the CSR routines to the correct kregbase1/kregbase2 mapping. A single direct write of the RcvArray CSRs is replaced with hfi1_put_tid() to insure correct access using the new disjoint mapping. Additionally, the kregend field is deleted since it is only ever written. patdebug now shows the RcvArray as write combining: [ 35.688990] x86/PAT: reserve_memtype added [mem 0x91200000-0x9127ffff], track write-combining, req write-combining, ret write-combining To insulate from any potential issues with write combining, all writeq are now flushed in hfi1_put_tid() and rcv_array_wc_fill(). Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c53df62c |
|
30-Jun-2017 |
Bartlomiej Dudek <bartlomiej.dudek@intel.com> |
IB/hfi1: Check return values from PCI config API calls Ensure that return values from kernel PCI config access API calls in HFI driver are checked and react properly if they are not expected (i.e. not successful). Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f683c80c |
|
09-Jun-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Resolve kernel panics by reference counting receive contexts Base receive contexts can be used by sub contexts. Because of this, resources for the context cannot be completely freed until all sub contexts are done using the base context. Introduce a reference count so that the base receive context can be freed only when all sub contexts are done with it. Use the provided function call for setting default send context integrity rather than the manual method. The cleanup path does not set all variables back to NULL after freeing resources. Since the clean up code can get called more than once, (e.g. during context close and on the error path), it is necessary to make sure that all the variables are NULLed. Possible crash are: BUG: unable to handle kernel paging request at 0000000001908900 IP: read_csr+0x24/0x30 [hfi1] RIP: 0010:read_csr+0x24/0x30 [hfi1] Call Trace: sc_disable+0x40/0x110 [hfi1] hfi1_file_close+0x16f/0x360 [hfi1] __fput+0xe7/0x210 ____fput+0xe/0x10 or kernel BUG at mm/slub.c:3877! RIP: 0010:kfree+0x14f/0x170 Call Trace: hfi1_free_ctxtdata+0x19a/0x2b0 [hfi1] ? hfi1_user_exp_rcv_grp_free+0x73/0x80 [hfi1] hfi1_file_close+0x20f/0x360 [hfi1] __fput+0xe7/0x210 ____fput+0xe/0x10 Fixes: Commit 62239fc6e554 ("IB/hfi1: Clean up on context initialization failure") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
bec7c79c |
|
29-May-2017 |
Byczkowski, Jakub <jakub.byczkowski@intel.com> |
IB/hfi1: Modify handling of physical link state by Host Driver Ensure states returned to the Fabric Manager are consistent with the OPA specification by caching the physical state along with the logical state. Reviewed-by: Stuart Summers <john.s.summers@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Andrzej Kotlowski <andrzej.kotlowski@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
bb7dde87 |
|
26-May-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Replace deprecated pci functions with new API pci_enable_msix_range() and pci_disable_msix() have been deprecated. Updating to the new pci_alloc_irq_vectors() interface. Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9039746c |
|
12-May-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Setup common IB fields in hfi1_packet struct We move many common IB fields into the hfi1_packet structure and set them up in a single function. This allows us to set the fields in a single place and not deal with them throughout the driver. Reviewed-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
228d2af1 |
|
12-May-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Separate input/output header tracing Calls to trace incoming packets will now receive the packet context as parameter. This enables trace support for future packet types. Header trace output is in the format <field>:<value> which makes parsing easier. input_ibhdr trace before change: <idle>-0 [001] d.h. 5904.250925: input_ibhdr: [0000:05:00.0] vl 0 lver 0 sl 0 lnh 2,LRH_BTH dlid 0002 len 18 slid 0001 op 0x64,UD_SEND_ONLY se 0 m 0 pad 0 tver 0 pkey 0xffff f 0 b 0 qpn 0x000001 a 0 psn 0x000001b2 deth qkey 0x80010000 sqpn 0x000001 input_ibhdr trace after change: <idle>-0 [001] d.h. 6655.714488: input_ibhdr: [0000:05:00.0] (IB) len:124 sc:0 dlid:0x0001 slid:0x0002 lnh:2,LRH_BTH lver:0 sl:0 age:0 becn:0 fecn:0 l4:0 rc:0 entropy:0 op:0x64,UD_SEND_ONLY se:0 m:0 pad:0 tver:0 pkey:0x7fff f:0 b:0 qpn:0x000001 a:0 psn:0x00000036 hlen:8 deth qkey:0x80010000 sqpn:0x000001 Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b3e6b4bd |
|
12-May-2017 |
Byczkowski, Jakub <jakub.byczkowski@intel.com> |
RDMA/hfi1: Defer setting VL15 credits to link-up interrupt Keep VL15 credits at 0 during LNI, before link-up. Store VL15 credits value during verify cap interrupt and set in after link-up. This addresses an issue where VL15 MAD packets could be sent by one side of the link before the other side is ready to receive them. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
62239fc6 |
|
04-May-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Clean up on context initialization failure The error path for context initialization is not consistent. Cleanup all resources on failure. Removed unused variable user_event_mask. Add the _BASE_FAILED bit to the event flags so that a base context can notify waiting sub contexts that they cannot continue. Running out of sub contexts is an EBUSY result, not EINVAL. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8737ce95 |
|
04-May-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Fix an assign/ordering issue with shared context IDs The current algorithm for generating sub-context IDs is FILO. If the contexts are not closed in that order, the uniqueness of the ID will be compromised. I.e. logging the creation/deletion of context IDs with an application that assigns and closes in a FIFO order reveals: cache_id: assign: uctxt: 3 sub_ctxt: 0 cache_id: assign: uctxt: 3 sub_ctxt: 1 cache_id: assign: uctxt: 3 sub_ctxt: 2 cache_id: close: uctxt: 3 sub_ctxt: 0 cache_id: assign: uctxt: 3 sub_ctxt: 2 <<< The sub_ctxt ID 2 is reused incorrectly. Update the sub-context ID assign algorithm to use a bitmask of in_use contexts. The new algorithm will allow the contexts to be closed in any order, and will only re-use unused contexts. Size subctxt and subctxt_cnt to match the user API size. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9b60d2cb |
|
04-May-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Clean up context initialization Context initialization mixes base context init with sub context init. This is bad because contexts can be reused, and on reuse, reinit things that should not re-initialized. Normalize comments and function names to refer to base context and sub context (not main, shared or slaves). Separate the base context initialization from sub context initialization. hfi1_init_ctxt() cannot return an error so changed to a void and remove error message. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5fbded48 |
|
04-May-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Search shared contexts on the opened device, not all devices The search for available shared contexts walks each registered hfi1 device. This search is too broad because other devices may not be on the same fabric, and using its contexts could cause unexpected behavior. Removed walking the list of devices, limiting the search to the opened device. With the device walk removed, the hfi1_devdata (dd) is not available. Added it to the hfi1_filedata for reference. With this change, hfi1_count_units() was rendered obsolete and was removed. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f4cd8765 |
|
04-May-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Name function prototype parameters To improve the readability of function prototypes, give the parameters names. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
4608e4c8 |
|
09-Apr-2017 |
Dennis Dalessandro <dennis.dalessandro@intel.com> |
IB/hfi1: Use bool in process_ecn The process_ecn intends to return a bool value. However it is doing so incorrectly by ANDing the fecn mask. The fecn bit is bit 31. Bool is not a native data type and is up to the compiler to implement how it sees fit. It is conceivable that this upper bit gets washed out. Fix by converting to a bool properly. Cc: stable@vger.kernel.org Fixes: Commit fd2b562edca6 ("IB/hfi1: Pull FECN/BECN processing to a common place") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
98b9ee20 |
|
09-Apr-2017 |
Stuart Summers <john.s.summers@intel.com> |
IB/hfi1: Cache neighbor secure data after link up Secure data is transferred across the link during verify cap. This includes Neighbor Guid, Type, and Port Number. This transfer is not guaranteed to complete until the 8051 firmware has completed processing of the state_complete frame. Move the consumption of this data from verify cap handling to link up handling to ensure the data is finalized. Additionally, do not notify the SM that the link is up until after this data is actually available. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Stuart Summers <john.s.summers@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
22546b74 |
|
28-Apr-2017 |
Tadeusz Struk <tadeusz.struk@intel.com> |
IB/hfi1: Fix softlockup issue Soft lockups can occur because the mad processing on different CPUs acquire the spin lock dc8051_lock: [534552.835870] [<ffffffffa026f993>] ? read_dev_port_cntr.isra.37+0x23/0x160 [hfi1] [534552.835880] [<ffffffffa02775af>] read_dev_cntr+0x4f/0x60 [hfi1] [534552.835893] [<ffffffffa028d7cd>] pma_get_opa_portstatus+0x64d/0x8c0 [hfi1] [534552.835904] [<ffffffffa0290e7d>] hfi1_process_mad+0x48d/0x18c0 [hfi1] [534552.835908] [<ffffffff811dc1f1>] ? __slab_free+0x81/0x2f0 [534552.835936] [<ffffffffa024c34e>] ? ib_mad_recv_done+0x21e/0xa30 [ib_core] [534552.835939] [<ffffffff811dd153>] ? __kmalloc+0x1f3/0x240 [534552.835947] [<ffffffffa024c3fb>] ib_mad_recv_done+0x2cb/0xa30 [ib_core] [534552.835955] [<ffffffffa0237c85>] __ib_process_cq+0x55/0xd0 [ib_core] [534552.835962] [<ffffffffa0237d70>] ib_cq_poll_work+0x20/0x60 [ib_core] [534552.835964] [<ffffffff810a7f3b>] process_one_work+0x17b/0x470 [534552.835966] [<ffffffff810a8d76>] worker_thread+0x126/0x410 [534552.835969] [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460 [534552.835971] [<ffffffff810b052f>] kthread+0xcf/0xe0 [534552.835974] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 [534552.835977] [<ffffffff81696418>] ret_from_fork+0x58/0x90 [534552.835980] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 This issue is made worse when the 8051 is busy and the reads take longer. Fix by using a non-spinning lock procure. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Mike Marciszyn <mike.marciniszyn@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
3d591099 |
|
09-Apr-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Use defines from common headers Move FECN and BECN related defines to common header files Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
cb427057 |
|
09-Apr-2017 |
Don Hiatt <don.hiatt@intel.com> |
IB/hfi1: Add functions to parse 9B headers These inline functions improve code readability by enabling callers to read specific fields from the header without knowledge of byte offsets. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
aad559c2 |
|
09-Apr-2017 |
Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> |
IB/hfi1: Rename hdr2sc to hfi1_9B_get_sc5 The function really returned the 5-bit sc value from the header and rhf. hdr2sc didn't quite describe what it did. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
21c433a7 |
|
25-Apr-2017 |
Christoph Hellwig <hch@lst.de> |
IB/hfi1: Use pcie_flr() instead of duplicating it Tested-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Doug Ledford <dledford@redhat.com>
|
#
64551ede |
|
12-Apr-2017 |
Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com> |
IB/hfi1: VNIC SDMA support HFI1 VNIC SDMA support enables transmission of VNIC packets over SDMA. Map VNIC queues to SDMA engines and support halting and wakeup of the VNIC queues. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2280740f |
|
12-Apr-2017 |
Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com> |
IB/hfi1: Virtual Network Interface Controller (VNIC) HW support HFI1 HW specific support for VNIC functionality. Dynamically allocate a set of contexts for VNIC when the first vnic port is instantiated. Allocate VNIC contexts from user contexts pool and return them back to the same pool while freeing up. Set aside enough MSI-X interrupts for VNIC contexts and assign them when the contexts are allocated. On the receive side, use an RSM rule to spread TCP/UDP streams among VNIC contexts. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d4829ea6 |
|
12-Apr-2017 |
Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com> |
IB/hfi1: OPA_VNIC RDMA netdev support Add support to create and free OPA_VNIC rdma netdev devices. Implement netstack interface functionality including xmit_skb, receive side NAPI etc. Also implement rdma netdev control functions. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5e6e9424 |
|
20-Mar-2017 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/hfi1: Add a patch value to the firmware version string The HFI firmware now includes a patch level in its version. Updating the necessary code to include the patch version in the firmware string. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5a52a7ac |
|
20-Mar-2017 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: NULL pointer dereference when freeing rhashtable A NULL pointer dereference occurs when the driver is unloaded, and the SDMA rhashtable is freed if the rhashtable_init() function has not been called. Prevent this by changing sdma_rht to be a pointer to a dynamically allocated hash table. The NULL-ness of the pointer serves as an indication that the hash table was initialized and that it needs to be destroyed. Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c27aad00 |
|
08-Feb-2017 |
Jakub Byczkowski <jakub.byczkowski@intel.com> |
IB/hfi1: Modify logging frequency of DCC errors Use rate-limit state to limit number of messages logged to kernel message buffer for DCC errors. Add new macro dd_dev_info_ratelimited for that propose. Replace all dd_dev_info calls in handle_dcc_err function with rate-limited version. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
1198fcea |
|
08-Feb-2017 |
Brian Welty <brian.welty@intel.com> |
IB/hfi1, rdmavt: Move SGE state helper routines into rdmavt To improve code reuse, add small SGE state helper routines to rdmavt_mr.h. Leverage these in hfi1, including refactoring of hfi1_copy_sge. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
76327627 |
|
08-Feb-2017 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Reduce oversized fields in struct hfi1_packet Some fields in struct hfi1_packet are oversized. Reduce them. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f3e862cb |
|
08-Feb-2017 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Access hfi1_ibport through rcd pointer Receive code paths use the QP's device and port number to access the struct hfi1_ibport. When an instance of struct hfi1_ctxtdata is present, it can be used to access struct hfi1_ibport through a pointer. This makes struct hfi1_ibport lookup time faster as an array doesn't have to be indexed and access fields in other cache-lines. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6e40b59c |
|
07-Dec-2016 |
Tadeusz Struk <tadeusz.struk@intel.com> |
IB/hfi1: Remove definition of unused hfi1_affinity struct The struct hfi1_affinity is not used anymore. We use the struct hfi1_affinity_node and hfi1_affinity_node_list instead. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
fe4d9243 |
|
17-Oct-2016 |
Easwar Hariharan <easwar.hariharan@intel.com> |
IB/hfi1: Add active channel and backplane support for integrated devices Use scratch registers within the HFI1 device to recover signal integrity information that is then used to tune the channel. While there, update error messages to better convey the result of falling back to a backup file. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6e768f06 |
|
17-Oct-2016 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Optimize devdata cachelines Profiling shows hot path struct members that need to be in a minimum set of cachelines. Group these struct member in the same cacheline: sc2vl_lock sc2vl rhf_rcv_function_map rcv_limit rhf_offset Group these struct member in the same cacheline: process_pio_send process_dma_send pport rcd int_counter flags num_pports first_user_ctxt Fill holes in struct hfi1_devdata revealed by pahole. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a6cd5f08 |
|
17-Oct-2016 |
Jakub Pawlak <jakub.pawlak@intel.com> |
IB/hfi1: Unify access to GUID entries This patch consolidates the node GUIDs and the port GUID handling and unifies access to these items. The knowledge of hfi1 GUIDs' design and their location are kept in accessors to centralize access. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f0f98f74 |
|
17-Oct-2016 |
Easwar Hariharan <easwar.hariharan@intel.com> |
IB/hfi1: Delete unused lock The lock is an unused vestige from qib. Remove it. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
26ea2544 |
|
17-Oct-2016 |
Easwar Hariharan <easwar.hariharan@intel.com> |
IB/hfi1: Clean up unused argument hfi1_pcie_ddinit takes the PCI device id as an argument but never uses it. Clean it up. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
eacc830f |
|
17-Oct-2016 |
Dennis Dalessandro <dennis.dalessandro@intel.com> |
IB/hfi1: Remove leftover snoop references A few snoop related variables were missed in the snoop/capture removal to get out of staging. Go back and clean those up too. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
acd7c8fe |
|
25-Oct-2016 |
Tadeusz Struk <tadeusz.struk@intel.com> |
IB/hfi1: Fix an Oops on pci device force remove This patch fixes an Oops on device unbind, when the device is used by a PSM user process. PSM processes access device resources which are freed on device removal. Similar protection exists in uverbs in ib_core for Verbs clients, but PSM doesn't use ib_uverbs hence a separate protection is required for PSM clients. Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d9ac4555 |
|
10-Oct-2016 |
Jakub Pawlak <jakub.pawlak@intel.com> |
IB/hfi1: Fix integrity check flags default values Prevent setting up integrity check flags when module is loaded with NO_INTEGRITY capability. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2d01c37d |
|
25-Sep-2016 |
Tadeusz Struk <tadeusz.struk@intel.com> |
IB/hfi1: Add irq affinity notification handler This patch adds an irq affinity notification handler. When a user changes interrupt affinity settings for an sdma engine, the driver needs to make changes to its internal sde structures and also update the affinity_hint. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
0cb2aa69 |
|
25-Sep-2016 |
Tadeusz Struk <tadeusz.struk@intel.com> |
IB/hfi1: Add sysfs interface for affinity setup Some users want more control over which cpu cores are being used by the driver. For example, users might want to restrict the driver to some specified subset of the cores so that they can appropriately partition processes, irq handlers, and work threads. To allow the user to fine tune system affinity settings new sysfs attributes are introduced per sdma engine. This patch adds a new attribute type for sdma engine and a new cpu_list attribute. When the user writes a cpu range to the cpu_list attribute the driver will create an internal cpu->sdma map, which will be used later as a look-up table to choose an optimal engine for a user requests. Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
242833fb |
|
25-Sep-2016 |
Dennis Dalessandro <dennis.dalessandro@intel.com> |
IB/hfi1: Remove unused variable from devdata We no longer use an error tasklet. Remove it from the hfi1_devdata structure. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
60368186 |
|
06-Sep-2016 |
Tymoteusz Kielan <tymoteusz.kielan@intel.com> |
IB/hfi1: Fix user-space buffers mapping with IOMMU enabled The dma_XXX API functions return bus addresses which are physical addresses when IOMMU is disabled. Buffer mapping to user-space is done via remap_pfn_range() with PFN based on bus address instead of physical. This results in wrong pages being mapped to user-space when IOMMU is enabled. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com> Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
261a4351 |
|
06-Sep-2016 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/qib,IB/hfi: Use core common header file Use common header file structs, defines, and accessors in the drivers. The old declarations are removed. The repositioning of the includes allows for the removal of hfi1_message_header and replaces its use with ib_header. Also corrected are two issues with set_armed_to_active(): - The "packet" parameter is now a pointer as it should have been - The etype is validated to insure that the header is correct Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
429b6a72 |
|
31-Aug-2016 |
Harish Chegondi <harish.chegondi@intel.com> |
IB/hfi1: Make n_krcvqs be an unsigned long integer The global variable n_krcvqs stores the sum of the number of kernel receive queues of VLs 0-7 which the user can pass to the driver through the module parameter array krcvqs which is of type unsigned integer. If the user passes large value(s) into krcvqs parameter array, it can cause an arithmetic overflow while calculating n_krcvqs which is also of type unsigned int. The overflow results in an incorrect value of n_krcvqs which can lead to kernel crash while loading the driver. Fix by changing the data type of n_krcvqs to unsigned long. This patch also changes the data type of other variables that get their values from n_krcvqs. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
673b975f |
|
31-Aug-2016 |
Dean Luick <dean.luick@intel.com> |
IB/hfi1: Add QSFP sanity pre-check Sometimes a QSFP device does not respond in the expected time after a power-on. Add a read pre-check/retry when starting the link on driver load. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
08fe16f6 |
|
16-Aug-2016 |
Mitko Haralanov <mitko.haralanov@intel.com> |
IB/hfi1: Improve J_KEY generation Previously, J_KEY generation was based on the lower 16 bits of the user's UID. While this works, it was not good enough as a non-root user could collide with a root user given a sufficiently large UID. This patch attempt to improve the J_KEY generation by using the following algorithm: The 16 bit J_KEY space is partitioned into 3 separate spaces reserved for different user classes: * all users with administtor privileges (including 'root') will use J_KEYs in the range of 0 to 31, * all kernel protocols, which use KDETH packets will use J_KEYs in the range of 32 to 63, and * all other users will use J_KEYs in the range of 64 to 65535. The above separation is aimed at preventing different user levels from sending packets to each other and, additionally, separate kernel protocols from all other types of users. The later is meant to prevent the potential corruption of kernel memory by any other type of user. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
476d95bd |
|
09-Aug-2016 |
Wei Yongjun <weiyj.lk@gmail.com> |
IB/hfi1: Using kfree_rcu() to simplify the code The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e0b09ac5 |
|
28-Jul-2016 |
Dean Luick <dean.luick@intel.com> |
IB/hfi1: Make the cache handler own its rb tree root The objects which use cache handling should reference their own handler object not the internal data structure it uses to track the nodes. Have the "users" of the mmu notifier code pass opaque objects which can then be properly used in the mmu callbacks depending on the owners needs. This patch has the additional benefit that operations no longer require a look up in a list to find the handlers. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
3faa3d9a |
|
28-Jul-2016 |
Ira Weiny <ira.weiny@intel.com> |
IB/hfi1: Make use of mm consistent The hfi1 driver registers a mmu_notifier callback when /dev/hfi1_* is opened, and unregisters it when the device is closed. The driver incorrectly assumes that the close will always happen from the same context as the open. In particular, closes due to SIGKILL or OOM killer activity may happen from a different context. In these cases, the wrong mm is passed to mmu_notifier_unregister(), which causes improper reference counting for the victim mm, and eventual memory corruption. Preserve the mm for all open file descriptors and use this mm rather than current->mm for memory operations for the lifetime of that fd. Note: this patch leaves 1 use of current->mm in place. This use is removed in a follow on patch because other functional changes were required prior to that use being removed. If registration fails, there is no reason to keep the handler object around. Free the handler object rather than add it to the list to prevent any mmu_notifier operations, including unregister, when registration fails. Suggested-by: Jim Foraker <foraker1@llnl.gov> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
bdf7752e |
|
28-Jul-2016 |
Dean Luick <dean.luick@intel.com> |
IB/hfi1: Use the same capability state for all shared contexts Save the current capability state at user context creation time. Report this saved value for all shared contexts. Also get rid of unnecessary hfi1_get_base_kinfo function. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ac335e7e |
|
27-Jul-2016 |
Ira Weiny <ira.weiny@intel.com> |
IB/hfi1: Add parameter names to function declarations Parameter names to function declarations make it more clear what those parameters do. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8e1f52df |
|
27-Jul-2016 |
Dean Luick <dean.luick@intel.com> |
IB/hfi1: Remove unused uctxt->subpid and uctxt->pid These are no longer needed. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b736a469 |
|
25-Jul-2016 |
Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> |
IB/hfi1: Use hdr2sc function to calculate 5-bit SC The interface is used to compute the 5-bit SC field from the LRH and the RHF bits. Modify code to use the interface instead. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8adf71fa |
|
25-Jul-2016 |
Jianxin Xiong <jianxin.xiong@intel.com> |
IB/hfi1: Fix "suspicious rcu_dereference_check() usage" warnings This fixes the following warnings with PROVE_LOCKING and PROVE_RCU enabled in the kernel: case (1): [ INFO: suspicious RCU usage. ] drivers/infiniband/hw/hfi1/init.c:532 suspicious rcu_dereference_check() usage! case (2): [ INFO: suspicious RCU usage. ] drivers/infiniband/hw/hfi1/hfi.h:1624 suspicious rcu_dereference_check() usage! Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5fd2b562 |
|
25-Jul-2016 |
Mitko Haralanov <mitko.haralanov@intel.com> |
IB/hfi1: Pull FECN/BECN processing to a common place There were multiple places where FECN/BECN processing was being done for the different types of QPs. All of that code was very similar, which meant that it could be pulled into a single function used by the different QP types. To retain the performance in the fastpath, the common code starts with an inline function, which only calls the slow path if the packet has any of the [FB]ECN bits set. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
dba715f0 |
|
06-Jul-2016 |
Dean Luick <dean.luick@intel.com> |
IB/hfi1: Use built-in i2c bit-shift bus adapter Use built-in i2c bit-shift bus adapter to control the i2c busses on the chip. Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d6373019 |
|
25-Jul-2016 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Reserve and collapse CPU cores for contexts Kernel receive queues oversubscribe CPU cores on multi-HFI systems. To prevent this, the kernel receive queues are separated onto different cores, and the SDMA engine interrupts are constrained to a lesser number of cores. hfi1s_on_numa_node*krcvqs is the number of CPU cores that are reserved for kernel receive queues for all HFIs. Each HFI initializes its kernel receive queues to one of the reserved CPU cores. If there ends up being 0 CPU cores leftover for SDMA engines, use the same CPU cores as receive contexts. In addition, general and control contexts are assigned to their own CPU core, however, both types of contexts tend to have low traffic. To save CPU cores, collapse general and control contexts to one CPU core for all HFI units. This change prevents SDMA engine interrupts from wrapping around general contexts. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2b719046 |
|
01-Jul-2016 |
Jakub Pawlak <jakub.pawlak@intel.com> |
IB/hfi1: Add counter to track unsupported packets drop Add sw counter to track dropped unsupported packets. Report unsupported packets drop as the RcvError. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
462b6b21 |
|
01-Jul-2016 |
Sebastian Sanchez <sebastian.sanchez@intel.com> |
IB/hfi1: Separate tracepoints into specific headers The ftrace infrastructure used to evaluate the TRACE_SYSTEM macro on every DEFINE_EVENT() macro. Now the TRACE_SYSTEM macro only gets evaluated when trace/define_trace.h is included, so the group event information is lost. This was introduced in commit acd388fd3af3 ("tracing: Give system name a pointer") Therefore, each system tracepoint must be on its own file. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
939b6ca8 |
|
15-Jun-2016 |
Ira Weiny <ira.weiny@intel.com> |
IB/hfi1: Add device FW version string Export the firmware version through the core. Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f48ad614 |
|
19-May-2016 |
Dennis Dalessandro <dennis.dalessandro@intel.com> |
IB/hfi1: Move driver out of staging The TODO list for the hfi1 driver was completed during 4.6. In addition other objections raised (which are far beyond what was in the TODO list) have been addressed as well. It is now time to remove the driver from staging and into the drivers/infiniband sub-tree. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|