History log of /linux-master/drivers/infiniband/hw/hfi1/verbs.c
Revision Date Author Comments
# 3ec648c6 23-Aug-2023 Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

IB: Use capital "OR" for multiple licenses in SPDX

Documentation/process/license-rules.rst and checkpatch expect the SPDX
identifier syntax for multiple licenses to use capital "OR". Correct it
to keep consistent format and avoid copy-paste issues.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230823092912.122674-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 00cbce5c 06-Apr-2023 Patrick Kelsey <pat.kelsey@cornelisnetworks.com>

IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests

hfi1 user SDMA request processing has two bugs that can cause data
corruption for user SDMA requests that have multiple payload iovecs
where an iovec other than the tail iovec does not run up to the page
boundary for the buffer pointed to by that iovec.a

Here are the specific bugs:
1. user_sdma_txadd() does not use struct user_sdma_iovec->iov.iov_len.
Rather, user_sdma_txadd() will add up to PAGE_SIZE bytes from iovec
to the packet, even if some of those bytes are past
iovec->iov.iov_len and are thus not intended to be in the packet.
2. user_sdma_txadd() and user_sdma_send_pkts() fail to advance to the
next iovec in user_sdma_request->iovs when the current iovec
is not PAGE_SIZE and does not contain enough data to complete the
packet. The transmitted packet will contain the wrong data from the
iovec pages.

This has not been an issue with SDMA packets from hfi1 Verbs or PSM2
because they only produce iovecs that end short of PAGE_SIZE as the tail
iovec of an SDMA request.

Fixing these bugs exposes other bugs with the SDMA pin cache
(struct mmu_rb_handler) that get in way of supporting user SDMA requests
with multiple payload iovecs whose buffers do not end at PAGE_SIZE. So
this commit fixes those issues as well.

Here are the mmu_rb_handler bugs that non-PAGE_SIZE-end multi-iovec
payload user SDMA requests can hit:
1. Overlapping memory ranges in mmu_rb_handler will result in duplicate
pinnings.
2. When extending an existing mmu_rb_handler entry (struct mmu_rb_node),
the mmu_rb code (1) removes the existing entry under a lock, (2)
releases that lock, pins the new pages, (3) then reacquires the lock
to insert the extended mmu_rb_node.

If someone else comes in and inserts an overlapping entry between (2)
and (3), insert in (3) will fail.

The failure path code in this case unpins _all_ pages in either the
original mmu_rb_node or the new mmu_rb_node that was inserted between
(2) and (3).
3. In hfi1_mmu_rb_remove_unless_exact(), mmu_rb_node->refcount is
incremented outside of mmu_rb_handler->lock. As a result, mmu_rb_node
could be evicted by another thread that gets mmu_rb_handler->lock and
checks mmu_rb_node->refcount before mmu_rb_node->refcount is
incremented.
4. Related to #2 above, SDMA request submission failure path does not
check mmu_rb_node->refcount before freeing mmu_rb_node object.

If there are other SDMA requests in progress whose iovecs have
pointers to the now-freed mmu_rb_node(s), those pointers to the
now-freed mmu_rb nodes will be dereferenced when those SDMA requests
complete.

Fixes: 7be85676f1d1 ("IB/hfi1: Don't remove RB entry when not needed.")
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Signed-off-by: Brendan Cunningham <bcunningham@cornelisnetworks.com>
Signed-off-by: Patrick Kelsey <pat.kelsey@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/168088636445.3027109.10054635277810177889.stgit@252.162.96.66.static.eigbox.net
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# ef90f0a1 09-Jan-2023 Dean Luick <dean.luick@cornelisnetworks.com>

IB/hfi1: Split IB counter allocation

Split the IB device and port counter allocation. Remove
the need for a lock. Clean up pointer usage.

Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/167329106431.1472990.12587703493884915680.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# e58f889e 01-Sep-2022 ye xingchen <ye.xingchen@zte.com.cn>

RDMA/hfi1: Remove the unneeded result variable

Return the value set_link_state() directly instead of storing it in
another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Link: https://lore.kernel.org/r/20220901074209.313004-1-ye.xingchen@zte.com.cn
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 2c34bb6d 18-Aug-2022 Wolfram Sang <wsa+renesas@sang-engineering.com>

IB: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lore.kernel.org/r/20220818210018.6841-1-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# e945c653 03-Apr-2022 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Split kernel-only global device caps from uverbs device caps

Split out flags from ib_device::device_cap_flags that are only used
internally to the kernel into kernel_cap_flags that is not part of the
uapi. This limits the device_cap_flags to being the same bitmap that will
be copied to userspace.

This cleanly splits out the uverbs flags from the kernel flags to avoid
confusion in the flags bitmap.

Add some short comments describing which each of the kernel flags is
connected to. Remove unused kernel flags.

Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# b135e324 08-Feb-2022 Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

IB/hfi1: Allow larger MTU without AIP

The AIP code signals the phys_mtu in the following query_port()
fragment:

props->phys_mtu = HFI1_CAP_IS_KSET(AIP) ? hfi1_max_mtu :
ib_mtu_enum_to_int(props->max_mtu);

Using the largest MTU possible should not depend on AIP.

Fix by unconditionally using the hfi1_max_mtu value.

Fixes: 6d72344cf6c4 ("IB/ipoib: Increase ipoib Datagram mode MTU's upper limit")
Link: https://lore.kernel.org/r/1644348309-174874-1-git-send-email-mike.marciniszyn@cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# da86dc17 15-Nov-2021 Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>

IB/hfi1: Properly allocate rdma counter desc memory

When optional counter support was added the allocation of the memory
holding the counter descriptors was not cleared properly. This caused
WARN_ON()s in the IB/sysfs code to be hit.

This is because the uninitialized memory made some of the counters wrongly
look like optional counters. Use kzalloc.

While here change the sizeof() calls to use the pointer rather than the
name of the type.

WARNING: CPU: 0 PID: 32644 at drivers/infiniband/core/sysfs.c:1064 ib_setup_port_attrs+0x7e1/0x890 [ib_core]
CPU: 0 PID: 32644 Comm: kworker/0:2 Tainted: G S W 5.15.0+ #36
Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
Workqueue: events work_for_cpu_fn
RIP: 0010:ib_setup_port_attrs+0x7e1/0x890 [ib_core]
RSP: 0018:ffffc90006ea3c40 EFLAGS: 00010202
RAX: 0000000000000068 RBX: ffff888106ad8000 RCX: 0000000000000138
RDX: ffff888126c84c00 RSI: ffff888103c41000 RDI: 0000000000000124
RBP: ffff88810f63a801 R08: ffff888126c8a000 R09: 0000000000000001
R10: ffffffffa09acf20 R11: 0000000000000065 R12: ffff88810f63a800
R13: ffff88810f63a800 R14: ffff88810f63a8e0 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff888667a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005590102cb078 CR3: 000000000240a003 CR4: 00000000001706f0
Call Trace:
ib_register_device.cold.44+0x23e/0x2d0 [ib_core]
rvt_register_device+0xfa/0x230 [rdmavt]
hfi1_register_ib_device+0x623/0x690 [hfi1]
init_one.cold.36+0x2d1/0x49b [hfi1]
local_pci_probe+0x45/0x80
work_for_cpu_fn+0x16/0x20
process_one_work+0x1b1/0x360
worker_thread+0x1d4/0x3a0
kthread+0x11a/0x140
ret_from_fork+0x22/0x30

Fixes: 5e2ddd1e5982 ("RDMA/counter: Add optional counter support")
Link: https://lore.kernel.org/r/20211115200913.124104.47770.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 13f30b0f 08-Oct-2021 Aharon Landau <aharonl@nvidia.com>

RDMA/counter: Add a descriptor in struct rdma_hw_stats

Add a counter statistic descriptor structure in rdma_hw_stats. In addition
to the counter name, more meta-information will be added. This code
extension is needed for optional-counter support in the following patches.

Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 145eba1a 22-Aug-2021 Cai Huoqing <caihuoqing@baidu.com>

RDMA/hfi1: Convert to SPDX identifier

use SPDX-License-Identifier instead of a verbose license text

Link: https://lore.kernel.org/r/20210823042622.109-1-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 915e4af5 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Remove rdma_set_device_sysfs_group()

The driver's device group can be specified as part of the ops structure
like the device's port group. No need for the complicated API.

Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# d7407d16 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Change ops->init_port to ops->port_groups

init_port was only being used to register sysfs attributes against the
port kobject. Now that all users are creating static attribute_group's we
can simply set the attribute_group list in the ops and the core code can
just handle it directly.

This makes all the sysfs management quite straightforward and prevents any
driver from abusing the naked port kobject in future because no driver
code can access it.

Link: https://lore.kernel.org/r/114f68f3d921460eafe14cea5a80ca65d81729c3.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 4b5f4d3f 11-Jun-2021 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Split the alloc_hw_stats() ops to port and device variants

This is being used to implement both the port and device global stats,
which is causing some confusion in the drivers. For instance EFA and i40iw
both seem to be misusing the device stats.

Split it into two ops so drivers that don't support one or the other can
leave the op NULL'd, making the calling code a little simpler to
understand.

Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 1fb7f897 01-Mar-2021 Mark Bloch <mbloch@nvidia.com>

RDMA: Support more than 255 rdma ports

Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs. HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.

While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# cd5962d4 25-Jan-2021 Lee Jones <lee.jones@linaro.org>

RDMA/hw/hfi1/verbs: Demote non-conforming doc header and fix a misspelling

Fixes the following W=1 kernel build warning(s):

drivers/infiniband/hw/hfi1/verbs.c:741: warning: Function parameter or member 'qp' not described in 'update_tx_opstats'
drivers/infiniband/hw/hfi1/verbs.c:1160: warning: Function parameter or member 'pkey' not described in 'egress_pkey_check'
drivers/infiniband/hw/hfi1/verbs.c:1160: warning: Excess function parameter 'bkey' description in 'egress_pkey_check'
drivers/infiniband/hw/hfi1/verbs.c:1217: warning: Function parameter or member 'qp' not described in 'get_send_routine'
drivers/infiniband/hw/hfi1/verbs.c:1217: warning: Function parameter or member 'ps' not described in 'get_send_routine'

Link: https://lore.kernel.org/r/20210126124732.3320971-20-lee.jones@linaro.org
Cc: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 376ceb31 16-Sep-2020 Aharon Landau <aharonl@mellanox.com>

RDMA: Fix link active_speed size

According to the IB spec active_speed size should be u16 and not u8 as
before. Changing it to allow further extensions in offered speeds.

Link: https://lore.kernel.org/r/20200917090223.1018224-4-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 4d12c04c 28-May-2020 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Remove 'max_map_per_fmr'

Now that FMR support is gone, this attribute can be deleted from all
places.

Link: https://lore.kernel.org/r/13-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 0ad45e5f 10-May-2020 Piotr Stankiewicz <piotr.stankiewicz@intel.com>

IB/hfi1: Enable the transmit side of the datagram ipoib netdev

This patch hooks the transmit side of the datagram netdev with
ipoib by setting the rdma_netdev_get_params function for the
hfi1 ib_device_ops structue. It also enables the receiving side
by adding the AIP capability into the default capabilities.

Link: https://lore.kernel.org/r/20200511160712.173205.65700.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 6d72344c 10-May-2020 Kaike Wan <kaike.wan@intel.com>

IB/ipoib: Increase ipoib Datagram mode MTU's upper limit

Currently the ipoib UD mtu is restricted to 4K bytes. Remove this
limitation so that the IPOIB module can potentially use an MTU (in UD
mode) that is bounded by the MTU of the underlying device. A field is
added to the ib_port_attr structure to indicate the maximum physical
MTU the underlying device supports.

Link: https://lore.kernel.org/r/20200511160618.173205.23053.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 7f90a5a0 10-May-2020 Gary Leshner <Gary.S.Leshner@intel.com>

IB/{rdmavt, hfi1}: Implement creation of accelerated UD QPs

Adds capability to create a qpn to be recognized as an accelerated
UD QP for ipoib.

This is accomplished by reserving 0x81 in byte[0] of the qpn as the
prefix for these qp types and reserving qpns between 0x810000 and
0x81ffff.

The hfi1 capability mask already contained a flag for the VNIC netdev.
This has been renamed and extended to include both VNIC and ipoib.

The rvt code to allocate qps now recognizes this flag and sets 0x81
into byte[0] of the qpn.

The code to allocate qpns is modified to reset the qpn numbering when it
is detected that a value is located in byte[0] for a UD QP and it is a
qpn being requested for net dev use. If it is a regular UD QP then it is
allowable to have bits set in byte[0] of the qpn and provide the
previously normal behavior.

The code to free the qpn now checks for the AIP prefix value of 0x81 and
removes it from the qpn before being freed so that the lower 16 bit
number can be reused.

This patch requires minor changes in the IB core and ipoib to facilitate
the creation of accelerated UP QPs.

Link: https://lore.kernel.org/r/20200511160607.173205.11757.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 84e3b19a 10-May-2020 Gary Leshner <Gary.S.Leshner@intel.com>

IB/hfi1: Remove module parameter for KDETH qpns

The module parameter for KDETH qpns is being removed in favor
of always using the default value of 0x80 as the qpn prefix.
Defines have been added for various KDETH values including
the prefix of 0x80.
The reserved range now starts at the base value for KDETH
qpns (0x80) and extends up to and including the last qpn for
other reserved QP prefixed types.
Adjust other QP prefixed define names to match KDETH defined
names.

Link: https://lore.kernel.org/r/20200511160600.173205.27508.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 817a68a6 25-Feb-2020 Dennis Dalessandro <dennis.dalessandro@intel.com>

IB/hfi1, qib: Ensure RCU is locked when accessing list

The packet handling function, specifically the iteration of the qp list
for mad packet processing misses locking RCU before running through the
list. Not only is this incorrect, but the list_for_each_entry_rcu() call
can not be called with a conditional check for lock dependency. Remedy
this by invoking the rcu lock and unlock around the critical section.

This brings MAD packet processing in line with what is done for non-MAD
packets.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20200225195445.140896.41873.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 22bb1365 04-Oct-2019 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Use a common pad buffer for 9B and 16B packets

There is no reason for a different pad buffer for the two
packet types.

Expand the current buffer allocation to allow for both
packet types.

Fixes: f8195f3b14a0 ("IB/hfi1: Eliminate allocation while atomic")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20191004204934.26838.13099.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 7b0b6925 25-Sep-2019 Denis Efremov <efremov@linux.com>

IB/hfi1: remove unlikely() from IS_ERR*() condition

"unlikely(IS_ERR_OR_NULL(x))" is excessive. IS_ERR_OR_NULL() already uses
unlikely() internally.

Link: http://lkml.kernel.org/r/20190829165025.15750-8-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Joe Perches <joe@perches.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6497d0a9 30-Jul-2019 Gustavo A. R. Silva <gustavo@embeddedor.com>

IB/hfi1: Fix Spectre v1 vulnerability

sl is controlled by user-space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.

Fix this by sanitizing sl before using it to index ibp->sl_to_sc.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20190731175428.GA16736@embeddedor
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b2590bdd 14-Jul-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Do not update hcrc for a KDETH packet during fault injection

When a KDETH packet is subject to fault injection during transmission,
HCRC is supposed to be omitted from the packet so that the hardware on the
receiver side would drop the packet. When creating pbc, the PbcInsertHcrc
field is set to be PBC_IHCRC_NONE if the KDETH packet is subject to fault
injection, but overwritten with PBC_IHCRC_LKDETH when update_hcrc() is
called later.

This problem is fixed by not calling update_hcrc() when the packet is
subject to fault injection.

Fixes: 6b6cf9357f78 ("IB/hfi1: Set PbcInsertHcrc for TID RDMA packets")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190715164546.74174.99296.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 942a8993 13-Jun-2019 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Handle port down properly in pio

The call to sc_buffer_alloc currently returns NULL (no buffer) or
a buffer descriptor.

There is a third case when the port is down. Currently that
returns NULL and this prevents the caller from properly handling the
sc_buffer_alloc() failure. A verbs code link test after the call is
racy so the indication needs to come from the state check inside the allocation
routine to be valid.

Fix by encoding the ECOMM failure like SDMA. IS_ERR_OR_NULL() tests
are added at all call sites. For verbs send, this needs to treat any
error by returning a completion without any MMIO copy.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 4bb02e95 13-Jun-2019 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Use aborts to trigger RC throttling

SDMA and pio flushes will cause a lot of packets to be transmitted
after a link has gone down, using a lot of CPU to retransmit
packets.

Fix for RC QPs by recognizing the flush status and:
- Forcing a timer start
- Putting the QP into a "send one" mode

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 7a154142 05-Jun-2019 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Move owner into struct ib_device_ops

This more closely follows how other subsytems work, with owner being a
member of the structure containing the function pointers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# b9560a41 05-Jun-2019 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Move driver_id into struct ib_device_ops

No reason for every driver to emit code to set this, just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 35164f52 24-May-2019 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/{qib, hfi1, rdmavt}: Correct ibv_devinfo max_mr value

The command 'ibv_devinfo -v' reports 0 for max_mr.

Fix by assigning the query values after the mr lkey_table has been built
rather than early on in the driver.

Fixes: 7b1e2099adc8 ("IB/rdmavt: Move memory registration into rdmavt")
Reviewed-by: Josh Collier <josh.d.collier@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 03b92789 08-Feb-2019 Matthew Wilcox <willy@infradead.org>

hfi1: Convert hfi1_unit_table to XArray

Also remove hfi1_devs_list.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 270a9833 26-Feb-2019 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Add running average for adaptive pio

The adaptive PIO implementation only considers the current packet size
when deciding between SDMA and pio for a packet.

This causes credit return forces if small and large packets are
interleaved.

Add a running average to avoid costly credit forces so that a large
sequence of small packets is required to go below the threshold that
chooses pio.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 34025fb0 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Prioritize the sending of ACK packets

ACK packets are generally associated with request completion and resource
release and therefore should be sent first. This patch optimizes the
send engine by using the following policies:
(1) QPs with RVT_S_ACK_PENDING bit set in qp->s_flags or qpriv->s_flags
should have their priority incremented;
(2) QPs with ACK or TID-ACK packet queued should have their priority
incremented;
(3) When a QP is queued to the wait list due to resource constraints, it
will be queued to the head if it has ACK packet to send;
(4) When selecting qps to run from the wait list, the one with the highest
priority and starve_cnt will be selected; each priority will be equivalent
to a fixed number of starve_cnt (16).

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 3c6cb20a 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs

This patch integrates TID RDMA WRITE protocol into normal RDMA verbs
framework. The TID RDMA WRITE protocol is an end-to-end protocol
between the hfi1 drivers on two OPA nodes that converts a qualified
RDMA WRITE request into a TID RDMA WRITE request to avoid data copying
on the responder side.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# f5a4a95f 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Allow for extra entries in QP's s_ack_queue

The TID RDMA WRITE protocol differs from normal IB RDMA WRITE
in that TID RDMA WRITE requests do require responses, not just
ACKs.

Therefore, TID RDMA WRITE requests need to be treated as RDMA
READ requests from the point of view of the QPs' s_ack_queue.
In other words, the QPs' need to allow for TID RDMA WRITE
requests to be stored in their s_ack_queue.

However, because the user does not know anything about the TID
RDMA capability and/or protocols, these extra entries in the
queue cannot be advertized to the user.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 24b11923 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Integrate TID RDMA READ protocol into RC protocol

This patch integrates the TID RDMA READ protocol into the IB RC protocol.
This protocol is an end-to-end protocol between the hfi1 drivers on two
OPA nodes that converts a qualified RDMA READ request into a TID RDMA
READ request to avoid data copying on the requester side. The following
codes are added in this patch:
- Send the TID RDMA READ request;
- Complete the TID RDMA READ send request;
- Send the TID RDMA READ response;
- Complete the TID RDMA READ request;

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 22d136d7 24-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add TID RDMA handlers

This commit adds the TID RDMA READ pointers to the receiving opcode
handlers. It also adds TID RDMA READ header sizes to header size table.
A function to print the RHF EFLAGS errors is created so that it can be
shared by both IB and TID RDMA receiving functions.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 6b6cf935 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Set PbcInsertHcrc for TID RDMA packets

All TID RDMA packets are in KDETH packet format and therefore the
PbcInsertHcrc must be set properly before sending the packet to
hardware. Otherwise, the packets will be dropped by the receiver.
By default, HCRC is not inserted for 9B packets without KDETH, and
this patch adds that back for TID RDMA packets.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 838b6fd2 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: TID RDMA RcvArray programming and TID allocation

TID entries are used by hfi1 hardware to receive data payload from
incoming packets directly into a user buffer and thus avoid data copying
by software. This patch implements the functions for TID allocation,
freeing, and programming TID RcvArray entries in hardware for kernel
clients. TID entries are managed via lists of TID groups similar to PSM.
Furthermore, to track TID resource allocation for each request, software
flows are also allocated and freed as needed. Since software flows
consume large amount of memory for tracking TID allocation and freeing,
it is generally desirable to allocate them dynamically in the send queue
and only for TID RDMA requests, but pre-allocate them for receive queue
because the send queue could have thousands of entries while the receive
queue has only a limited number of entries.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 48a615dc 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Integrate OPFN into RC transactions

OPFN parameter negotiation allows a pair of connected RC QPs to exchange
a set of parameters in succession. This negotiation does not commence
till the first ULP request. Because OPFN operations are operations
private to the driver, they do not generate user completions or put the
QP into error when they run out of retries. This patch integrates the
OPFN protocol into the transactions of an RC QP.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# ddf922c3 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1, IB/rdmavt: Allow for extending of QP's s_ack_queue

The OPFN protocol uses the COMPARE_SWAP request to exchange data
between the requester and the responder and therefore needs to
be stored in the QP's s_ack_queue when the request is received
on the responder side. However, because the user does not know
anything about the OPFN protocol, this extra entry in the
queue cannot be advertised to the user. This patch adds an extra
entry in a QP's s_ack_queue.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 87fc34b5 23-Jan-2019 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/{hfi1,qib}: Cleanup open coded sge sizing

Sge sizing is done in several places using an open coded method.

This can cause maintenance issues. The open coded method is
encapsulated in a helper routine. The helper was introduced with
commit:

1198fcea8a78 ("IB/hfi1, rdmavt: Move SGE state helper routines into
rdmavt")

Update all call sites that have the open coded path with the helper
routine.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# ea4baf7f 18-Dec-2018 Parav Pandit <parav@mellanox.com>

RDMA: Rename port_callback to init_port

Most provider routines are callback routines which ib core invokes.
_callback suffix doesn't convey information about when such callback is
invoked. Therefore, rename port_callback to init_port.

Additionally, store the init_port function pointer in ib_device_ops, so
that it can be accessed in subsequent patches when binding rdma device to
net namespace.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# e3c320ca 10-Dec-2018 Kamal Heib <kamalheib1@gmail.com>

RDMA/hfi1: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 9aefcabe 28-Nov-2018 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Reduce lock contention on iowait_lock for sdma and pio

Commit 4e045572e2c2 ("IB/hfi1: Add unique txwait_lock for txreq events")
laid the ground work to support per resource waiting locking.

This patch adds that with a lock unique to each sdma engine and pio
sendcontext and makes necessary changes for verbs, PSM, and vnic to use
the new locks.

This is particularly beneficial for smaller messages that will exhaust
resources at a faster rate.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 5190f052 28-Nov-2018 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Allow the driver to initialize QP priv struct

This patch adds an interface to allow the driver to initialize the QP priv
struct when the QP is created and after the qpn has been assigned. A
field is added to the QP priv struct to reference the rcd and two new
files are added to contain the function to initialize the rcd field so
that more TID RDMA related code can be added here later.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# dbc2970c 28-Nov-2018 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/hfi1: Incorrect sizing of sge for PIO will OOPs

An incorrect sge sizing in the HFI PIO path will cause an OOPs similar to
this:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] hfi1_verbs_send_pio+0x3d8/0x530 [hfi1]
PGD 0
Oops: 0000 1 SMP
Call Trace:
? hfi1_verbs_send_dma+0xad0/0xad0 [hfi1]
hfi1_verbs_send+0xdf/0x250 [hfi1]
? make_rc_ack+0xa80/0xa80 [hfi1]
hfi1_do_send+0x192/0x430 [hfi1]
hfi1_do_send_from_rvt+0x10/0x20 [hfi1]
rvt_post_send+0x369/0x820 [rdmavt]
ib_uverbs_post_send+0x317/0x570 [ib_uverbs]
ib_uverbs_write+0x26f/0x420 [ib_uverbs]
? security_file_permission+0x21/0xa0
vfs_write+0xbd/0x1e0
? mntput+0x24/0x40
SyS_write+0x7f/0xe0
system_call_fastpath+0x16/0x1b

Fix by adding the missing sizing check to correctly determine the sge
length.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 36d84219 28-Nov-2018 Piotr Stankiewicz <piotr.stankiewicz@intel.com>

IB/hfi1: Fix an out-of-bounds access in get_hw_stats

When running with KASAN, the following trace is produced:

[ 62.535888]

==================================================================
[ 62.544930] BUG: KASAN: slab-out-of-bounds in
gut_hw_stats+0x122/0x230 [hfi1]
[ 62.553856] Write of size 8 at addr ffff88080e8d6330 by task
kworker/0:1/14

[ 62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted
4.19.0-test-build-kasan+ #8
[ 62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
SE5C610.86B.01.01.0019.101220160604 10/12/2016
[ 62.587951] Workqueue: events work_for_cpu_fn
[ 62.594050] Call Trace:
[ 62.598023] dump_stack+0xc6/0x14c
[ 62.603089] ? dump_stack_print_info.cold.1+0x2f/0x2f
[ 62.610041] ? kmsg_dump_rewind_nolock+0x59/0x59
[ 62.616615] ? get_hw_stats+0x122/0x230 [hfi1]
[ 62.622985] print_address_description+0x6c/0x23c
[ 62.629744] ? get_hw_stats+0x122/0x230 [hfi1]
[ 62.636108] kasan_report.cold.6+0x241/0x308
[ 62.642365] get_hw_stats+0x122/0x230 [hfi1]
[ 62.648703] ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[ 62.655088] ? __kmalloc+0x110/0x240
[ 62.660695] ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[ 62.667142] setup_hw_stats+0xd8/0x430 [ib_core]
[ 62.673972] ? show_hfi+0x50/0x50 [hfi1]
[ 62.680026] ib_device_register_sysfs+0x165/0x180 [ib_core]
[ 62.687995] ib_register_device+0x5a2/0xa10 [ib_core]
[ 62.695340] ? show_hfi+0x50/0x50 [hfi1]
[ 62.701421] ? ib_unregister_device+0x2e0/0x2e0 [ib_core]
[ 62.709222] ? __vmalloc_node_range+0x2d0/0x380
[ 62.716131] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[ 62.723735] ? vmalloc_node+0x5c/0x70
[ 62.729697] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[ 62.737347] ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt]
[ 62.744998] ? __rvt_alloc_mr+0x110/0x110 [rdmavt]
[ 62.752315] ? rvt_rc_error+0x140/0x140 [rdmavt]
[ 62.759434] ? rvt_vma_open+0x30/0x30 [rdmavt]
[ 62.766364] ? mutex_unlock+0x1d/0x40
[ 62.772445] ? kmem_cache_create_usercopy+0x15d/0x230
[ 62.780115] rvt_register_device+0x1f6/0x360 [rdmavt]
[ 62.787823] ? rvt_get_port_immutable+0x180/0x180 [rdmavt]
[ 62.796058] ? __get_txreq+0x400/0x400 [hfi1]
[ 62.802969] ? memcpy+0x34/0x50
[ 62.808611] hfi1_register_ib_device+0xde6/0xeb0 [hfi1]
[ 62.816601] ? hfi1_get_npkeys+0x10/0x10 [hfi1]
[ 62.823760] ? hfi1_init+0x89f/0x9a0 [hfi1]
[ 62.830469] ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1]
[ 62.838204] ? pcie_capability_clear_and_set_word+0xcd/0xe0
[ 62.846429] ? pcie_capability_read_word+0xd0/0xd0
[ 62.853791] ? hfi1_pcie_init+0x187/0x4b0 [hfi1]
[ 62.860958] init_one+0x67f/0xae0 [hfi1]
[ 62.867301] ? hfi1_init+0x9a0/0x9a0 [hfi1]
[ 62.873876] ? wait_woken+0x130/0x130
[ 62.879860] ? read_word_at_a_time+0xe/0x20
[ 62.886329] ? strscpy+0x14b/0x280
[ 62.891998] ? hfi1_init+0x9a0/0x9a0 [hfi1]
[ 62.898405] local_pci_probe+0x70/0xd0
[ 62.904295] ? pci_device_shutdown+0x90/0x90
[ 62.910833] work_for_cpu_fn+0x29/0x40
[ 62.916750] process_one_work+0x584/0x960
[ 62.922974] ? rcu_work_rcufn+0x40/0x40
[ 62.928991] ? __schedule+0x396/0xdc0
[ 62.934806] ? __sched_text_start+0x8/0x8
[ 62.941020] ? pick_next_task_fair+0x68b/0xc60
[ 62.947674] ? run_rebalance_domains+0x260/0x260
[ 62.954471] ? __list_add_valid+0x29/0xa0
[ 62.960607] ? move_linked_works+0x1c7/0x230
[ 62.967077] ?
trace_event_raw_event_workqueue_execute_start+0x140/0x140
[ 62.976248] ? mutex_lock+0xa6/0x100
[ 62.982029] ? __mutex_lock_slowpath+0x10/0x10
[ 62.988795] ? __switch_to+0x37a/0x710
[ 62.994731] worker_thread+0x62e/0x9d0
[ 63.000602] ? max_active_store+0xf0/0xf0
[ 63.006828] ? __switch_to_asm+0x40/0x70
[ 63.012932] ? __switch_to_asm+0x34/0x70
[ 63.019013] ? __switch_to_asm+0x40/0x70
[ 63.025042] ? __switch_to_asm+0x34/0x70
[ 63.031030] ? __switch_to_asm+0x40/0x70
[ 63.037006] ? __schedule+0x396/0xdc0
[ 63.042660] ? kmem_cache_alloc_trace+0xf3/0x1f0
[ 63.049323] ? kthread+0x59/0x1d0
[ 63.054594] ? ret_from_fork+0x35/0x40
[ 63.060257] ? __sched_text_start+0x8/0x8
[ 63.066212] ? schedule+0xcf/0x250
[ 63.071529] ? __wake_up_common+0x110/0x350
[ 63.077794] ? __schedule+0xdc0/0xdc0
[ 63.083348] ? wait_woken+0x130/0x130
[ 63.088963] ? finish_task_switch+0x1f1/0x520
[ 63.095258] ? kasan_unpoison_shadow+0x30/0x40
[ 63.101792] ? __init_waitqueue_head+0xa0/0xd0
[ 63.108183] ? replenish_dl_entity.cold.60+0x18/0x18
[ 63.115151] ? _raw_spin_lock_irqsave+0x25/0x50
[ 63.121754] ? max_active_store+0xf0/0xf0
[ 63.127753] kthread+0x1ae/0x1d0
[ 63.132894] ? kthread_bind+0x30/0x30
[ 63.138422] ret_from_fork+0x35/0x40

[ 63.146973] Allocated by task 14:
[ 63.152077] kasan_kmalloc+0xbf/0xe0
[ 63.157471] __kmalloc+0x110/0x240
[ 63.162804] init_cntrs+0x34d/0xdf0 [hfi1]
[ 63.168883] hfi1_init_dd+0x29a3/0x2f90 [hfi1]
[ 63.175244] init_one+0x551/0xae0 [hfi1]
[ 63.181065] local_pci_probe+0x70/0xd0
[ 63.186759] work_for_cpu_fn+0x29/0x40
[ 63.192310] process_one_work+0x584/0x960
[ 63.198163] worker_thread+0x62e/0x9d0
[ 63.203843] kthread+0x1ae/0x1d0
[ 63.208874] ret_from_fork+0x35/0x40

[ 63.217203] Freed by task 1:
[ 63.221844] __kasan_slab_free+0x12e/0x180
[ 63.227844] kfree+0x92/0x1a0
[ 63.232570] single_release+0x3a/0x60
[ 63.238024] __fput+0x1d9/0x480
[ 63.242911] task_work_run+0x139/0x190
[ 63.248440] exit_to_usermode_loop+0x191/0x1a0
[ 63.254814] do_syscall_64+0x301/0x330
[ 63.260283] entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 63.270199] The buggy address belongs to the object at
ffff88080e8d5500
which belongs to the cache kmalloc-4096 of size 4096
[ 63.287247] The buggy address is located 3632 bytes inside of
4096-byte region [ffff88080e8d5500, ffff88080e8d6500)
[ 63.303564] The buggy address belongs to the page:
[ 63.310447] page:ffffea00203a3400 count:1 mapcount:0
mapping:ffff88081380e840 index:0x0 compound_mapcount: 0
[ 63.323102] flags: 0x2fffff80008100(slab|head)
[ 63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001
ffff88081380e840
[ 63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff
0000000000000000
[ 63.350564] page dumped because: kasan: bad access detected

[ 63.361974] Memory state around the buggy address:
[ 63.369137] ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[ 63.379082] ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[ 63.389032] >ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc
fc fc fc
[ 63.398944] ^
[ 63.406141] ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[ 63.416109] ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[ 63.426099]
==================================================================

The trace happens because get_hw_stats() assumes there is room in the
memory allocated in init_cntrs() to accommodate the driver counters.
Unfortunately, that routine only allocated space for the device
counters.

Fix by insuring the allocation has room for the additional driver
counters.

Cc: <Stable@vger.kernel.org> # v4.14+
Fixes: b7481944b06e9 ("IB/hfi1: Show statistics counters under IB stats interface")
Reviewed-by: Mike Marciniczyn <mike.marciniszyn@intel.com>
Reviewed-by: Mike Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 508a523f 11-Oct-2018 Parav Pandit <parav@mellanox.com>

RDMA/drivers: Use core provided API for registering device attributes

Use rdma_set_device_sysfs_group() to register device attributes and
simplify the driver.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 116aa033 26-Sep-2018 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

IB/{hfi1, qib, rdmavt}: Move send completion logic to rdmavt

Moving send completion code into rdmavt in order to have shared logic
between qib and hfi1 drivers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 019f118b 26-Sep-2018 Brian Welty <brian.welty@intel.com>

IB/{hfi1, qib, rdmavt}: Move copy SGE logic into rdmavt

This patch moves hfi1_copy_sge() into rdmavt for sharing with qib.
This patch also moves all the wss_*() functions into rdmavt as
several wss_*() functions are called from hfi1_copy_sge()

When SGE copy mode is adaptive, cacheless copy may be done in some cases
for performance reasons. In those cases, X86 cacheless copy function
is called since the drivers that use rdmavt and may set SGE copy mode
to adaptive are X86 only. For this reason, this patch adds
"depends on X86_64" to rdmavt/Kconfig.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 5da0fc9d 28-Sep-2018 Dennis Dalessandro <dennis.dalessandro@intel.com>

IB/hfi1: Prepare resource waits for dual leg

Current implementation allows each qp to have only one send engine. As
such, each qp has only one list to queue prebuilt packets when send engine
resources are not available. To improve performance, it is desired to
support multiple send engines for each qp.

This patch creates the framework to support two send engines
(two legs) for each qp for the TID RDMA protocol, which can be easily
extended to support more send engines. It achieves the goal by creating a
leg specific struct, iowait_work in the iowait struct, to hold the
work_struct and the tx_list as well as a pointer to the parent iowait
struct.

The hfi1_pkt_state now has an additional field to record the current legs
work structure and that is now passed to all egress waiters to determine
the leg that needs to wait via a new iowait helper. The APIs are adjusted
to use the new leg specific struct as required.

Many new and modified helpers are added to support this change.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# d205a06a 26-Sep-2018 Kaike Wan <kaike.wan@intel.com>

IB/rdmavt: Rename check_send_wqe as setup_wqe

The driver-provided function check_send_wqe allows the hardware driver to
check and set up the incoming send wqe before it is inserted into the swqe
ring. This patch will rename it as setup_wqe to better reflect its
usage. In addition, this function is only called when all setup is
complete in rdmavt.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 0dbfaa9f 20-Sep-2018 Ira Weiny <ira.weiny@intel.com>

IB/hfi1: Fix SL array bounds check

The SL specified by a user needs to be a valid SL.

Add a range check to the user specified SL value which protects from
running off the end of the SL to SC table.

CC: stable@vger.kernel.org
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 522628ed 10-Jul-2018 Bart Van Assche <bvanassche@acm.org>

IB/hfi1: Suppress a compiler warning

Avoid that the following compiler warning is reported when building
with gcc 8:

drivers/infiniband/hw/hfi1/verbs.c:1896:2: warning: 'strncpy' output may be truncated copying 64 bytes from a string of length 64 [-Wstringop-truncation]

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 958200ad 04-Jul-2018 Jason Gunthorpe <jgg@ziepe.ca>

RDMA/hfi1: Move grh_required into update_sm_ah

grh_required is intended to be a global setting where all AV's will
require a GRH, not just the sm_lid. Move the special logic to the creation
of the SM AH.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>


# 2e2ba09e 04-Jun-2018 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/rdmavt, IB/hfi1: Create device dependent s_flags

Move some s_flags defines out of rdmavt and into hfi1 because they are
hfi1 specific and therefore should remain in the driver instead of
bubbling up to rdmavt.

Document device specific ranges in rdmavt and remap
those in hfi1.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 33023fb8 18-Jun-2018 Steve Wise <larrystevenwise@gmail.com>

IB/core: add max_send_sge and max_recv_sge attributes

This patch replaces the ib_device_attr.max_sge with max_send_sge and
max_recv_sge. It allows ulps to take advantage of devices that have very
different send and recv sge depths. For example cxgb4 has a max_recv_sge
of 4, yet a max_send_sge of 16. Splitting out these attributes allows
much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW
API. Consider a large RDMA WRITE that has 16 scattergather entries.
With max_sge of 4, the ulp would send 4 WRITE WRs, but with max_sge of
16, it can be done with 1 WRITE WR.

Acked-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 81cd3891 15-May-2018 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add support for 16B Management Packets

16B Management Packets (L4=0x08) replace the BTH and DETH
of normal MAD packet packets with a header containing the
the source and destination queue pair numbers; fields that
were originally retrieved from the BTH/DETH are now populated
from this header as well as from the 16B LRH (e.g. pkey).

16B Management Packets are used as an optimized management
format on 16B fabrics.

These management packets have an opcode of IB_OPCODE_UD_SEND_ONLY,
a fixed 3Byte pad, and a header length of 24Bytes.

The decision as to when we send a management packet is based
upon either the source or destination queue pair number being
0 or 1.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5d18ee67 02-May-2018 Sebastian Sanchez <sebastian.sanchez@intel.com>

IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support

Currently the driver doesn't support completion vectors. These
are used to indicate which sets of CQs should be grouped together
into the same vector. A vector is a CQ processing thread that
runs on a specific CPU.

If an application has several CQs bound to different completion
vectors, and each completion vector runs on different CPUs, then
the completion queue workload is balanced. This helps scale as more
nodes are used.

Implement CQ completion vector support using a global workqueue
where a CQ entry is queued to the CPU corresponding to the CQ's
completion vector. Since the workqueue is global, it's guaranteed
to always be there when queueing CQ entries; Therefore, the RCU
locking for cq->rdi->worker in the hot path is superfluous.

Each completion vector is assigned to a different CPU. The number of
completion vectors available is computed by taking the number of
online, physical CPUs from the local NUMA node and subtracting the
CPUs used for kernel receive queues and the general interrupt.
Special use cases:

* If there are no CPUs left for completion vectors, the same CPU
for the general interrupt is used; Therefore, there would only
be one completion vector available.

* For multi-HFI systems, the number of completion vectors available
for each device is the total number of completion vectors in
the local NUMA node divided by the number of devices in the same
NUMA node. If there's a division remainder, the first device to
get initialized gets an extra completion vector.

Upon a CQ creation, an invalid completion vector could be specified.
Handle it as follows:

* If the completion vector is less than 0, set it to 0.

* Set the completion vector to the result of the passed completion
vector moded with the number of device completion vectors
available.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a74d5307 02-May-2018 Mitko Haralanov <mitko.haralanov@intel.com>

IB/hfi1: Rework fault injection machinery

The packet fault injection code present in the HFI1 driver had some
issues which not only fragment the code but also created user
confusion. Furthermore, it suffered from the following issues:

1. The fault_packet method only worked for received packets. This
meant that the only fault injection mode available for sent
packets is fault_opcode, which did not allow for random packet
drops on all egressing packets.
2. The mask available for the fault_opcode mode did not really work
due to the fact that the opcode values are not bits in a bitmask but
rather sequential integer values. Creating a opcode/mask pair that
would successfully capture a set of packets was nearly impossible.
3. The code was fragmented and used too many debugfs entries to
operate and control. This was confusing to users.
4. It did not allow filtering fault injection on a per direction basis -
egress vs. ingress.

In order to improve or fix the above issues, the following changes have
been made:

1. The fault injection methods have been combined into a single fault
injection facility. As such, the fault injection has been plugged
into both the send and receive code paths. Regardless of method used
the fault injection will operate on both egress and ingress packets.
2. The type of fault injection - by packet or by opcode - is now controlled
by changing the boolean value of the file "opcode_mode". When the value
is set to True, fault injection is done by opcode. Otherwise, by
packet.
2. The masking ability has been removed in favor of a bitmap that holds
opcodes of interest (one bit per opcode, a total of 256 bits). This
works in tandem with the "opcode_mode" value. When the value of
"opcode_mode" is False, this bitmap is ignored. When the value is
True, the bitmap lists all opcodes to be considered for fault injection.
By default, the bitmap is empty. When the user wants to filter by opcode,
the user sets the corresponding bit in the bitmap by echo'ing the bit
position into the 'opcodes' file. This gets around the issue that the set
of opcodes does not lend itself to effective masks and allow for extremely
fine-grained filtering by opcode.
4. fault_packet and fault_opcode methods have been combined. Hence, there
is only one debugfs directory controlling the entire operation of the
fault injection machinery. This reduces the number of debugfs entries
and provides a more unified user experience.
5. A new control files - "direction" - is provided to allow the user to
control the direction of packets, which are subject to fault injection.
6. A new control file - "skip_usec" - is added that would allow the user
to specify a "timeout" during which no fault injection will occur.

In addition, the following bug fixes have been applied:

1. The fault injection code has been split into its own header and source
files. This was done to better organize the code and support conditional
compilation without littering the code with #ifdef's.
2. The method by which the TX PIO packets were being marked for drop
conflicted with the way send contexts were being setup. As a result,
the send context was repeatedly being reset.
3. The fault injection only makes sense when the user can control it
through the debugfs entries. However, a kernel configuration can
enable fault injection but keep fault injection debugfs entries
disabled. Therefore, it makes sense that the HFI fault injection
code depends on both.
4. Error suppression did not take into account the method by which PIO
packets were being dropped. Therefore, even with error suppression
turned on, errors would still be displayed to the screen. A larger
enough packet drop percentage would case the kernel to crash because
the driver would be stuck printing errors.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0ede73bc 19-Mar-2018 Matan Barak <matanb@mellanox.com>

IB/uverbs: Extend uverbs_ioctl header with driver_id

Extending uverbs_ioctl header with driver_id and another reserved
field. driver_id should be used in order to identify the driver.
Since every driver could have its own parsing tree, this is necessary
for strace support.
Downstream patches take off the EXPERIMENTAL flag from the ioctl() IB
support and thus we add some reserved fields for future usage.

Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 9636258f 01-Feb-2018 Mitko Haralanov <mitko.haralanov@intel.com>

IB/hfi1: Remove dependence on qp->s_hdrwords

The s_hdrwords variable was used to indicate whether a
packet was already built on a previous iteration of the
send engine. This variable assumed the protection of the
QP's RVT_S_BUSY flag, which was required since the the
QP's s_lock was dropped just prior to the packet being
queued on the one of the egress mechanisms.

Support for multiple send engine instantiations require
that the field not be used due to concurency issues.
The ps.txreq signals the "already built" without the
potential concurency issues.

Fix by getting rid of all s_hdrword usage. A wrapper
is added to test for the already built case that used to
use s_hdrwords.

What used to be stored in s_hdrwords is now in the txreq.
The PBC is not counted, but is added in the pio/sdma code
paths prior to posting the packet.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 06f2597f 18-Dec-2017 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/{rdmavt, hfi1, qib}: Remove get_card_name() downcall

rdmavt has a down call to client drivers to retrieve a crafted card
name.

This name should be the IB defined name.

Rather than craft the name each time it is needed, simply retrieve
the IB allocated name from the IB device.

Update the function name to reflect its application.

Clean up driver code to match this change.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5084c8ff 18-Dec-2017 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/{rdmavt, hfi1, qib}: Self determine driver name

Currently the HFI and QIB drivers allow the IB core to assign a unit
number to the driver name string.

If multiple devices exist in a system, there is a possibility that the
device unit number and the IB core number will be mismatched.

Fix by using the driver defined unit number to generate the device
name.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 69a3ffaa 14-Nov-2017 Jan Sokolowski <jan.sokolowski@intel.com>

IB/hfi1: Use 4096 for default active MTU in query_qp

Currently, if a port is queried that has an invalid
Maximum Transmission Unit, driver reports default MTU of 2048.
This in incorrect.

Use default value of 4096 if invalid.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 1b311f89 23-Oct-2017 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Add tx_opcode_stats like the opcode_stats

This patch adds tx_opcode_stats to parallel the
(rx)opcode_stats in the debugfs.

Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 8064135e 16-Oct-2017 Kees Cook <keescook@chromium.org>

IB/hfi1: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. Switches test of .data field to
.function, since .data will be going away.

Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# f8195f3b 09-Oct-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Eliminate allocation while atomic

The PIO trailing buffer was being dynamically allocated
but the kcalloc return value was not being checked. Further,
the GFP_KERNEL was being used even though the send engine
might be called with interrupts disabled.

Since the maximum size of the trailing buffer is only 12
bytes (CRC = 4, LT = 1, Pad = 0 to 7 bytes) just statically
allocate the buffer, remove the alloc entirely and share it
with the SDMA engine by making it global.

Reported-by: Leon Romanovsky <leon@kernel.org>
Fixes: 566d53a82644 ("IB/hfi1: Enhance PIO/SDMA send for 16B")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 6d945a84 11-Oct-2017 Bart Van Assche <bvanassche@acm.org>

IB/hfi1: Remove set-but-not-used variables

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 7221403d 04-Aug-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs

Enabling this bit helps core components query for extended address
support using the rdma_cap_opa_ah interface.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 566d53a8 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Enhance PIO/SDMA send for 16B

PIO/SDMA send logic now uses the hdr_type field to determine
the type of packet that has been constructed. Based on the hdr_type,
certain things such as PBC flags, padding count and the LT extra
trailing bytes are determined.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 51e658f5 04-Aug-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids

Increase lid used in hfi1 driver to 32 bits. qib continues
to use 16 bit lids.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d98bb7f7 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Determine 9B/16B L2 header type based on Address handle

When address handle attributes are initialized, the LIDs are
transformed to be in the 32 bit LID space.
When constructing the header, hfi1 driver will look at the LID
to determine the packet header to be created.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5786adf3 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add support to process 16B header errors

Enhance hdr_rcverr() to also handle errors during
16B bypass packet receive.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 30e07416 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add support to send 16B bypass packets

We introduce struct hfi1_opa_header as a union
of ib (9B) and 16B headers.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 72c07e2b 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add support to receive 16B bypass packets

We introduce a struct hfi1_16b_header to support 16B headers.
16B bypass packets are received by the driver and processed
similar to 9B packets. Add basic support to handle 16B packets.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 13c19222 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs

rvt_check_ah() delegates lid verification to underlying
driver. Underlying driver uses different conditions to
check for dlid depending on whether the device supports
extended LIDs

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 9abb0d1b 27-Jun-2017 Leon Romanovsky <leon@kernel.org>

RDMA: Simplify get firmware interface

There is a need to forward FW version to user space
application through RDMA netlink. In order to make it safe, there
is need to declare nla_policy and limit the size of FW string.

The new define IB_FW_VERSION_NAME_MAX will limit the size of
FW version string. That define was chosen to be equal to
ETHTOOL_FWVERS_LEN, because many drivers anyway are limited
by that value indirectly.

The introduction of this define allows us to remove the string size
from get_fw_str function signature.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>


# bf90aadd 24-Jul-2017 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/hfi1: Send MAD traps until repressed

A trap should be sent to the FM until the FM sends a repress message.
This is in line with the IBTA 13.4.9.

Add the ability to resend traps until a repress message is received.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Michael N. Henry <michael.n.henry@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# bcad2913 24-Jul-2017 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Serve the most starved iowait entry first

When an egress resource(SDMA descriptors, pio credits) is not available,
a sending thread will be put on the resource's wait queue. When the
resource becomes available again, up to a fixed number of sending threads
can be awakened sequentially and removed from the wait queue, depending
on the number of waiting threads and the number of free resources. Since
each awakened sending thread will send as many packets as possible, it
is highly likely that the first sending thread will consume all the
egress resources. Subsequently, it will be put back to the end of the wait
queue. Depending on the timing when the later sending threads wake up,
they may not be able to send any packet and be again put back to the end
of the wait queue sequentially, right behind the first sending thread.
This starvation cycle continues until some sending threads exceed their
retry limit and consequently fail.

This patch fixes the issue by two simple approaches:
(1) Any starved sending thread will be put to the head of the wait queue
while a served sending thread will be put to the tail;
(2) The most starved sending thread will be served first.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 8e959601 30-Jun-2017 Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev

IPOIB is calling free_rdma_netdev even though alloc_rdma_netdev has
returned -EOPNOTSUPP.
Move free_rdma_netdev from ib_device structure to rdma_netdev structure
thus ensuring proper cleanup function is called for the rdma net device.

Fix the following trace:

ib0: Failed to modify QP to ERROR state
BUG: unable to handle kernel paging request at 0000000000001d20
IP: hfi1_vnic_free_rn+0x26/0xb0 [hfi1]
Call Trace:
ipoib_remove_one+0xbe/0x160 [ib_ipoib]
ib_unregister_device+0xd0/0x170 [ib_core]
rvt_unregister_device+0x29/0x90 [rdmavt]
hfi1_unregister_ib_device+0x1a/0x100 [hfi1]
remove_one+0x4b/0x220 [hfi1]
pci_device_remove+0x39/0xc0
device_release_driver_internal+0x141/0x200
driver_detach+0x3f/0x80
bus_remove_driver+0x55/0xd0
driver_unregister+0x2c/0x50
pci_unregister_driver+0x2a/0xa0
hfi1_mod_cleanup+0x10/0xf65 [hfi1]
SyS_delete_module+0x171/0x250
do_syscall_64+0x67/0x150
entry_SYSCALL64_slow_path+0x25/0x25

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# bec7c79c 29-May-2017 Byczkowski, Jakub <jakub.byczkowski@intel.com>

IB/hfi1: Modify handling of physical link state by Host Driver

Ensure states returned to the Fabric Manager are consistent with
the OPA specification by caching the physical state along with the
logical state.

Reviewed-by: Stuart Summers <john.s.summers@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Andrzej Kotlowski <andrzej.kotlowski@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# cb49366f 01-Jun-2017 Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com>

IB/core,rdmavt,hfi1,opa-vnic: Send OPA cap_mask3 in trap

Provide the ability for IB clients to modify the OPA specific
capability mask and include this mask in the subsequent trap data.

Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Michael N. Henry <michael.n.henry@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 9039746c 12-May-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Setup common IB fields in hfi1_packet struct

We move many common IB fields into the hfi1_packet structure and
set them up in a single function. This allows us to set the fields
in a single place and not deal with them throughout the driver.

Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 228d2af1 12-May-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Separate input/output header tracing

Calls to trace incoming packets will now receive the packet
context as parameter. This enables trace support for future
packet types.

Header trace output is in the format <field>:<value>
which makes parsing easier.

input_ibhdr trace before change:
<idle>-0 [001] d.h. 5904.250925: input_ibhdr: [0000:05:00.0] vl 0
lver 0 sl 0 lnh 2,LRH_BTH dlid 0002 len 18 slid 0001 op
0x64,UD_SEND_ONLY se 0 m 0 pad 0 tver 0 pkey 0xffff f 0 b 0 qpn 0x000001
a 0 psn 0x000001b2 deth qkey 0x80010000 sqpn 0x000001

input_ibhdr trace after change:
<idle>-0 [001] d.h. 6655.714488: input_ibhdr: [0000:05:00.0] (IB)
len:124 sc:0 dlid:0x0001 slid:0x0002 lnh:2,LRH_BTH lver:0 sl:0 age:0
becn:0 fecn:0 l4:0 rc:0 entropy:0 op:0x64,UD_SEND_ONLY se:0 m:0 pad:0
tver:0 pkey:0x7fff f:0 b:0 qpn:0x000001 a:0 psn:0x00000036 hlen:8 deth
qkey:0x80010000 sqpn:0x000001

Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 7dafbab3 12-May-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add functions to parse BTH/IB headers

Improve code readablity by adding inline functions
to read specific BTH/IB fields without knowledge of
byte offsets.

Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 44c58487 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Define 'ib' and 'roce' rdma_ah_attr types

rdma_ah_attr can now be either ib or roce allowing
core components to use one type or the other and also
to define attributes unique to a specific type. struct
ib_ah is also initialized with the type when its first
created. This ensures that calls such as modify_ah
dont modify the type of the address handle attribute.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d8966fcd 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Use rdma_ah_attr accessor functions

Modify core and driver components to use accessor functions
introduced to access individual fields of rdma_ah_attr

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0a18cfe4 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Rename ib_create_ah to rdma_create_ah

Rename ib_create_ah to rdma_create_ah so its in sync with the
rename of the ib address handle attribute

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 90898850 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Rename struct ib_ah_attr to rdma_ah_attr

This patch simply renames struct ib_ah_attr to
rdma_ah_attr as these fields specify attributes that are
not necessarily specific to IB.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b6eac931 09-Apr-2017 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Prevent kernel QP post send hard lockups

The driver progress routines can call cond_resched() when
a timeslice is exhausted and irqs are enabled.

If the ULP had been holding a spin lock without disabling irqs and
the post send directly called the progress routine, the cond_resched()
could yield allowing another thread from the same ULP to deadlock
on that same lock.

Correct by replacing the current hfi1_do_send() calldown with a unique
one for post send and adding an argument to hfi1_do_send() to indicate
that the send engine is running in a thread. If the routine is not
running in a thread, avoid calling cond_resched().

CC: <stable@vger.kernel.org> # 4.7.x-
Fixes: Commit 831464ce4b74 ("IB/hfi1: Don't call cond_resched in atomic mode when sending packets")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# cb427057 09-Apr-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add functions to parse 9B headers

These inline functions improve code readability by
enabling callers to read specific fields from the
header without knowledge of byte offsets.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# aad559c2 09-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/hfi1: Rename hdr2sc to hfi1_9B_get_sc5

The function really returned the 5-bit sc value from
the header and rhf. hdr2sc didn't quite describe what it did.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# aad9ff97 09-Apr-2017 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/rdmavt/hfi1/qib: Use the MGID and MLID for multicast addressing

The Infiniband spec defines "A multicast address is defined by a
MGID and a MLID" (section 10.5).

The current code only uses the MGID for identifying multicast groups.
Update the driver to be compliant with this definition.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 2280740f 12-Apr-2017 Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com>

IB/hfi1: Virtual Network Interface Controller (VNIC) HW support

HFI1 HW specific support for VNIC functionality.
Dynamically allocate a set of contexts for VNIC when the first vnic
port is instantiated. Allocate VNIC contexts from user contexts pool
and return them back to the same pool while freeing up. Set aside
enough MSI-X interrupts for VNIC contexts and assign them when the
contexts are allocated. On the receive side, use an RSM rule to
spread TCP/UDP streams among VNIC contexts.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 243d9f43 20-Mar-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add transmit fault injection feature

Add ability to fault packets on transmit by opcode.
Dropping by packet can be achieved by setting the mask to 0.

In order to drop non-verbs traffic we set PbcInsertHrc
to NONE (0x2). The packet will still be delivered to
the receiving node but a KHdrHCRCErr (KDETH packet
with a bad HCRC) will be triggered and the packet will
not be delivered to the correct context.

In order to drop regular verbs traffic we set the
PbcTestEbp flag. The packet will still be delivered
to the receiving node but a 'late ebp error' will
be triggered and will be dropped.

A global toggle (/sys/kernel/debug/hfi1/hfi1_X/fault_suppress_err)
has been added to suppress the error messages on the receive
node when a packet was faulted on the sending node.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0181ce31 20-Mar-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add receive fault injection feature

Add fault injection capability:
- Drop packets unconditionally (fault_by_packet)
- Drop packets based on opcode (fault_by_opcode)

This feature reacts to the global FAULT_INJECTION
config flag.

The faulting traces have been added:
- misc/fault_opcode
- misc/fault_packet

See 'Documentation/fault-injection/fault-injection.txt'
for details.

Examples:
- Dropping packets by opcode:
/sys/kernel/debug/hfi1/hfi1_X/fault_opcode
# Enable fault
echo Y > fault_by_opcode
# Setprobability of dropping (0-100%)
# echo 25 > probability
# Set opcode
echo 0x64 > opcode
# Number of times to fault
echo 3 > times
# An optional mask allows you to fault
# a range of opcodes
echo 0xf0 > mask
/sys/kernel/debug/hfi1/hfi1_X/fault_stats
contains a value in parentheses to indicate
number of each opcode dropped.

- Dropping packets unconditionally
/sys/kernel/debug/hfi1/hfi1_X/fault_packet
# Enable fault
echo Y > fault_by_packet
/sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
contains the number of packets dropped.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5e6e9424 20-Mar-2017 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/hfi1: Add a patch value to the firmware version string

The HFI firmware now includes a patch level in its version.
Updating the necessary code to include the patch version in the
firmware string.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 62eed66e 20-Mar-2017 Tadeusz Struk <tadeusz.struk@intel.com>

IB/hfi1: Protect the global dev_cntr_names and port_cntr_names

Protect the global dev_cntr_names and port_cntr_names with the global
mutex as they are allocated and freed in a function called per device.
Otherwise there is a danger of double free and memory leaks.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 43a474aa 20-Mar-2017 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependent

The work to create a completion helper moved the translation of send
wqe operations to completion opcodes to rdmvat.

This precludes having driver dependent operations. Make the translation
driver dependent by doing the translation in the driver prior to the
rvt_qp_swqe_complete() call using restored translation tables.

Fixes: Commit f2dc9cdce83c ("IB/rdmavt: Add a send completion helper")
Fixes: Commit 0771da5a6e9d ("IB/hfi1,IB/qib: Use new send completion helper")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 64b2ae74 14-Feb-2017 Arnd Bergmann <arnd@arndb.de>

IB/hfi1: use size_t for passing array length

gcc-7 produces a mysterious warning about the size argument being potentially out
of range:

drivers/infiniband/hw/hfi1/verbs.c: In function 'init_cntr_names':
drivers/infiniband/hw/hfi1/verbs.c:1644:2: error: 'memcpy': specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Werror=stringop-overflow=]

This seems to refer to a the case where an 64-bit size_t gets truncated
into a negative 'int' and subsequently turned into a high 64-bit number
again.

The fix is clearly to use size_t here, which matches the type that gets
used for this value elsewhere.

Fixes: b7481944b06e ("IB/hfi1: Show statistics counters under IB stats interface")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 1198fcea 08-Feb-2017 Brian Welty <brian.welty@intel.com>

IB/hfi1, rdmavt: Move SGE state helper routines into rdmavt

To improve code reuse, add small SGE state helper routines to rdmavt_mr.h.
Leverage these in hfi1, including refactoring of hfi1_copy_sge.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0128fcea 08-Feb-2017 Brian Welty <brian.welty@intel.com>

IB/hfi1, rdmavt: Update copy_sge to use boolean arguments

Convert copy_sge and related SGE state functions to use boolean.
For determining if QP is in user mode, add helper function in rdmavt_qp.h.
This is used to determine if QP needs the last byte ordering.
While here, change rvt_pd.user to a boolean.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 56acbbfb 08-Feb-2017 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

IB/hfi1: Use new rdmavt timers

Reduce hfi1 code footprint by using the rdmavt timers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# f3e862cb 08-Feb-2017 Sebastian Sanchez <sebastian.sanchez@intel.com>

IB/hfi1: Access hfi1_ibport through rcd pointer

Receive code paths use the QP's device and port
number to access the struct hfi1_ibport. When an
instance of struct hfi1_ctxtdata is present, it can
be used to access struct hfi1_ibport through a pointer.
This makes struct hfi1_ibport lookup time faster as an
array doesn't have to be indexed and access fields in
other cache-lines.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# c4550c63 24-Jan-2017 Or Gerlitz <ogerlitz@mellanox.com>

IB: Query ports via the core instead of direct into the driver

Change the drivers to call ib_query_port in their get port
immutable handler instead of their own query port handler.

Doing this required to set the core cap flags of this device
before the ib_query_port call is made, since the IB core might
need these caps to serve the port query.

Drivers are ensured by the IB core that the port attributes passed
to the port query verb implementation are zero, and hence we
removed the zeroing from the drivers.

This patch doesn't add any new functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 3067771c 20-Jan-2017 Bart Van Assche <bvanassche@acm.org>

IB/hfi1: Switch from dma_device to dev.parent

Prepare for removal of ib_device.dma_device.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0771da5a 07-Dec-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1,IB/qib: Use new send completion helper

Convert cq completion returns in both rdmavt drivers
to use the new helper.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b777f154 07-Dec-2016 Mitko Haralanov <mitko.haralanov@intel.com>

IB/hfi1: Remove usage of qp->s_cur_sge

The s_cur_sge field in the qp structure holds a pointer to the
SGE of the currently processed WQE. It assumes the protection
of the RVT_S_BUSY flag to prevent the changing of this field
while the send engine is using it. This scheme works as long
as there is only one instance of the send engine running at a
time.

Scaling of the send engine to multiple cores would break this
assumption as there could be multiple instances of the send engine
running on different CPUs. This opens a window where the QP's
RVT_S_BUSY flag is not set but the send engine is still running.

To prevent accidental changing of the s_cur_sge pointer, the QP's
dependence on it is removed. The SGE pointer is now stored in the
verbs_txreq, which is a per-packet data structure. This ensures
that each individual packet has it's own pointer, which is setup
while the RVT_S_BUSY flag is set.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# e922ae06 07-Dec-2016 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Remove dependence on qp->s_cur_size

The qp->s_cur_size field assumes that the S_BUSY bit protects
the field from modification after the slock is dropped. Scaling the
send engine to multiple cores would break that assumption.

Correct the issue by carrying the payload size in the txreq structure.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b7481944 07-Dec-2016 Jianxin Xiong <jianxin.xiong@intel.com>

IB/hfi1: Show statistics counters under IB stats interface

Previously tools like hfi1stats had to access these counters through
debugfs, which often caused permission issue for non-root users. It is
not always acceptable to change the debugfs mounting permission due
to security concerns. When exposed under the IB stats interface, the
counters are universally readable by default.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a6cd5f08 17-Oct-2016 Jakub Pawlak <jakub.pawlak@intel.com>

IB/hfi1: Unify access to GUID entries

This patch consolidates the node GUIDs and the port GUID handling
and unifies access to these items. The knowledge of hfi1 GUIDs'
design and their location are kept in accessors to centralize access.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 4e045572 10-Oct-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Add unique txwait_lock for txreq events

Profiling suggests that the read_seqbegin() in
the txreq put logic is colliding with other uses
of the iowait lock.

The packet at a time use of this lock dictates a unique
lock to avoid reader/writer collisions when the number
of vTxWait events is low.

In order to support a unique lock the iowait struct embedded
in the QP is extended to remember the lock that protects the queue
head.

The QP destroy removes that QP from any wait list. It doesn't
need to know the head because of the linked list API, but it does
need to know the lock required to protect the head.

This also opens up the wait logic to have unique per resources locks
which needs to be in future refinement.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# bd99fdea 25-Aug-2016 Yuval Shaia <yuval.shaia@oracle.com>

IB/{core,hw}: Add constant for node_desc

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 37aab620 30-Sep-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Fix trace of atomic ack

The length is incorrect, causing the trace data to
be truncated.

Add the additional 8 bytes that should have been there.
Also trace out the atomic ack in hex to aid debugging.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# f6aa7835 25-Sep-2016 Jianxin Xiong <jianxin.xiong@intel.com>

IB/hfi1: Increase default settings of max_cqes and max_qps

The ib_write_bw test allows using up to 16384 QPs. When a relatively
large number of QPs (within that range) is used, the test can fail
because the number of CQ entries needed exceeds the limit set by the
driver.

This patch increases the default setting of max_cqes from 0x2FFFF
(196607) to 0x2FFFFF(3145727), which is sufficient to cover the
maximum number needed by the ib_write_bw test (2097152). The default
setting of max_qps is also increased from 16384 to 32768 to allow
the test to run successfully with 16383 or 16384 QPs.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b374e060 25-Sep-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Consolidate pio control masks into single definition

This allows for adding additional pages of adaptive pio
opcode control including manufacturer specific ones.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 261a4351 06-Sep-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/qib,IB/hfi: Use core common header file

Use common header file structs, defines, and accessors
in the drivers. The old declarations are removed.

The repositioning of the includes allows for the removal
of hfi1_message_header and replaces its use with ib_header.

Also corrected are two issues with set_armed_to_active():
- The "packet" parameter is now a pointer as it should have been
- The etype is validated to insure that the header is correct

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 4d6f85c3 06-Sep-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/rdmavt, IB/qib, IB/hfi1: Use new QP put get routines

This improves readability and hides the reference count
mechanism from the client drivers.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b736a469 25-Jul-2016 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/hfi1: Use hdr2sc function to calculate 5-bit SC

The interface is used to compute the 5-bit SC field from the
LRH and the RHF bits. Modify code to use the interface instead.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d4d602e9 25-Jul-2016 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Rename hfi1_pio_header to hfi1_sdma_header.

hfi1_pio_header should really be called hfi1_sdma_header
as it is only used for sdma transmits.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a9b6b3bc 25-Jul-2016 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/hfi1: Rename struct ahg_ib_header to struct hfi1_ahg_info

struct ahg_ib_header has no header specific information.
Rename it to struct hfi1_ahg_info

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# c72cfe3e 25-Jul-2016 Jianxin Xiong <jianxin.xiong@intel.com>

IB/hfi1: Add support for extended memory management

Advertise and add the capability of handing all aspects of IBTA extended
memory management support in post send.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0db3dfa0 25-Jul-2016 Jianxin Xiong <jianxin.xiong@intel.com>

IB/hfi1: Work request processing for fast register mr and invalidate

In order to support extended memory management support, add send side
processing of work requests of type IB_WR_REG_MR, IB_WR_LOCAL_INV, and
IB_WR_SEND_WITH_INV. The first two are local operations and are supported
for both RC and UC. Send with invalidate is only supported for RC because
the corresponding IB opcodes are not defined for UC.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a2df0c83 25-Jul-2016 Jianxin Xiong <jianxin.xiong@intel.com>

IB/hfi1: Handle send with invalidate opcode in the RC recv path

As part of enabling extended memory management support, add the processing
of the RC send with invalidate.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 1ac57c50 01-Jul-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Add hfi1 post send tables

Add initial table for table driven post_send support.

Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 71e68e3d 01-Jul-2016 Jakub Pawlak <jakub.pawlak@intel.com>

IB/hfi1: Correct receive packet handler assignment

Prevent processing receive packet in case when opcode is
accepted by QP but handler for this type of packet is not
defined.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 939b6ca8 15-Jun-2016 Ira Weiny <ira.weiny@intel.com>

IB/hfi1: Add device FW version string

Export the firmware version through the core.

Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# bdd8a98c 24-May-2016 Jianxin Xiong <jianxin.xiong@intel.com>

IB/hfi1: Add tracing support for send with invalidate opcode

Enable trace generation for packets with the "Send Last with
Invalidate" and "Send Only with Invalidate" opcodes.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# f48ad614 19-May-2016 Dennis Dalessandro <dennis.dalessandro@intel.com>

IB/hfi1: Move driver out of staging

The TODO list for the hfi1 driver was completed during 4.6. In addition
other objections raised (which are far beyond what was in the TODO list)
have been addressed as well. It is now time to remove the driver from
staging and into the drivers/infiniband sub-tree.

Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>