History log of /linux-master/drivers/infiniband/hw/hfi1/verbs.h
Revision Date Author Comments
# 3ec648c6 23-Aug-2023 Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

IB: Use capital "OR" for multiple licenses in SPDX

Documentation/process/license-rules.rst and checkpatch expect the SPDX
identifier syntax for multiple licenses to use capital "OR". Correct it
to keep consistent format and avoid copy-paste issues.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230823092912.122674-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 0227f4d0 11-Sep-2022 Gaosheng Cui <cuigaosheng1@huawei.com>

IB/hfi1: remove rc_only_opcode and uc_only_opcode declarations

rc_only_opcode and uc_only_opcode have been removed since
commit b374e060cc2a ("IB/hfi1: Consolidate pio control masks
into single definition"), so remove them.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://lore.kernel.org/r/20220911092325.3216513-1-cuigaosheng1@huawei.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 145eba1a 22-Aug-2021 Cai Huoqing <caihuoqing@baidu.com>

RDMA/hfi1: Convert to SPDX identifier

use SPDX-License-Identifier instead of a verbose license text

Link: https://lore.kernel.org/r/20210823042622.109-1-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 1fb7f897 01-Mar-2021 Mark Bloch <mbloch@nvidia.com>

RDMA: Support more than 255 rdma ports

Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs. HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.

While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# c593642c 09-Dec-2019 Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>

treewide: Use sizeof_field() macro

Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().

This patch is generated using following script:

EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do

if [[ "$file" =~ $EXCLUDE_FILES ]]; then
continue
fi
sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net


# e26e7b88 29-Oct-2019 Leon Romanovsky <leon@kernel.org>

RDMA: Change MAD processing function to remove extra casting and parameter

All users of process_mad() converts input pointers from ib_mad_hdr to be
ib_mad, update the function declaration to use ib_mad directly.

Also remove not used input MAD size parameter.

Link: https://lore.kernel.org/r/20191029062745.7932-17-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-By: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 4bb02e95 13-Jun-2019 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Use aborts to trigger RC throttling

SDMA and pio flushes will cause a lot of packets to be transmitted
after a link has gone down, using a lot of CPU to retransmit
packets.

Fix for RC QPs by recognizing the flush status and:
- Forcing a timer start
- Putting the QP into a "send one" mode

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 270a9833 26-Feb-2019 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Add running average for adaptive pio

The adaptive PIO implementation only considers the current packet size
when deciding between SDMA and pio for a packet.

This causes credit return forces if small and large packets are
interleaved.

Add a running average to avoid costly credit forces so that a large
sequence of small packets is required to go below the threshold that
chooses pio.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 3c6cb20a 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs

This patch integrates TID RDMA WRITE protocol into normal RDMA verbs
framework. The TID RDMA WRITE protocol is an end-to-end protocol
between the hfi1 drivers on two OPA nodes that converts a qualified
RDMA WRITE request into a TID RDMA WRITE request to avoid data copying
on the responder side.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 572f0c33 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add the dual leg code

The "Second Leg" of the TID RDMA WRITE protocol deals with
the transfer of data and ack packets, which are in the KDETH
PSN space, as opposed to the IB PSN space.

Therefore, the Second Leg could be considered as a separate
state machine. As such, it is handled by a different work
queue item which is scheduled along with the normal IB state
machine work item.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 70dcb2e3 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add the TID second leg send packet builder

To improve performance, the TID RDMA WRITE protocol is designed to
own a second leg to send data and ack packets in the KDETH PSN space.
This patch adds the packet builder for the requester side, which
contains the state machine to build TID RDMA WRITE DATA and TID
RDMA RESYNC packet.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 6e38fca6 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Resend the TID RDMA WRITE DATA packets

This patch adds the logic to resend TID RDMA WRITE DATA packets.
The tracking indices will be reset properly so that the correct
TID entries will be used.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 829eaee5 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add TID RDMA retry timer

This patch adds the TID RDMA retry timer to make sure that TID RDMA
WRITE DATA packets for a segment are received successfully by the
responder. This timer is generally armed when the last TID RDMA
WRITE DATA packet for a segment is sent out and stopped when all
TID RDMA DATA packets are acknowledged.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 9e93e967 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add a function to receive TID RDMA ACK packet

This patch adds a function to receive TID RDMA ACK packet, which could
be an acknowledge to either a TID RDMA WRITE DATA packet or an TID
RDMA RESYNC packet. For an ACK to TID RDMA WRITE DATA packet, the
request segments are completed appropriately. For an ACK to a TID
RDMA RESYNC packet, any pending segment flow information is updated
accordingly.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0f75e325 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add a function to build TID RDMA ACK packet

This patch adds a function to build TID RDMA ACJ packet, which is also
in the KDETH PSN space for packet ordering. This packet is used to
acknowledge the receiving of all the TID RDMA WRITE DATA packets
before the given KDETH PSN. Similar to RC ACK packets, TID RDMA ACK
packets could also be coalesced.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d72fe7d5 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add a function to receive TID RDMA WRITE DATA packet

This patch adds a function to receive TID RDMA WRITE DATA packet,
which is in the KDETH PSN space in packet ordering. Due to the use
of header suppression, software is generally only notified when
the last data packet for a segment is received. This patch also
adds code to handle KDETH EFLAGS errors for ingress TID RDMA WRITE
DATA packets.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 72a0ea99 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add a function to receive TID RDMA WRITE response

This patch adds a function to receive TID RDMA WRITE response.
The TID entries will be stored for encoding TID RDMA WRITE DATA
packet later.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 3c759e00 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add TID resource timer

This patch adds the TID resource timer, which is used by the responder
to free any TID resources that are allocated for TID RDMA WRITE request
and not returned by the requester after a reasonable time.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 38d46d36 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add a function to build TID RDMA WRITE response

This patch adds the function to build TID RDMA WRITE response. The
main role of the TID RDMA WRITE RESP packet is to send TID entries
to the requester so that they can be used to encode TID RDMA WRITE
DATA packet.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 07b92370 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add functions to receive TID RDMA WRITE request

This patch adds the functions to receive TID RDMA WRITE request. The
request will be stored in the QP's s_ack_queue. This patch also adds
code to handle duplicate TID RDMA WRITE request and a function to
allocate TID resources for data receiving on the responder side.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a0b34f75 24-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add interlock between a TID RDMA request and other requests

This locking mechanism is designed to provent vavious memory corruption
scenarios from occurring when requests are pipelined, especially when
RDMA READ/WRITE requests are interleaved with TID RDMA READ/WRITE
requests:
1. READ-AFTER-READ;
2. READ-AFTER-WRITE;
3. WRITE-AFTER-READ;
When memory corruption is likely, a request will be held back until
previous requests have been completed.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 22d136d7 24-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add TID RDMA handlers

This commit adds the TID RDMA READ pointers to the receiving opcode
handlers. It also adds TID RDMA READ header sizes to header size table.
A function to print the RHF EFLAGS errors is created so that it can be
shared by both IB and TID RDMA receiving functions.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 9905bf06 05-Feb-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add functions to receive TID RDMA READ response

This patch adds the functions to receive TID RDMA READ response. The TID
resource information in the KDETH packet header will direct the hardware
to deliver the packet payload to the user buffer automatically and the
software will handle the packet header for the last packet of a segment
as all other packet headers are suppressed by default. The TID entries
will be freed when all packets for a segment have been received. This
patch also adds the functions to handle KDETH eflag errors, including
flow sequence and generation errors, when a TID RDMA READ response
packet is received . The flow sequence error can be recovered by software
checking of the flow sequence and will disappear when the hardware flow
is programmed with a new generation number.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d0d564a1 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add functions to receive TID RDMA READ request

This patch adds the functions to receive TID RDMA READ request. The TID
resource information will be stored and tracked. Duplicate request
will also be handled properly.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 742a3826 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add functions to build TID RDMA READ request

This patch adds the helper functions to build the TID RDMA READ request
on the requester side. The key is to allocate TID resources (TID flow
and TID entries) and send the resource information to the responder side
along with the read request. Since the TID resources are limited, each
TID RDMA READ request has to be split into segments with a default
segment size of 256K. A software flow is allocated to track the data
transaction for each segment. The work request opcode, packet opcode, and
packet formats for TID RDMA READ protocol are also defined in this patch.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 2f16a696 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add the counter n_tidwait

This patch adds the counter n_tidwait to count the number of times the
TID resource allocator has to wait for TID resources.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 838b6fd2 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: TID RDMA RcvArray programming and TID allocation

TID entries are used by hfi1 hardware to receive data payload from
incoming packets directly into a user buffer and thus avoid data copying
by software. This patch implements the functions for TID allocation,
freeing, and programming TID RcvArray entries in hardware for kernel
clients. TID entries are managed via lists of TID groups similar to PSM.
Furthermore, to track TID resource allocation for each request, software
flows are also allocated and freed as needed. Since software flows
consume large amount of memory for tracking TID allocation and freeing,
it is generally desirable to allocate them dynamically in the send queue
and only for TID RDMA requests, but pre-allocate them for receive queue
because the send queue could have thousands of entries while the receive
queue has only a limited number of entries.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 37356e78 05-Feb-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: TID RDMA flow allocation

The hfi1 hardware flow is a hardware flow-control mechanism for a KDETH
data packet that is received on a hfi1 port. It validates the packet by
checking both the generation and sequence. Each QP that uses the TID RDMA
mechanism will allocate a hardware flow from its receiving context for
any incoming KDETH data packets.

This patch implements:
(1) a function to allocate hardware flow
(2) a function to free hardware flow
(3) a function to initialize hardware flow generation for a receiving
context
(4) a wait mechanism if the hardware flow is not available
(4) a function to remove the qp from the wait queue for hardware flow
when the qp is reset or destroyed.

Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d22a207d 23-Jan-2019 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Add OPFN helper functions for TID RDMA feature

This patch adds the OPFN helper functions to initialize, encode, decode,
and reset OPFN parameters for the TID RDMA feature.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 44e43d91 24-Jan-2019 Mitko Haralanov <mitko.haralanov@intel.com>

IB/hfi1: OPFN support discovery

OPFN (Omni Path Feature Negotiation) support discovery allows a RC QP to
announce that it supports OPFN and also discover if OPFN is supported by
the peer QP. OPFN parameter negotiation is skipped unless OPFN support is
first discovered. OPFN support is announced by claiming what was
the reserved bit in dword 1 of OmniPath modified base transport header
in requests and responses.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5190f052 28-Nov-2018 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Allow the driver to initialize QP priv struct

This patch adds an interface to allow the driver to initialize the QP priv
struct when the QP is created and after the qpn has been assigned. A
field is added to the QP priv struct to reference the rcd and two new
files are added to contain the function to initialize the rcd field so
that more TID RDMA related code can be added here later.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 116aa033 26-Sep-2018 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

IB/{hfi1, qib, rdmavt}: Move send completion logic to rdmavt

Moving send completion code into rdmavt in order to have shared logic
between qib and hfi1 drivers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 019f118b 26-Sep-2018 Brian Welty <brian.welty@intel.com>

IB/{hfi1, qib, rdmavt}: Move copy SGE logic into rdmavt

This patch moves hfi1_copy_sge() into rdmavt for sharing with qib.
This patch also moves all the wss_*() functions into rdmavt as
several wss_*() functions are called from hfi1_copy_sge()

When SGE copy mode is adaptive, cacheless copy may be done in some cases
for performance reasons. In those cases, X86 cacheless copy function
is called since the drivers that use rdmavt and may set SGE copy mode
to adaptive are X86 only. For this reason, this patch adds
"depends on X86_64" to rdmavt/Kconfig.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 5da0fc9d 28-Sep-2018 Dennis Dalessandro <dennis.dalessandro@intel.com>

IB/hfi1: Prepare resource waits for dual leg

Current implementation allows each qp to have only one send engine. As
such, each qp has only one list to queue prebuilt packets when send engine
resources are not available. To improve performance, it is desired to
support multiple send engines for each qp.

This patch creates the framework to support two send engines
(two legs) for each qp for the TID RDMA protocol, which can be easily
extended to support more send engines. It achieves the goal by creating a
leg specific struct, iowait_work in the iowait struct, to hold the
work_struct and the tx_list as well as a pointer to the parent iowait
struct.

The hfi1_pkt_state now has an additional field to record the current legs
work structure and that is now passed to all egress waiters to determine
the leg that needs to wait via a new iowait helper. The APIs are adjusted
to use the new leg specific struct as required.

Many new and modified helpers are added to support this change.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# d205a06a 26-Sep-2018 Kaike Wan <kaike.wan@intel.com>

IB/rdmavt: Rename check_send_wqe as setup_wqe

The driver-provided function check_send_wqe allows the hardware driver to
check and set up the incoming send wqe before it is inserted into the swqe
ring. This patch will rename it as setup_wqe to better reflect its
usage. In addition, this function is only called when all setup is
complete in rdmavt.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 0b79b277 10-Sep-2018 Michael J. Ruhl <michael.j.ruhl@intel.com>

IB/{hfi1, qib, rdmavt}: Schedule multi RC/UC packets instead of posting

The post_send() path determines if it should post directly or, schedule
the post for later. The current logic is:

if the swqe ring is empty or (for hfi1) wqe->length <= piothreshold
post the send
else
schedule

This can allow large requests to call the send engine directly. Large
requests can potentially produce a large number of packets prior to
returning to the caller, blocking the caller from posting more requests,
and allowing better parallel processing.

Allow the driver(s) more say in this logic (pass call_send to the driver,
rather than examining a return value).

Update hfi1/qib logic to schedule the send engine if an RC or UC message
is larger than the QP MTU size.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 4171a693 15-May-2018 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Define 16B Management Packets

Add 16B Management Packet definition. This optimized packet
format replaces the ib_other_headers and BTH with a source
and destination QP number.

To support these packets we introduce struct opa_16b_mgmt
into the struct hfi1_16b_header.

This packet format is only used for MAD packets using the
IB_OPCODE_UD_SEND_ONLY opcode on QP0/1.

The original 16B implementation failed to use 16B management
packets so now we add their definition.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 832369fa 02-May-2018 Brian Welty <brian.welty@intel.com>

IB/{hfi1, qib, rdmavt}: Move logic to allocate receive WQE into rdmavt

Moving receive-side WQE allocation logic into rdmavt will allow
further code reuse between qib and hfi1 drivers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a74d5307 02-May-2018 Mitko Haralanov <mitko.haralanov@intel.com>

IB/hfi1: Rework fault injection machinery

The packet fault injection code present in the HFI1 driver had some
issues which not only fragment the code but also created user
confusion. Furthermore, it suffered from the following issues:

1. The fault_packet method only worked for received packets. This
meant that the only fault injection mode available for sent
packets is fault_opcode, which did not allow for random packet
drops on all egressing packets.
2. The mask available for the fault_opcode mode did not really work
due to the fact that the opcode values are not bits in a bitmask but
rather sequential integer values. Creating a opcode/mask pair that
would successfully capture a set of packets was nearly impossible.
3. The code was fragmented and used too many debugfs entries to
operate and control. This was confusing to users.
4. It did not allow filtering fault injection on a per direction basis -
egress vs. ingress.

In order to improve or fix the above issues, the following changes have
been made:

1. The fault injection methods have been combined into a single fault
injection facility. As such, the fault injection has been plugged
into both the send and receive code paths. Regardless of method used
the fault injection will operate on both egress and ingress packets.
2. The type of fault injection - by packet or by opcode - is now controlled
by changing the boolean value of the file "opcode_mode". When the value
is set to True, fault injection is done by opcode. Otherwise, by
packet.
2. The masking ability has been removed in favor of a bitmap that holds
opcodes of interest (one bit per opcode, a total of 256 bits). This
works in tandem with the "opcode_mode" value. When the value of
"opcode_mode" is False, this bitmap is ignored. When the value is
True, the bitmap lists all opcodes to be considered for fault injection.
By default, the bitmap is empty. When the user wants to filter by opcode,
the user sets the corresponding bit in the bitmap by echo'ing the bit
position into the 'opcodes' file. This gets around the issue that the set
of opcodes does not lend itself to effective masks and allow for extremely
fine-grained filtering by opcode.
4. fault_packet and fault_opcode methods have been combined. Hence, there
is only one debugfs directory controlling the entire operation of the
fault injection machinery. This reduces the number of debugfs entries
and provides a more unified user experience.
5. A new control files - "direction" - is provided to allow the user to
control the direction of packets, which are subject to fault injection.
6. A new control file - "skip_usec" - is added that would allow the user
to specify a "timeout" during which no fault injection will occur.

In addition, the following bug fixes have been applied:

1. The fault injection code has been split into its own header and source
files. This was done to better organize the code and support conditional
compilation without littering the code with #ifdef's.
2. The method by which the TX PIO packets were being marked for drop
conflicted with the way send contexts were being setup. As a result,
the send context was repeatedly being reset.
3. The fault injection only makes sense when the user can control it
through the debugfs entries. However, a kernel configuration can
enable fault injection but keep fault injection debugfs entries
disabled. Therefore, it makes sense that the HFI fault injection
code depends on both.
4. Error suppression did not take into account the method by which PIO
packets were being dropped. Therefore, even with error suppression
turned on, errors would still be displayed to the screen. A larger
enough packet drop percentage would case the kernel to crash because
the driver would be stuck printing errors.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 78d3633b 01-Feb-2018 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Remove blind constants from 16B update

These values were introduced as part of the 16B code to
account for the varying size of the LRH between the differing
packet formats.

Replace the blind constants with defines based on FIELD_SIZEOF()
calls.

Fixes: 5b6cabb0db77 ("IB/hfi1: Add 16B RC/UC support")
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# bdaf96f6 01-Feb-2018 Sebastian Sanchez <sebastian.sanchez@intel.com>

IB/hfi1: Look up ibport using a pointer in receive path

In the receive path, hfi1_ibport is looked up by indexing into an
array. A profile shows this to be expensive. The receive context
data has a pointer to the ibport data, use that pointer instead.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 6d6b8848 01-Feb-2018 Sebastian Sanchez <sebastian.sanchez@intel.com>

IB/hfi1: Optimize packet type comparison using 9B and bypass code paths

The packet type comparison used to find out if a packet is a bypass
packet in the hot path is an expensive operation as seen in a profile.

Determine packet's pkey and migration bit through the bypass and 9B
code paths instead.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 9636258f 01-Feb-2018 Mitko Haralanov <mitko.haralanov@intel.com>

IB/hfi1: Remove dependence on qp->s_hdrwords

The s_hdrwords variable was used to indicate whether a
packet was already built on a previous iteration of the
send engine. This variable assumed the protection of the
QP's RVT_S_BUSY flag, which was required since the the
QP's s_lock was dropped just prior to the packet being
queued on the one of the egress mechanisms.

Support for multiple send engine instantiations require
that the field not be used due to concurency issues.
The ps.txreq signals the "already built" without the
potential concurency issues.

Fix by getting rid of all s_hdrword usage. A wrapper
is added to test for the already built case that used to
use s_hdrwords.

What used to be stored in s_hdrwords is now in the txreq.
The PBC is not counted, but is added in the pio/sdma code
paths prior to posting the packet.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 61654679 13-Aug-2017 Grzegorz Morys <grzegorz.morys@intel.com>

IB/hfi1: Remove HFI1_VERBS_31BIT_PSN option

Remove HFI1_VERBS_31BIT_PSN Kconfig option leaving only 31-bit PSNs
available. The option was implemented in the early days of the driver
and is no longer needed.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Grzegorz Morys <grzegorz.morys@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 566d53a8 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Enhance PIO/SDMA send for 16B

PIO/SDMA send logic now uses the hdr_type field to determine
the type of packet that has been constructed. Based on the hdr_type,
certain things such as PBC flags, padding count and the LT extra
trailing bytes are determined.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5b6cabb0 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add 16B RC/UC support

Add 16B bypass packet support for RC/UC traffic types.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 51e658f5 04-Aug-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids

Increase lid used in hfi1 driver to 32 bits. qib continues
to use 16 bit lids.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 88733e3b 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add 16B UD support

Add 16B bypass packet support for UD traffic types.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d98bb7f7 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Determine 9B/16B L2 header type based on Address handle

When address handle attributes are initialized, the LIDs are
transformed to be in the 32 bit LID space.
When constructing the header, hfi1 driver will look at the LID
to determine the packet header to be created.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5786adf3 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add support to process 16B header errors

Enhance hdr_rcverr() to also handle errors during
16B bypass packet receive.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 30e07416 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add support to send 16B bypass packets

We introduce struct hfi1_opa_header as a union
of ib (9B) and 16B headers.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 72c07e2b 04-Aug-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add support to receive 16B bypass packets

We introduce a struct hfi1_16b_header to support 16B headers.
16B bypass packets are received by the driver and processed
similar to 9B packets. Add basic support to handle 16B packets.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# bcad2913 24-Jul-2017 Kaike Wan <kaike.wan@intel.com>

IB/hfi1: Serve the most starved iowait entry first

When an egress resource(SDMA descriptors, pio credits) is not available,
a sending thread will be put on the resource's wait queue. When the
resource becomes available again, up to a fixed number of sending threads
can be awakened sequentially and removed from the wait queue, depending
on the number of waiting threads and the number of free resources. Since
each awakened sending thread will send as many packets as possible, it
is highly likely that the first sending thread will consume all the
egress resources. Subsequently, it will be put back to the end of the wait
queue. Depending on the timing when the later sending threads wake up,
they may not be able to send any packet and be again put back to the end
of the wait queue sequentially, right behind the first sending thread.
This starvation cycle continues until some sending threads exceed their
retry limit and consequently fail.

This patch fixes the issue by two simple approaches:
(1) Any starved sending thread will be put to the head of the wait queue
while a served sending thread will be put to the tail;
(2) The most starved sending thread will be served first.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 13d84914 29-May-2017 Dennis Dalessandro <dennis.dalessandro@intel.com>

IB/hfi1,qib: Do not send QKey trap for UD qps

According to IBTA spec a QKey violation should not result in a bad qkey
trap being triggered for UD queue pairs. Also since it is a silent error
we do not increment the q_key violation or the dropped packet counters.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 9039746c 12-May-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Setup common IB fields in hfi1_packet struct

We move many common IB fields into the hfi1_packet structure and
set them up in a single function. This allows us to set the fields
in a single place and not deal with them throughout the driver.

Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# dd1ed108 04-May-2017 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Fix yield logic in send engine

When there are many RC QPs and an RDMA READ request
is sent, timeouts occur on the requester side because
of fairness among RC QPs on their relative SDMA engine
on the responder side. This also hits write and send, but
to a lesser extent.

Complicating the issue is that the current code checks if workqueue
is congested before scheduling other QPs, however, this
check is based on the number of active entries in the
workqueue, which was found to be too big to for
workqueue_congested() to be effective.

Fix by reducing the number of active entries as revealed by
experimentation from the default of num_sdma to
HFI1_MAX_ACTIVE_WORKQUEUE_ENTRIES. Retry counts were monitored
to determine the correct value.

Tracing to investigate any future issues is also added.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 688f21c0 04-May-2017 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1, IB/rdmavt: Move r_adefered to r_lock cache line

This field is causing excessive cache line bouncing.

There are spare bytes in the r_lock cache line so the best approach
is to make an rvt QP field and remove from the hfi1 priv field.

Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d8966fcd 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Use rdma_ah_attr accessor functions

Modify core and driver components to use accessor functions
introduced to access individual fields of rdma_ah_attr

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 90898850 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Rename struct ib_ah_attr to rdma_ah_attr

This patch simply renames struct ib_ah_attr to
rdma_ah_attr as these fields specify attributes that are
not necessarily specific to IB.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b6eac931 09-Apr-2017 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Prevent kernel QP post send hard lockups

The driver progress routines can call cond_resched() when
a timeslice is exhausted and irqs are enabled.

If the ULP had been holding a spin lock without disabling irqs and
the post send directly called the progress routine, the cond_resched()
could yield allowing another thread from the same ULP to deadlock
on that same lock.

Correct by replacing the current hfi1_do_send() calldown with a unique
one for post send and adding an argument to hfi1_do_send() to indicate
that the send engine is running in a thread. If the routine is not
running in a thread, avoid calling cond_resched().

CC: <stable@vger.kernel.org> # 4.7.x-
Fixes: Commit 831464ce4b74 ("IB/hfi1: Don't call cond_resched in atomic mode when sending packets")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 243d9f43 20-Mar-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add transmit fault injection feature

Add ability to fault packets on transmit by opcode.
Dropping by packet can be achieved by setting the mask to 0.

In order to drop non-verbs traffic we set PbcInsertHrc
to NONE (0x2). The packet will still be delivered to
the receiving node but a KHdrHCRCErr (KDETH packet
with a bad HCRC) will be triggered and the packet will
not be delivered to the correct context.

In order to drop regular verbs traffic we set the
PbcTestEbp flag. The packet will still be delivered
to the receiving node but a 'late ebp error' will
be triggered and will be dropped.

A global toggle (/sys/kernel/debug/hfi1/hfi1_X/fault_suppress_err)
has been added to suppress the error messages on the receive
node when a packet was faulted on the sending node.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0181ce31 20-Mar-2017 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Add receive fault injection feature

Add fault injection capability:
- Drop packets unconditionally (fault_by_packet)
- Drop packets based on opcode (fault_by_opcode)

This feature reacts to the global FAULT_INJECTION
config flag.

The faulting traces have been added:
- misc/fault_opcode
- misc/fault_packet

See 'Documentation/fault-injection/fault-injection.txt'
for details.

Examples:
- Dropping packets by opcode:
/sys/kernel/debug/hfi1/hfi1_X/fault_opcode
# Enable fault
echo Y > fault_by_opcode
# Setprobability of dropping (0-100%)
# echo 25 > probability
# Set opcode
echo 0x64 > opcode
# Number of times to fault
echo 3 > times
# An optional mask allows you to fault
# a range of opcodes
echo 0xf0 > mask
/sys/kernel/debug/hfi1/hfi1_X/fault_stats
contains a value in parentheses to indicate
number of each opcode dropped.

- Dropping packets unconditionally
/sys/kernel/debug/hfi1/hfi1_X/fault_packet
# Enable fault
echo Y > fault_by_packet
/sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
contains the number of packets dropped.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 1198fcea 08-Feb-2017 Brian Welty <brian.welty@intel.com>

IB/hfi1, rdmavt: Move SGE state helper routines into rdmavt

To improve code reuse, add small SGE state helper routines to rdmavt_mr.h.
Leverage these in hfi1, including refactoring of hfi1_copy_sge.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 0128fcea 08-Feb-2017 Brian Welty <brian.welty@intel.com>

IB/hfi1, rdmavt: Update copy_sge to use boolean arguments

Convert copy_sge and related SGE state functions to use boolean.
For determining if QP is in user mode, add helper function in rdmavt_qp.h.
This is used to determine if QP needs the last byte ordering.
While here, change rvt_pd.user to a boolean.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 56acbbfb 08-Feb-2017 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

IB/hfi1: Use new rdmavt timers

Reduce hfi1 code footprint by using the rdmavt timers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 696513e8 08-Feb-2017 Brian Welty <brian.welty@intel.com>

IB/hfi1, qib, rdmavt: Move AETH credit functions into rdmavt

Add rvt_compute_aeth() and rvt_get_credit() as shared functions in
rdmavt, moved from hfi1/qib logic.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# beb5a042 08-Feb-2017 Brian Welty <brian.welty@intel.com>

IB/hfi1, qib, rdmavt: Move two IB event functions into rdmavt

Add rvt_rc_error() and rvt_comm_est() as shared functions in
rdmavt, moved from hfi1/qib logic.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a6cd5f08 17-Oct-2016 Jakub Pawlak <jakub.pawlak@intel.com>

IB/hfi1: Unify access to GUID entries

This patch consolidates the node GUIDs and the port GUID handling
and unifies access to these items. The knowledge of hfi1 GUIDs'
design and their location are kept in accessors to centralize access.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 4e045572 10-Oct-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Add unique txwait_lock for txreq events

Profiling suggests that the read_seqbegin() in
the txreq put logic is colliding with other uses
of the iowait lock.

The packet at a time use of this lock dictates a unique
lock to avoid reader/writer collisions when the number
of vTxWait events is low.

In order to support a unique lock the iowait struct embedded
in the QP is extended to remember the lock that protects the queue
head.

The QP destroy removes that QP from any wait list. It doesn't
need to know the head because of the linked list API, but it does
need to know the lock required to protect the head.

This also opens up the wait logic to have unique per resources locks
which needs to be in future refinement.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b374e060 25-Sep-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/hfi1: Consolidate pio control masks into single definition

This allows for adding additional pages of adaptive pio
opcode control including manufacturer specific ones.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 261a4351 06-Sep-2016 Mike Marciniszyn <mike.marciniszyn@intel.com>

IB/qib,IB/hfi: Use core common header file

Use common header file structs, defines, and accessors
in the drivers. The old declarations are removed.

The repositioning of the includes allows for the removal
of hfi1_message_header and replaces its use with ib_header.

Also corrected are two issues with set_armed_to_active():
- The "packet" parameter is now a pointer as it should have been
- The etype is validated to insure that the header is correct

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d4d602e9 25-Jul-2016 Don Hiatt <don.hiatt@intel.com>

IB/hfi1: Rename hfi1_pio_header to hfi1_sdma_header.

hfi1_pio_header should really be called hfi1_sdma_header
as it is only used for sdma transmits.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a9b6b3bc 25-Jul-2016 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/hfi1: Rename struct ahg_ib_header to struct hfi1_ahg_info

struct ahg_ib_header has no header specific information.
Rename it to struct hfi1_ahg_info

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# bd24ef5e 25-Jul-2016 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/hfi1: Remove unused elements from struct ahg_ib_header

sde and hfi1_ib_header are not used anymore.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 23f7d0d2 24-May-2016 Jianxin Xiong <jianxin.xiong@intel.com>

IB/hfi1, qib: Add ieth to the packet header definitions

A new union member "ieth" (Invalidate Extended Transport Header) is
added to the packet header definition in preparation of supporting
the send with invalidate opcode.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# f48ad614 19-May-2016 Dennis Dalessandro <dennis.dalessandro@intel.com>

IB/hfi1: Move driver out of staging

The TODO list for the hfi1 driver was completed during 4.6. In addition
other objections raised (which are far beyond what was in the TODO list)
have been addressed as well. It is now time to remove the driver from
staging and into the drivers/infiniband sub-tree.

Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>