History log of /linux-master/drivers/infiniband/hw/mthca/mthca_qp.c
Revision Date Author Comments
# dc52aadb 13-Jul-2023 Thomas Bogendoerfer <tbogendoerfer@suse.de>

RDMA/mthca: Fix crash when polling CQ for shared QPs

Commit 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP")
introduced a new struct mthca_sqp which doesn't contain struct mthca_qp
any longer. Placing a pointer of this new struct into qptable leads
to crashes, because mthca_poll_one() expects a qp pointer. Fix this
by putting the correct pointer into qptable.

Fixes: 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Link: https://lore.kernel.org/r/20230713141658.9426-1-tbogendoerfer@suse.de
Signed-off-by: Leon Romanovsky <leon@kernel.org>


# 1fb7f897 01-Mar-2021 Mark Bloch <mbloch@nvidia.com>

RDMA: Support more than 255 rdma ports

Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs. HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.

While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 26e990ba 03-Oct-2020 Jason Gunthorpe <jgg@ziepe.ca>

RDMA: Check attr_mask during modify_qp

Each driver should check that it can support the provided attr_mask during
modify_qp. IB_USER_VERBS_EX_CMD_MODIFY_QP was being used to block
modify_qp_ex because the driver didn't check RATE_LIMIT.

Link: https://lore.kernel.org/r/6-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 21c2fe94 26-Sep-2020 Leon Romanovsky <leon@kernel.org>

RDMA/mthca: Combine special QP struct with mthca QP

As preparation for the removal of QP allocation logic, we need to ensure
that ib_core allocates the right amount of memory before a call to the
driver create_qp(). It requires from driver to have the same structs for
all types of QPs.

Link: https://lore.kernel.org/r/20200926102450.2966017-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 3f649ab7 03-Jun-2020 Kees Cook <keescook@chromium.org>

treewide: Remove uninitialized_var() usage

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
xargs perl -pi -e \
's/\buninitialized_var\(([^\)]+)\)/\1/g;
s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>


# fb24ea52 22-Feb-2019 Will Deacon <will@kernel.org>

drivers: Remove explicit invocations of mmiowb()

mmiowb() is now implied by spin_unlock() on architectures that require
it, so there is no reason to call it from driver code. This patch was
generated using coccinelle:

@mmiowb@
@@
- mmiowb();

and invoked as:

$ for d in drivers include/linux/qed sound; do \
spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done

NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation. If you've ended up bisecting to this commit, you can
reintroduce the mmiowb() calls using wmb() instead, which should restore
the old behaviour on all architectures other than some esoteric ia64
systems.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 19b1a294 24-Feb-2019 Erez Alfasi <ereza@mellanox.com>

RDMA: Use __packed annotation instead of __attribute__ ((packed))

"__attribute__" set of macros has been standardized, have became more
potentially portable and consistent code back in v2.6.21 by commit
82ddcb040 ("[PATCH] extend the set of "__attribute__" shortcut macros").
Moreover, nowadays checkpatch.pl warns about using __attribute__((packed))
instead of __packed.

This patch converts all the "__attribute__ ((packed))" annotations to
"__packed" within the RDMA subsystem.

Signed-off-by: Erez Alfasi <ereza@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 89944450 07-Feb-2019 Shamir Rabinovitch <shamir.rabinovitch@oracle.com>

IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIs

Now when we have the udata passed to all the ib_xxx object creation APIs
and the additional macro 'rdma_udata_to_drv_context' to get the
ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally
start to remove the dependency of the drivers in the
ib_xxx->uobject->context.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# e00b64f7 17-Dec-2018 Shamir Rabinovitch <shamir.rabinovitch@oracle.com>

RDMA: Cleanup undesired pd->uobject usage

Drivers should be using udata to determine if a method is invoked from
user space or kernel space. A pd does not necessarily say a different
objects is kernel or user.

Transforming the tests to use udata eliminates a large number of uobject
references from the drivers.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# d31131bb 02-Oct-2018 Kamal Heib <kamalheib1@gmail.com>

RDMA: Remove unused parameter from ib_modify_qp_is_ok()

The ll parameter is not used in ib_modify_qp_is_ok(), so remove it.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# d34ac5cd 18-Jul-2018 Bart Van Assche <bvanassche@acm.org>

RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const

Since neither ib_post_send() nor ib_post_recv() modify the data structure
their second argument points at, declare that argument const. This change
makes it necessary to declare the 'bad_wr' argument const too and also to
modify all ULPs that call ib_post_send(), ib_post_recv() or
ib_post_srq_recv(). This patch does not change any functionality but makes
it possible for the compiler to verify whether the
ib_post_(send|recv|srq_recv) really do not modify the posted work request.

To make this possible, only one cast had to be introduce that casts away
constness, namely in rpcrdma_post_recvs(). The only way I can think of to
avoid that cast is to introduce an additional loop in that function or to
change the data type of bad_wr from struct ib_recv_wr ** into int
(an index that refers to an element in the work request list). However,
both approaches would require even more extensive changes than this
patch.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# f696bf6d 18-Jul-2018 Bart Van Assche <bvanassche@acm.org>

RDMA: Constify the argument of the work request conversion functions

When posting a send work request, the work request that is posted is not
modified by any of the RDMA drivers. Make this explicit by constifying
most ib_send_wr pointers in RDMA transport drivers.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 6da2ec56 12-Jun-2018 Kees Cook <keescook@chromium.org>

treewide: kmalloc() -> kmalloc_array()

The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:

kmalloc(a * b, gfp)

with:
kmalloc_array(a * b, gfp)

as well as handling cases of:

kmalloc(a * b * c, gfp)

with:

kmalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

kmalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

kmalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
kmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
kmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_ID)
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_ID
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_CONST)
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_CONST
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_ID)
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_ID
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_CONST)
+ COUNT_CONST, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_CONST
+ COUNT_CONST, sizeof(THING)
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kmalloc
+ kmalloc_array
(
- SIZE * COUNT
+ COUNT, SIZE
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
kmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
kmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
kmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(
- (E1) * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * (E3)
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
kmalloc(sizeof(THING) * C2, ...)
|
kmalloc(sizeof(TYPE) * C2, ...)
|
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (E2)
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * E2
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (E2)
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * E2
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * E2
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * (E2)
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- E1 * E2
+ E1, E2
, ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>


# 44c58487 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Define 'ib' and 'roce' rdma_ah_attr types

rdma_ah_attr can now be either ib or roce allowing
core components to use one type or the other and also
to define attributes unique to a specific type. struct
ib_ah is also initialized with the type when its first
created. This ensures that calls such as modify_ah
dont modify the type of the address handle attribute.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d8966fcd 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Use rdma_ah_attr accessor functions

Modify core and driver components to use accessor functions
introduced to access individual fields of rdma_ah_attr

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 766b7f6c 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/mthca: Rename to_ib_ah_attr to to_rdma_ah_attr

local function to_ib_ah_attr is renamed so it in
sync with the rename of the ib_ah_attr structure

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 90898850 29-Apr-2017 Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

IB/core: Rename struct ib_ah_attr to rdma_ah_attr

This patch simply renames struct ib_ah_attr to
rdma_ah_attr as these fields specify attributes that are
not necessarily specific to IB.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 25f40220 23-Dec-2015 Moni Shoua <monis@mellanox.com>

IB/core: Initialize UD header structure with IP and UDP headers

ib_ud_header_init() is used to format InfiniBand headers
in a buffer up to (but not with) BTH. For RoCE UDP ENCAP it is
required that this function would be able to build also IP and UDP
headers.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# e622f2f4 08-Oct-2015 Christoph Hellwig <hch@lst.de>

IB: split struct ib_send_wr

This patch split up struct ib_send_wr so that all non-trivial verbs
use their own structure which embedds struct ib_send_wr. This dramaticly
shrinks the size of a WR for most common operations:

sizeof(struct ib_send_wr) (old): 96

sizeof(struct ib_send_wr): 48
sizeof(struct ib_rdma_wr): 64
sizeof(struct ib_atomic_wr): 96
sizeof(struct ib_ud_wr): 88
sizeof(struct ib_fast_reg_wr): 88
sizeof(struct ib_bind_mw_wr): 96
sizeof(struct ib_sig_handover_wr): 80

And with Sagi's pending MR rework the fast registration WR will also be
down to a reasonable size:

sizeof(struct ib_fastreg_wr): 64

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt]
Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc]
Tested-by: Haggai Eran <haggaie@mellanox.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>


# dd5f03be 12-Dec-2013 Matan Barak <matanb@mellanox.com>

IB/core: Ethernet L2 attributes in verbs/cm structures

This patch add the support for Ethernet L2 attributes in the
verbs/cm/cma structures.

When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in a similar manner that the IB L2 (and the L4 PKEY) attributes are used.

Thus, those attributes were added to the following structures:

* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id

For the path record structure, extra care was taken to avoid the new
fields when packing it into wire format, so we don't break the IB CM
and SA wire protocol.

On the active side, the CM fills. its internal structures from the
path provided by the ULP. We add there taking the ETH L2 attributes
and placing them into the CM Address Handle (struct cm_av).

On the passive side, the CM fills its internal structures from the WC
associated with the REQ message. We add there taking the ETH L2
attributes from the WC.

When the HW driver provides the required ETH L2 attributes in the WC,
they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
code checks for the presence of these flags, and in their absence does
address resolution from the ib_init_ah_from_wc() helper function.

ib_modify_qp_is_ok is also updated to consider the link layer. Some
parameters are mandatory for Ethernet link layer, while they are
irrelevant for IB. Vendor drivers are modified to support the new
function signature.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 16551d45 11-Jul-2012 Dotan Barak <dotanb@dev.mellanox.co.il>

IB/mthca: Fill in sq_sig_type in query QP

The query QP code was didn't fill that attribute, do that.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# 9bbeb666 11-Jul-2012 Dotan Barak <dotanb@dev.mellanox.co.il>

IB/mthca: Warning about event for non-existent QPs should show event type

Events received for non-existent QPs should generate a warning that includes
the event type that was received.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>


# cdb73db0 07-Jul-2011 Goldwyn Rodrigues <rgoldwyn@suse.de>

IB/mthca: Stop returning separate error and status from FW commands

Instead of having firmware command functions return an error and also
a status, leading to code like:

err = mthca_FW_COMMAND(..., &status);
if (err)
goto out;
if (status) {
err = -E...;
goto out;
}

all over the place, just handle the FW status inside the FW command
handling code (the way mlx4 does it), so we can simply write:

err = mthca_FW_COMMAND(...);
if (err)
goto out;

In addition to simplifying the source code, this also saves a healthy
chunk of text:

add/remove: 0/0 grow/shrink: 10/88 up/down: 510/-3357 (-2847)
function old new delta
static.trans_table 324 584 +260
mthca_cmd_poll 352 477 +125
mthca_cmd_wait 511 567 +56
mthca_table_put 213 240 +27
mthca_cleanup_db_tab 372 387 +15
__mthca_remove_one 314 323 +9
mthca_cleanup_user_db_tab 275 283 +8
__mthca_init_one 1738 1746 +8
mthca_cleanup 20 21 +1
mthca_MAD_IFC 1081 1082 +1
mthca_MGID_HASH 43 40 -3
mthca_MAP_ICM_AUX 23 20 -3
mthca_MAP_ICM 19 16 -3
mthca_MAP_FA 23 20 -3
mthca_READ_MGM 43 38 -5
mthca_QUERY_SRQ 43 38 -5
mthca_QUERY_QP 59 54 -5
mthca_HW2SW_SRQ 43 38 -5
mthca_HW2SW_MPT 60 55 -5
mthca_HW2SW_EQ 43 38 -5
mthca_HW2SW_CQ 43 38 -5
mthca_free_icm_table 120 114 -6
mthca_query_srq 214 206 -8
mthca_free_qp 662 654 -8
mthca_cmd 38 28 -10
mthca_alloc_db 1321 1311 -10
mthca_setup_hca 1067 1055 -12
mthca_WRITE_MTT 35 22 -13
mthca_WRITE_MGM 40 27 -13
mthca_UNMAP_ICM_AUX 36 23 -13
mthca_UNMAP_FA 36 23 -13
mthca_SYS_DIS 36 23 -13
mthca_SYNC_TPT 36 23 -13
mthca_SW2HW_SRQ 35 22 -13
mthca_SW2HW_MPT 35 22 -13
mthca_SW2HW_EQ 35 22 -13
mthca_SW2HW_CQ 35 22 -13
mthca_RUN_FW 36 23 -13
mthca_DISABLE_LAM 36 23 -13
mthca_CLOSE_IB 36 23 -13
mthca_CLOSE_HCA 38 25 -13
mthca_ARM_SRQ 39 26 -13
mthca_free_icms 178 164 -14
mthca_QUERY_DDR 389 375 -14
mthca_resize_cq 1063 1048 -15
mthca_unmap_eq_icm 123 107 -16
mthca_map_eq_icm 396 380 -16
mthca_cmd_box 90 74 -16
mthca_SET_IB 433 417 -16
mthca_RESIZE_CQ 369 353 -16
mthca_MAP_ICM_page 240 224 -16
mthca_MAP_EQ 183 167 -16
mthca_INIT_IB 473 457 -16
mthca_INIT_HCA 745 729 -16
mthca_map_user_db 816 798 -18
mthca_SYS_EN 157 139 -18
mthca_cleanup_qp_table 78 59 -19
mthca_cleanup_eq_table 168 149 -19
mthca_UNMAP_ICM 143 121 -22
mthca_modify_srq 172 149 -23
mthca_unmap_fmr 198 174 -24
mthca_query_qp 814 790 -24
mthca_query_pkey 343 319 -24
mthca_SET_ICM_SIZE 34 10 -24
mthca_QUERY_DEV_LIM 1870 1846 -24
mthca_map_cmd 1130 1105 -25
mthca_ENABLE_LAM 401 375 -26
mthca_modify_port 247 220 -27
mthca_query_device 884 850 -34
mthca_NOP 75 41 -34
mthca_table_get 287 249 -38
mthca_init_qp_table 333 293 -40
mthca_MODIFY_QP 348 308 -40
mthca_close_hca 131 89 -42
mthca_free_eq 435 390 -45
mthca_query_port 755 705 -50
mthca_free_cq 581 528 -53
mthca_alloc_icm_table 578 524 -54
mthca_multicast_attach 1041 986 -55
mthca_init_hca 326 271 -55
mthca_query_gid 487 431 -56
mthca_free_srq 524 468 -56
mthca_free_mr 168 111 -57
mthca_create_eq 1560 1501 -59
mthca_multicast_detach 790 728 -62
mthca_write_mtt 918 854 -64
mthca_register_device 1406 1342 -64
mthca_fmr_alloc 947 883 -64
mthca_mr_alloc 652 582 -70
mthca_process_mad 1242 1164 -78
mthca_dev_lim 910 830 -80
find_mgm 482 400 -82
mthca_modify_qp 3852 3753 -99
mthca_init_cq 1281 1181 -100
mthca_alloc_srq 1719 1610 -109
mthca_init_eq_table 1807 1679 -128
mthca_init_tavor 761 491 -270
mthca_init_arbel 2617 2098 -519

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>


# af7bd463 26-Aug-2010 Eli Cohen <eli@dev.mellanox.co.il>

IB/core: Add VLAN support for IBoE

Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the
GID derived from a link local address in the following way:

GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN.

The 3 bits user priority field of the packets are identical to the 3
bits of the SL.

In case of rdma_cm apps, the TOS field is used to generate the SL
field by doing a shift right of 5 bits effectively taking to 3 MS bits
of the TOS field.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# ff7f5aab 26-Aug-2010 Eli Cohen <eli@dev.mellanox.co.il>

IB/pack: IBoE UD packet packing support

Add support for packing IBoE packet headers.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

[ Clean up and fix ib_ud_header_init() a bit. - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 920d706c 08-Feb-2010 Eli Cohen <eli@mellanox.co.il>

IB/core: Fix and clean up ib_ud_header_init()

ib_ud_header_init() first clears header and then fills up the various
fields. Later on, it tests header->immediate_present, which it has
already cleared, so the condition is always false. Fix this by adding
an immediate_present parameter and setting header->immediate_present
as is done with grh_present. Also remove unused calculation of
header_len.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# ffe063f3 05-Sep-2009 Roland Dreier <rolandd@cisco.com>

IB/mthca: Annotate CQ locking

mthca_ib_lock_cqs()/mthca_ib_unlock_cqs() are helper functions that
lock/unlock both CQs attached to a QP in the proper order to avoid
AB-BA deadlocks. Annotate this so sparse can understand what's going
on (and warn us if we misuse these functions).

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# aed01227 15-Jul-2008 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix check of max_send_sge for special QPs

The MLX transport requires two extra gather entries for sends (one for
the header and one for the checksum at the end, as the comment says).
However the code checked that max_recv_sge was not too big, instead of
checking max_send_sge as it should have. Fix the code to check the
correct condition.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d3809ad0 15-Jul-2008 Roland Dreier <rolandd@cisco.com>

IB/mthca: Remove extra code for RESET->ERR QP state transition

Commit b18aad71 ("IB/mthca: Fix RESET to ERROR transition") added some
extra code to handle a QP state transition from RESET to ERROR.
However, the latest 1.2.1 version of the IB spec has clarified that
this transition is actually not allowed, so we can remove this extra
code again.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# f3781d2e 15-Jul-2008 Roland Dreier <rolandd@cisco.com>

RDMA: Remove subversion $Id tags

They don't get updated by git and so they're worse than useless.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 5121df3a 16-Apr-2008 Dotan Barak <dotanb@dev.mellanox.co.il>

IB/mthca: Update QP state if query QP succeeds

If the QP was moved to another state (such as SQE) by the hardware,
then after this change the user won't have to set the IBV_QP_CUR_STATE
mask in order to execute modify QP in order to recover from this state.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 0f39cf3d 16-Apr-2008 Roland Dreier <rolandd@cisco.com>

IB/core: Add support for "send with invalidate" work requests

Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a
"send with invalidate" work request as defined in the iWARP verbs and
the InfiniBand base memory management extensions. Also put "imm_data"
and a new "invalidate_rkey" member in a new "ex" union in struct
ib_send_wr. The invalidate_rkey member can be used to pass in an
R_Key/STag to be invalidated. Add this new union to struct
ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in
ib_uverbs_post_send().

Fix up low-level drivers to deal with the change to struct ib_send_wr,
and just remove the imm_data initialization from net/sunrpc/xprtrdma/,
since that code never does any send with immediate operations.

Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since
the iWARP drivers currently in the tree set the bit. The amso1100
driver at least will silently fail to honor the IB_SEND_INVALIDATE bit
if passed in as part of userspace send requests (since it does not
implement kernel bypass work request queueing). Remove the flag from
all existing drivers that set it until we know which ones are OK.

The values chosen for the new flag is not consecutive to avoid clashing
with flags defined in the XRC patches, which are not merged yet but
which are already in use and are likely to be merged soon.

This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 680b575f 16-Apr-2008 Eli Cohen <eli@dev.mellanox.co.il>

IB/mthca: Add IPoIB checksum offload support

Arbel and Sinai devices support checksum generation and verification
of TCP and UDP packets for UD IPoIB messages. This patch checks if
the HCA supports this and sets the IB_DEVICE_UD_IP_CSUM capability
flag if it does. It implements support for handling the IB_SEND_IP_CSUM
send flag and setting the csum_ok field in receive work completions.

Signed-off-by: Eli Cohen <eli@mellnaox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 1d368c54 24-Jan-2008 Eli Cohen <eli at mellanox.co.il>

IB/ib_mthca: Pre-link receive WQEs in Tavor mode

We have recently discovered that Tavor mode requires each WQE in a
posted list of receive WQEs to have a valid NDA field at all times.
This requirement holds true for regular QPs as well as for SRQs. This
patch prelinks the receive queue in a regular QP and keeps the free
list in SRQ always properly linked.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# ab8403c4 14-Oct-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Avoid alignment traps when writing doorbells

Architectures such as ia64 see alignment traps when doing a 64-bit
read from __be32 doorbell[2] arrays to do doorbell writes in
mthca_write64(). Fix this by just passing the two halves of the
doorbell value into mthca_write64(). This actually improves the
generated code by allowing the compiler to see what's going on better.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 43509d1f 18-Jul-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Simplify use of size0 in work request posting

Current code sets size0 to 0 at the start of work request posting
functions and then handles size0 == 0 specially within the loop over
work requests. Change this so size0 is set along with f0 the first
time through the loop (when nreq == 0). This makes the code easier to
understand by making it clearer that f0 and size0 are always
initialized if nreq != 0 without having to know that size0 == 0
implies nreq == 0.

Also annotate size0 with uninitialized_var() so that this doesn't
introduce a new compiler warning.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# e535c699 18-Jul-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Factor out setting WQE UD segment entries

Factor code to set UD entries out of the work request posting
functions into inline functions set_tavor_ud_seg() and
set_arbel_ud_seg(). This doesn't change the generated code in any
significant way, and makes the source easier on the eyes.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 400ddc11 18-Jul-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Factor out setting WQE remote address and atomic segment entries

Factor code to set remote address and atomic segment entries out of the
work request posting functions into inline functions set_raddr_seg()
and set_atomic_seg(). This doesn't change the generated code in any
significant way, and makes the source easier on the eyes.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 80885456 18-Jul-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Factor out setting WQE data segment entries

Factor code to set data segment entries out of the work request
posting functions into inline functions mthca_set_data_seg() and
mthca_set_data_seg_inval(). This makes the code more readable and
also allows the compiler to do a better job -- on x86_64:

add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-69 (-69)
function old new delta
mthca_arbel_post_srq_recv 373 369 -4
mthca_arbel_post_receive 570 562 -8
mthca_tavor_post_srq_recv 520 508 -12
mthca_tavor_post_send 1344 1330 -14
mthca_arbel_post_send 1481 1467 -14
mthca_tavor_post_receive 792 775 -17

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 6d7d080e 17-Jul-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Use uninitialized_var() for f0

Commit 9db48926 ("drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd
var warning") added "= 0" to the declarations of f0 to shut up gcc
warnings. However, there's no point in making the code bigger by
initializing f0 to a random value just to get rid of a warning;
setting f0 to 0 is no safer than just using uninitialized_var(), which
documents the situation better and gives smaller code too. For example,
on x86_64:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-16 (-16)
function old new delta
mthca_tavor_post_send 1352 1344 -8
mthca_arbel_post_send 1489 1481 -8

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 9db48926 17-Jul-2007 Jeff Garzik <jeff@garzik.org>

drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd var warning

drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_tavor_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1594: warning: ‘f0’ may be used
uninitialized in this function
drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_arbel_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1949: warning: ‘f0’ may be used
uninitialized in this function

Initializing 'f0' is not strictly necessary in either case, AFAICS.

I was considering use of uninitialized_var(), but looking at the
complex flow of control in each function, I feel it is wiser and
safer to simply zero the var and be certain of ourselves.

Signed-off-by: Jeff Garzik <jeff@garzik.org>


# 8b7e1577 27-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il>

IB/mthca: Fix handling of send CQE with error for QPs connected to SRQ

mthca_free_err_wqe() currently treats both send and receive CQEs
identically if a QP is using an SRQ. But for Tavor hardware, send
CQEs with error can be chained together even if the RQ is part of SRQ,
so we may miss some CQEs.

Fix by following the WQE chain for all send CQEs even for non-SRQ QPs.

This fixes crashes in IPoIB CM:
<https://bugs.openfabrics.org//show_bug.cgi?id=604>

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# e8edc6e0 20-May-2007 Alexey Dobriyan <adobriyan@gmail.com>

Detach sched.h from mm.h

First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).

Cross-compile tested on

all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b18aad71 13-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il>

IB/mthca: Fix RESET to ERROR transition

According to the IB spec, a QP can be moved from RESET to the ERROR
state, but mthca firmware does not support this and returns an error if
we try. Work around this FW limitation by moving the QP from RESET to
INIT with dummy parameters and then transitioning from INIT to ERROR.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 3e28c56b 13-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il>

IB/mthca: Fix posting >255 recv WRs for Tavor

Fix posting lists of > 255 receive WRs for Tavor: rq.next_ind must
be updated each doorbell, otherwise the next doorbell will use an
incorrect index.

Found by Ronni Zimmermann at Mellanox.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 9ba6d552 12-Apr-2007 Michael S. Tsirkin <mst@mellanox.co.il>

IB/mthca: Work around kernel QP starvation

With mthca, RC QPs can starve each other and even UD QPs on the same
hardware schedule queue. As a result, userspace MPI can starve
e.g. IPoIB traffic, with netdev watchdog warnings getting printed out,
and TCP connections getting stuck or failing.

Reduce the chance of this happening by using three separate hardware
schedule queues: one for userspace RC QPs, one for kernel RC QPs, and
one for all other QPs.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 30c00986 24-Apr-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Simplify CQ cleaning in mthca_free_qp()

mthca_free_qp() already has local variables to hold the QP's send_cq
and recv_cq, so we can slightly clean up the calls to mthca_cq_clean()
by using those local variables instead of expressions like
to_mcq(qp->ibqp.send_cq).

Also, by cleaning the recv_cq first, we can avoid worrying about
whether the QP is attached to an SRQ for the second call, because we
would only clean send_cq if send_cq is not equal to recv_cq, and that
means send_cq cannot have any receive completions from the QP being
destroyed.

All this work even improves the generated code a bit:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
function old new delta
mthca_free_qp 510 505 -5

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 88171cfe 01-Mar-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix error path in mthca_alloc_memfree()

The garbled logic in mthca_alloc_memfree() causes it to return 0, even
if it fails to allocate all doorbell records. Fix it to return -ENOMEM
when it fails.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# fc89afce 09-Jan-2007 Dotan Barak <dotanb@dev.mellanox.co.il>

IB/mthca: Allow the QP state transition RESET->RESET

RESET->RESET is an allowed QP state transition, so mthca should handle
it correctly, by just returning success without involving the firmware.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 99d4f22e 10-Feb-2007 Roland Dreier <rolandd@cisco.com>

IB/mthca: Use correct structure size in call to memset()

When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof
*ib_ah_attr instead of sizeof *path.

Pointed out by Jack Morgenstein <jackm@mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# f5e10529 27-Dec-2006 Dotan Barak <dotanb@dev.mellanox.co.il>

IB/mthca: Don't execute QUERY_QP firmware command for QP in RESET state

If a QP being queried is in the RESET state, don't execute the
QUERY_QP firmware command (because it will fail).

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# f0d1b0b3 08-Dec-2006 David Howells <dhowells@redhat.com>

[PATCH] LOG2: Implement a general integer log2 facility in the kernel

This facility provides three entry points:

ilog2() Log base 2 of unsigned long
ilog2_u32() Log base 2 of u32
ilog2_u64() Log base 2 of u64

These facilities can either be used inside functions on dynamic data:

int do_something(long q)
{
...;
y = ilog2(x)
...;
}

Or can be used to statically initialise global variables with constant values:

unsigned n = ilog2(27);

When performing static initialisation, the compiler will report "error:
initializer element is not constant" if asked to take a log of zero or of
something not reducible to a constant. They treat negative numbers as
unsigned.

When not dealing with a constant, they fall back to using fls() which permits
them to use arch-specific log calculation instructions - such as BSR on
x86/x86_64 or SCAN on FRV - if available.

[akpm@osdl.org: MMC fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Wojtek Kaniewski <wojtekka@toxygen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# f4f3d0f0 29-Nov-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix section mismatches

Commit b3b30f5e ("IB/mthca: Recover from catastrophic errors")
introduced some section mismatch breakage, because the error recovery
code tears down and reinitializes the device, which calls into lots of
code originally marked __devinit and __devexit from regular .text.

Fix this by getting rid of these now-incorrect section markers.

Reported by Randy Dunlap <randy.dunlap@oracle.com>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 1f5c23e2 16-Oct-2006 Arthur Kepner <akepner@sgi.com>

IB/mthca: Use mmiowb after doorbell ring

We discovered a problem when running IPoIB applications on multiple
CPUs on an Altix system. Many messages such as:

ib_mthca 0002:01:00.0: SQ 000014 full (19941644 head, 19941707 tail, 64 max, 0 nreq)

appear in syslog, and the driver wedges up.

Apparently this is because writes to the doorbells from different CPUs
reach the device out of order. The following patch adds mmiowb() calls
after doorbell rings to ensure the doorbell writes are ordered.

Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d35cc330 22-Sep-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Simplify calls to mthca_cq_clean()

If a QP has separate send and receive CQs, then the send CQ will never
have receive completions from that QP in it. So when cleaning the
send CQ, there's no need to pass in an SRQ pointer, even if the QP is
attached to an SRQ.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 9e583b85 28-Aug-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: Return correct number of bits for static rate in query_qp

Incorrect number of bits was taken for static_rate field.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# f6f76725 28-Aug-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: Return port number for unconnected QPs in query_qp

port_num was not being returned for unconnected QPs.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 9bc57e2d 11-Aug-2006 Ralph Campbell <ralphc@pathscale.com>

IB/uverbs: Pass userspace data to modify_srq and modify_qp methods

Pass a struct ib_udata to the low-level driver's ->modify_srq() and
->modify_qp() methods, so that it can get to the device-specific data
passed in by the userspace driver.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# a19aa5c5 11-Aug-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix potential AB-BA deadlock with CQ locks

When destroying a QP, mthca locks both the QP's send CQ and receive
CQ. However, the following scenario is perfectly valid:

QP_a: send_cq == CQ_x, recv_cq == CQ_y
QP_b: send_cq == CQ_y, recv_cq == CQ_x

The old mthca code simply locked send_cq and then recv_cq, which in
this case could lead to an AB-BA deadlock if QP_a and QP_b were
destroyed simultaneously.

We can fix this by changing the locking code to lock the CQ with the
lower CQ number first, which will create a consistent lock ordering.
Also, the second CQ is locked with spin_lock_nested() to tell lockdep
that we know what we're doing with the lock nesting.

This bug was found by lockdep.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# e54b82d7 10-Aug-2006 Michael S. Tsirkin <mst@mellanox.co.il>

IB/mthca: Make fence flag work for send work requests

The fence bit needs to be set in the doorbell too, not just the WQE.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 0964d916 14-Jul-2006 Michael S. Tsirkin <mst@mellanox.co.il>

[PATCH] IB/mthca: comment fix

After recent changes, mthca_wq_init does not actually initialize the WQ as it
used to - it simply resets all index fields to their initial values. So,
let's rename it to mthca_wq_reset.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Cc: Roland Dreier <rolandd@cisco.com>
Acked-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# a46f9484 04-Jul-2006 Zach Brown <zach.brown@oracle.com>

[PATCH] mthca: initialize send and receive queue locks separately

mthca: initialize send and receive queue locks separately

lockdep identifies a lock by the call site of its initialization. By
initializing the send and receive queue locks in mthca_wq_init() we confuse
lockdep. It warns that that the ordered acquiry of both locks in
mthca_modify_qp() is recursive acquiry of one lock:

=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
modprobe/1192 is trying to acquire lock:
(&wq->lock){....}, at: [<f892b4db>] mthca_modify_qp+0x60/0xa7b [ib_mthca]
but task is already holding lock:
(&wq->lock){....}, at: [<f892b4ce>] mthca_modify_qp+0x53/0xa7b [ib_mthca]

Initializing the locks separately in mthca_alloc_qp_common() stops the
warning and will let lockdep enforce proper ordering on paths that acquire
both locks.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# c93b6fba 17-Jun-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Make all device methods truly reentrant

Documentation/infiniband/core_locking.txt says:

All of the methods in struct ib_device exported by a low-level
driver must be fully reentrant. The low-level driver is required to
perform all synchronization necessary to maintain consistency, even
if multiple function calls using the same object are run
simultaneously.

However, mthca's modify_qp, modify_srq and resize_cq methods are
currently not reentrant. Add a mutex to the QP, SRQ and CQ structures
so that these calls can be properly serialized.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# c9c5d9fe 17-Jun-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix memory leak on modify_qp error paths

Some error paths after the mthca_alloc_mailbox() call in mthca_modify_qp()
just do a "return -EINVAL" without freeing the mailbox. Convert these
returns to "goto out" to avoid leaking the mailbox storage.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 23f3bc0f 18-May-2006 Michael S. Tsirkin <mst@mellanox.co.il>

IB/mthca: Fix posting lists of 256 receive requests for Tavor

If we post a list of length 256 exactly, nreq in doorbell gets set to
256 which is wrong: it should be encoded by 0. This is because we
only zero it out on the next WR, which may not be there. The solution
is to ring the doorbell after posting a WQE, not before posting the
next one.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# a3285aa4 09-May-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix race in reference counting

Fix races in in destroying various objects. If a destroy routine
waits for an object to become free by doing

wait_event(&obj->wait, !atomic_read(&obj->refcount));
/* now clean up and destroy the object */

and another place drops a reference to the object by doing

if (atomic_dec_and_test(&obj->refcount))
wake_up(&obj->wait);

then this is susceptible to a race where the wait_event() and final
freeing of the object occur between the atomic_dec_and_test() and the
wake_up(). And this is a use-after-free, since wake_up() will be
called on part of the already-freed object.

Fix this in mthca by replacing the atomic_t refcounts with plain old
integers protected by a spinlock. This makes it possible to do the
decrement of the reference count and the wake_up() so that it appears
as a single atomic operation to the code waiting on the wait queue.

While touching this code, also simplify mthca_cq_clean(): the CQ being
cleaned cannot go away, because it still has a QP attached to it. So
there's no reason to be paranoid and look up the CQ by number; it's
perfectly safe to use the pointer that the callers already have.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# bf6a9e31 10-Apr-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB: simplify static rate encoding

Push translation of static rate to HCA format into low-level drivers,
where it belongs. For static rate encoding, use encoding of rate
field from IB standard PathRecord, with addition of value 0, for
backwards compatibility with current usage. The changes are:

- Add enum ib_rate to midlayer includes.
- Get rid of static rate translation in IPoIB; just use static rate
directly from Path and MulticastGroup records.
- Update mthca driver to translate absolute static rate into the
format used by hardware. This also fixes mthca's static rate
handling for HCAs that are capable of 4X DDR.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# e1f7868c 29-Mar-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix section mismatch problems

Quite a few cleanup functions in mthca were marked as __devexit.
However, they could also be called from error paths during
initialization, so they cannot be marked that way. Just delete all of
the incorrect annotations.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 192daa18 24-Mar-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix modify QP error path

If the call to mthca_MODIFY_QP() failed, then mthca_modify_qp() would
still do some things it shouldn't, such as store away attributes for
special QPs. Fix this, and simplify the code, by simply jumping to
the exit path if mthca_MODIFY_QP() fails.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# b0b3a8e1 24-Mar-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix indentation

Fix some whitespace damage (indenting with spaces) that snuck in.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# b3f64967 22-Mar-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: Fix uninitialized variable in mthca_alloc_qp()

mthca_alloc_sqp() by mthca_set_qp_size() need to set qp->transport
before calling mthca_set_qp_size(), since the value is used there.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 0ef61db8 19-Mar-2006 Dotan Barak <dotanb@mellanox.co.il>

IB/mthca: Check that sgid_index and path_mtu are valid in modify_qp

Add a check that the modify QP parameters sgid_index and path_mtu are
valid, since they might come from userspace.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 67e73776 01-Mar-2006 Dotan Barak <dotanb@mellanox.co.il>

IB/mthca: Check alternate P_Key index when setting alternate path

Check that the alternate P_Key index is in range when setting the
alternate path for a QP. Also make a cosmetic touch up to the debug
message printed when the main P_Key index is out of range.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 7667abd1 27-Feb-2006 Dotan Barak <dotanb@mellanox.co.il>

IB/mthca: Add support for send work request fence flag

Add support for IB_SEND_FENCE flag in post_send methods.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 8ebe5077 13-Feb-2006 Eli Cohen <eli@mellanox.co.il>

IB/mthca: Support for query QP and SRQ

Implement the query_qp and query_srq methods in mthca.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d844183d 13-Feb-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Convert to use ib_modify_qp_is_ok()

Use ib_modify_qp_is_ok() in mthca, and delete the big table of
attributes for queue pair state transitions.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 3fa1fa3e 03-Feb-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Generate SQ drained events when requested

Add low-level driver support to ib_mthca so that consumers can request
a "send queue drained" event be generated when a transiton to the SQD
state completes.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 2fa5e2eb 01-Feb-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Whitespace cleanups

Remove trailing whitespace and fix indentation that with spaces
instead of tabs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d9b98b0f 31-Jan-2006 Roland Dreier <rolandd@cisco.com>

IB/mthca: Make functions that never fail return void

The function mthca_free_err_wqe() can never fail, so get rid of its
return value. That means handle_error_cqe() doesn't have to check
what mthca_free_err_wqe() returns, which means it can't fail either
and doesn't have to return anything either. All this results in
simpler source code and a slight object code improvement:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10)
function old new delta
mthca_free_err_wqe 83 81 -2
mthca_poll_cq 1758 1750 -8

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 9eacee2a 12-Jan-2006 Michael S. Tsirkin <mst@mellanox.co.il>

IB/mthca: Initialize grh_present before using it

build_mlx_header() was using sqp->ud_header.grh_present before it was
initialized by mthca_read_ah(). Furthermore, header->grh_present is
set by ib_ud_header_init, so there's no need to set it again in
mthca_read_ah().

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 4de144bf 06-Jan-2006 Dotan Barak <dotanb@mellanox.co.il>

IB/mthca: Add support for automatic path migration (APM)

Add code to modify QP operation to handle setting alternate paths for
connected QPs.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 0d3b525f 06-Jan-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: fix for RTR-to-RTS transition in modify QP

PKEY_INDEX is not a legal parameter in the RTR->RTS transition.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 0364ffc3 06-Jan-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: fix for SQEr-to-RTS transition in modify QP

Fixes to SQEr->RTS transition in modify_qp:
1. The flag IB_QP_ACCESS_FLAGS is optional for UC qps
2. The SQEr state is not supported for RC qps

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 5b3bc7a6 06-Jan-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: max_inline_data handling tweaks

Fix a case where copying max_inline_data from a successful create_qp
capabilities output to create_qp input could cause EINVAL error:

mthca_set_qp_size must check max_inline_data directly against
max_desc_sz; checking qp->sq.max_gs is wrong since max_inline_data
depends on the qp type and does not involve max_sg.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 38d1e793 05-Jan-2006 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: check port validity in modify_qp

Modify_qp should check that the physical port number provided
is a legal value.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# c4342d8a 15-Dec-2005 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: Fix corner cases in max_rd_atomic value handling in modify QP

sae and sre bits should only be set when setting sra_max. Further, in
the old code, if the caller specifies max_rd_atomic = 0, the sre and
sae bits are still set, with the result that the QP ends up with
max_rd_atomic = 1 in effect.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d1646f86 15-Dec-2005 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: Fix IB_QP_ACCESS_FLAGS handling.

This patch corrects some corner cases in managing the RAE/RRE bits in
the mthca qp context. These bits need to be zero if the user requests
max_dest_rd_atomic of zero. The bits need to be restored to the value
implied by the qp access flags attribute in a previous (or the
current) modify-qp command if the dest_rd_atomic variable is changed
to non-zero.

In the current implementation, the following scenario will not work:
RESET-to-INIT set QP access flags to all disabled (zeroes)
INIT-to-RTR set max_dest_rd_atomic=10, AND
set qp_access_flags = IB_ACCESS_REMOTE_READ | IB_ACCESS_REMOTE_ATOMIC

The current code will incorrectly take the access-flags value set in
the RESET-to-INIT transition.

We can simplify, and correct, this IB_QP_ACCESS_FLAGS handling: it is
always safe to set qp access flags in the firmware command if either
of IB_QP_MAX_DEST_RD_ATOMIC or IB_QP_ACCESS_FLAGS is set, so let's
just set it to the correct value, always.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 44b5b030 09-Dec-2005 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: don't change driver's copy of attributes if modify QP fails

Only change the driver's copy of the QP attributes in modify QP after
checking the modify QP command completed successfully.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 6aa2e4e8 09-Dec-2005 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: correct log2 calculation

Fix thinko in rd_atomic calculation: ffs(x) - 1 does not find the next
power of 2 -- it should be fls(x - 1).

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 94361cf7 09-Dec-2005 Jack Morgenstein <jackm@mellanox.co.il>

IB/mthca: check RDMA limits

Add limit checking on rd_atomic and dest_rd_atomic attributes:
especially for max_dest_rd_atomic, a value that is larger than HCA
capability can cause RDB overflow and corruption of another QP.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# e0ae9ecf 29-Nov-2005 Michael S. Tsirkin <mst@mellanox.co.il>

IB/mthca: fix posting of send lists of length >= 255 on mem-free HCAs

On mem-free HCAs, when posting a long list of send requests, a
doorbell must be rung every 255 requests. Add code to handle this.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 187a2586 28-Nov-2005 Michael S. Tsirkin <mst@mellanox.co.il>

IB/mthca: reset QP's last pointers when transitioning to reset state

last pointer is not updated when QP is modified to reset state. This
causes data corruption if WQEs are already posted on the queue.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 48fd0d1f 18-Nov-2005 Michael S. Tsirkin <mst@mellanox.co.il>

IB/mthca: Safer max_send_sge/max_recv_sge calculation

Calculation of QP capabilities still isn't exactly right in mthca:
max_send_sge/max_recv_sge fields returned in create_qp can exceed the
handware supported limits.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# cbc5b2bb 15-Nov-2005 Roland Dreier <rolandd@cisco.com>

[IB] mthca: don't disable RDMA writes if no responder resources

Responder resources are only required to handle RDMA reads and atomic
operations, not RDMA writes. So the driver should allow RDMA writes
even if responder resources are set to 0. This is especially
important for the UC transport -- with the old code, it was impossible
to enable RDMA writes for UC QPs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# ae57e24a 09-Nov-2005 Michael S. Tsirkin <mst@mellanox.co.il>

[IB] mthca: fix posting long lists of receive work requests

In Tavor mode, when posting a long list of receive work requests, a
doorbell must be rung every 256 requests. Add code to do this when
required.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 62abb841 09-Nov-2005 Michael S. Tsirkin <mst@mellanox.co.il>

[IB] mthca: fix posting of atomic operations

The size of work requests for atomic operations was computed
incorrectly in mthca: all sizeofs need to be divided by 16.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 77369ed3 09-Nov-2005 Jack Morgenstein <jackm@mellanox.co.il>

[IB] uverbs: have kernel return QP capabilities

Move the computation of QP capabilities (max scatter/gather entries,
max inline data, etc) into the kernel, and have the uverbs module
return the values as part of the create QP response. This keeps
precise knowledge of device limits in the low-level kernel driver.

This requires an ABI bump, so while we're making changes, get rid of
the max_sge parameter for the modify SRQ command -- it's not used and
shouldn't be there.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d09e3276 03-Nov-2005 Jack Morgenstein <jackm@mellanox.co.il>

[IB] mthca: check P_Key index in modify QP

Make sure that the P_Key index passed into mthca_modify_qp() is
within the device's P_Key table.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 4e57b681 30-Oct-2005 Tim Schmielau <tim@physik3.uni-rostock.de>

[PATCH] fix missing includes

I recently picked up my older work to remove unnecessary #includes of
sched.h, starting from a patch by Dave Jones to not include sched.h
from module.h. This reduces the number of indirect includes of sched.h
by ~300. Another ~400 pointless direct includes can be removed after
this disentangling (patch to follow later).
However, quite a few indirect includes need to be fixed up for this.

In order to feed the patches through -mm with as little disturbance as
possible, I've split out the fixes I accumulated up to now (complete for
i386 and x86_64, more archs to follow later) and post them before the real
patch. This way this large part of the patch is kept simple with only
adding #includes, and all hunks are independent of each other. So if any
hunk rejects or gets in the way of other patches, just drop it. My scripts
will pick it up again in the next round.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 547e3090 25-Oct-2005 Roland Dreier <rolandd@cisco.com>

[IB] mthca: correct modify QP attribute masks for UC

The UC transport does not support RDMA reads or atomic operations, so
we shouldn't require or even allow the consumer to set attributes
relating to these operations for UC QPs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# efaae8f7 10-Oct-2005 Jack Morgenstein <jackm@mellanox.co.il>

[IB] mthca: Better limit checking and reporting

Check the sizes of CQs, QPs and SRQs when creating objects, and fail
instead of creating too-big queues. Also return real limits instead
of just plausible-sounding values from mthca_query_device().

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 71eea47d 20-Sep-2005 Michael S. Tsirkin <mst@mellanox.co.il>

[PATCH] IB/mthca: Fix device removal memory leak

Clean up QP table array on device removal.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# d6cff021 13-Sep-2005 Roland Dreier <rolandd@cisco.com>

[PATCH] IB/mthca: fix posting of first work request

Fix posting first WQE for mem-free HCAs: we need to link to previous
WQE even in that case. While we're at it, simplify code for
Tavor-mode HCAs. We don't really need the conditional test there
either; we can similarly always link to the previous WQE.

Based on Michael S. Tsirkin's analogous fix for userspace libmthca.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# bb4a7f0d 12-Sep-2005 Roland Dreier <rolandd@cisco.com>

[PATCH] IB/mthca: assign ACK timeout field correctly

The hardware reads the ACK timeout field from the most significant 5
bits of struct mthca_qp_path's ackto field, not the least significant
bits. This fix has the driver put the timeout in the right place.
Without this, we get a timeout that is 2^8 times too small.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 30a7e8ef 07-Sep-2005 Michael S. Tsirkin <mst@mellanox.co.il>

[PATCH] IB: Initialize qp->wait

Add missing call to init_waitqueue_head().

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# c9fe2b32 07-Sep-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB: really reset QPs

When we modify a QP to the RESET state, completely clean up the QP
so that it is really and truly reset.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# a4d61e84 25-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB: move include files to include/rdma

Move the InfiniBand headers from drivers/infiniband/include to include/rdma.
This allows InfiniBand-using code to live elsewhere, and lets us remove the
ugly EXTRA_CFLAGS include path from the InfiniBand Makefiles.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# ec34a922 19-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB/mthca: Add SRQ implementation

Add mthca support for shared receive queues (SRQs),
including userspace SRQs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# c04bc3d1 19-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB/mthca: Move WQE structures into their own header

Move the definitions of the WQE structures from mthca_qp.c into
mthca_wqe.h, so that we'll be able to share them when we add the
SRQ code in mthca_srq.c.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 288bdeb4 19-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB/mthca: Simplify handling of completions with error

Mem-free HCAs never generate error CQEs that complete multiple WQEs,
so just skip the call to mthca_free_err_wqe() for them rather than
having logic to handle the mem-free case in mthca_free_err_wqe().

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 87b81670 18-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB/mthca: Factor out common queue alloc code

Clean up the allocation of memory for queues by factoring out the
common code into mthca_buf_alloc() and mthca_buf_free(). Now CQs and
QPs share the same queue allocation code, which we'll also use for SRQs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# da6561c2 17-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB/mthca: Use correct port width capability value

When we call the INIT_IB firmware command to bring up a port, use
the actual port width capability returned by the QUERY_DEV_LIM
command instead of always trying to enable both 1X and 4X. This
fixes breakage seen when the firmware is build to allow 4X only.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 97f52eb4 13-Aug-2005 Sean Hefty <sean.hefty@intel.com>

[PATCH] IB: sparse endianness cleanup

Fix sparse warnings. Use __be* where appropriate.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 2a1d9b7f 11-Aug-2005 Roland Dreier <roland@eddore.topspincom.com>

[PATCH] IB: Add copyright notices

Make some lawyers happy and add copyright notices for people who
forgot to include them when they actually touched the code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>


# 80c8ec2c 07-Jul-2005 Roland Dreier <rolandd@cisco.com>

[PATCH] IB uverbs: add mthca user QP support

Add support for userspace queue pairs (QPs) to mthca.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# ed878458 27-Jun-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: Align FW command mailboxes to 4K

Future versions of Mellanox HCA firmware will require command mailboxes to be
aligned to 4K. Support this by using a pci_pool to allocate all mailboxes.
This has the added benefit of shrinking the source and text of mthca.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# a03a5a67 27-Jun-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: Move mthca_is_memfree checks

Make mthca_table_put() and mthca_table_put_range() NOPs if the device is not
mem-free, so that we don't have to have "if (mthca_is_memfree())" tests in the
callers of these functions. This makes our code more readable and
maintainable, and saves a couple dozen bytes of text in ib_mthca.ko as well.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 9e6970b5 27-Jun-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: Enable unreliable connected transport

Add support for unreliable connected (UC) transport to mthca driver:
- Add attributes for UC to modify QP table.
- Add support for posting UC work requests.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 34a4a753 27-Jun-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: Set RDMA/atomic capabilities correctly

mthca apparently had the meanings of the max_rd_atomic and max_dest_rd_atomic
QP attributes backwards. max_rd_atomic limits the maximum number of
outstanding RDMA/atomic requests as an initiator (on a send queue), and
max_dest_rd_atomic specifies the resources allocated to handle RMDA/atomic
requests from the remote end of the connection. We were programming our QP
context with these values swapped.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# cd123d7f 27-Jun-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: Set QP static rate correctly

Fix offset of static_rate in QP context. Pointed out by Dror Goldenberg.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 64dc81fc 27-Jun-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: Use dma_alloc_coherent instead of pci_alloc_consistent

Switch all allocations of coherent memory from pci_alloc_consistent() to
dma_alloc_coherent(), so that we can pass GFP_KERNEL. This should help when
the system is low on memory.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 08aeb14e 16-Apr-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: map context for RDMA responder in mem-free mode

Fix RDMA in mem-free mode: we need to make sure that the RDMA context memory
is mapped for the HCA.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# ddf841f0 16-Apr-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: update receive queue initialization for new HCAs

Update initialization of receive queue to match new documentation. This
change is required to support new MT25204 HCA.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# d10ddbf6 16-Apr-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: encapsulate mem-free check into mthca_is_memfree()

Clean up mem-free mode support by introducing mthca_is_memfree() function,
which encapsulates the logic of deciding if a device is mem-free.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 2a4443a6 16-Apr-2005 Michael S. Tsirkin <mst@mellanox.co.il>

[PATCH] IB/mthca: fill in opcode field for send completions

Fill in missing fields in send completions.

Signed-off-by: Itamar Rabenstein <itamar@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# ddb934e0 16-Apr-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: implement RDMA/atomic operations for mem-free mode

Add code to support RDMA and atomic send work requests in mem-free mode.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 3fba2317 16-Apr-2005 Roland Dreier <roland@topspin.com>

[PATCH] IB/mthca: fix posting sends with immediate data

When posting a work request with immediate data, put the immediate data in the
immediate data field of the hardware's work request (rather than overwriting
the flags field).

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!