History log of /linux-master/drivers/infiniband/sw/rxe/rxe_mcast.c
Revision Date Author Comments
# 7f60951f 22-May-2022 Christophe JAILLET <christophe.jaillet@wanadoo.fr>

RDMA/rxe: Fix an error handling path in rxe_get_mcg()

The commit in the Fixes tag has shuffled some code.
Now 'mcg_num' is incremented before the kzalloc(). So if the memory
allocation fails, this increment must be undone.

Fixes: a926a903b7dc ("RDMA/rxe: Do not call dev_mc_add/del() under a spinlock")
Link: https://lore.kernel.org/r/fe137cd8b1f17593243aa73d59c18ea71ab9ee36.1653225896.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
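
The error path described in 7f60951f can be illustrated with a minimal userspace sketch. It is not the driver code: `struct dev`, `get_mcg()`, `MAX_MCG` and the `alloc_fn` hook are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and an injectable allocator stands in for kzalloc() so that the failure case can be exercised.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_MCG 2	/* hypothetical limit, not the driver's value */

struct dev {
	atomic_int mcg_num;	/* groups currently accounted for */
};

struct mcg {
	int placeholder;
};

/* Hypothetical allocator hook so the failure path can be triggered. */
typedef void *(*alloc_fn)(size_t);

static struct mcg *get_mcg(struct dev *d, alloc_fn alloc)
{
	struct mcg *g;

	assert(d && alloc);

	/* Reserve a slot first; atomic_fetch_add() returns the old value. */
	if (atomic_fetch_add(&d->mcg_num, 1) >= MAX_MCG)
		goto err_dec;

	g = alloc(sizeof(*g));
	if (!g)
		goto err_dec;	/* allocation failed: give the slot back */

	return g;

err_dec:
	atomic_fetch_sub(&d->mcg_num, 1);
	return NULL;
}

/* Allocator that always fails, used to exercise the error path. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

Calling `get_mcg(&d, failing_alloc)` should leave `mcg_num` unchanged, which is the property the fix restores.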


# bfdc0edd 04-May-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Change mcg_lock to a _bh lock

rxe_mcast.c currently uses _irqsave spinlocks for rxe->mcg_lock while
rxe_recv.c uses _bh spinlocks for the same lock.

As there is no case where the mcg_lock can be taken from an IRQ, change
these all to bh locks so we don't have confusing mismatched lock types on
the same spinlock.

Fixes: 6090a0c4c7c6 ("RDMA/rxe: Cleanup rxe_mcast.c")
Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# a926a903 04-May-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Do not call dev_mc_add/del() under a spinlock

These routines were not intended to be called under a spinlock and will
throw debugging warnings:

raw_local_irq_restore() called with IRQs enabled
WARNING: CPU: 13 PID: 3107 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2f/0x50
CPU: 13 PID: 3107 Comm: python3 Tainted: G E 5.18.0-rc1+ #7
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
RIP: 0010:warn_bogus_irq_restore+0x2f/0x50
Call Trace:
<TASK>
_raw_spin_unlock_irqrestore+0x75/0x80
rxe_attach_mcast+0x304/0x480 [rdma_rxe]
ib_attach_mcast+0x88/0xa0 [ib_core]
ib_uverbs_attach_mcast+0x186/0x1e0 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xcd/0x140 [ib_uverbs]
ib_uverbs_cmd_verbs+0xdb0/0xea0 [ib_uverbs]
ib_uverbs_ioctl+0xd2/0x160 [ib_uverbs]
do_syscall_64+0x5c/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae

Move them out of the spinlock. It is OK if there are some races between
setting up the MC reception at the ethernet layer and rbtree lookups.

Fixes: 6090a0c4c7c6 ("RDMA/rxe: Cleanup rxe_mcast.c")
Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 3197706a 03-Mar-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Use standard names for ref counting

Rename rxe_add_ref() to rxe_get() and rxe_drop_ref() to rxe_put().
Significantly improves readability for new readers.

Link: https://lore.kernel.org/r/20220304000808.225811-10-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 6090a0c4 23-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Cleanup rxe_mcast.c

Finish adding comment headers to the subroutines in rxe_mcast.c.
Make minor API cleanups.

Link: https://lore.kernel.org/r/20220223230706.50332-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# a181c4c8 23-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Collect cleanup mca code in a subroutine

Collect cleanup code for struct rxe_mca into a subroutine,
__rxe_cleanup_mca(), called in rxe_detach_mcg() in rxe_mcast.c.

Link: https://lore.kernel.org/r/20220223230706.50332-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 4a4f1073 23-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Collect mca init code in a subroutine

Collect initialization code for struct rxe_mca into a subroutine,
__rxe_init_mca(), to clean up rxe_attach_mcg() in rxe_mcast.c. Check
the limit on the total number of attached QPs.

Link: https://lore.kernel.org/r/20220223230706.50332-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 3810c1a1 08-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Remove mcg from rxe pools

Finish removing mcg from rxe pools. Replace rxe pool ref counting with
krefs. Replace rxe_alloc() with kzalloc().

Link: https://lore.kernel.org/r/20220208211644.123457-8-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
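
The reference-counting model that 3810c1a1 switches to can be sketched in userspace as follows. This is an analogue of the kref get/put pattern, not the kernel API: `mcg_alloc()`, `mcg_get()`, `mcg_put()` and the `released` test hook are hypothetical, and calloc() stands in for kzalloc().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted multicast group. */
struct mcg {
	atomic_int refcnt;
	int *released;	/* test hook: set to 1 when the object is freed */
};

static struct mcg *mcg_alloc(int *released)
{
	struct mcg *g = calloc(1, sizeof(*g));	/* zeroed, like kzalloc() */

	if (!g)
		return NULL;
	atomic_init(&g->refcnt, 1);	/* the creator holds the first reference */
	g->released = released;
	return g;
}

static void mcg_get(struct mcg *g)
{
	atomic_fetch_add(&g->refcnt, 1);
}

/* Drop one reference; free the object when the last one goes away. */
static void mcg_put(struct mcg *g)
{
	assert(atomic_load(&g->refcnt) > 0);
	if (atomic_fetch_sub(&g->refcnt, 1) == 1) {
		*g->released = 1;
		free(g);
	}
}
```

The object is freed only by the put that drops the count from one to zero, so every holder can release its reference without knowing who else still uses the group.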


# 8a0a5fe0 08-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Replace pool key by rxe->mcg_tree

Continuing to decouple mcg from rxe pools. Create red-black tree code in
rxe_mcast.c to hold mcg index. Replace pool key calls by calls to local
red-black routines.

Link: https://lore.kernel.org/r/20220208211644.123457-6-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
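
As a rough illustration of the index introduced in 8a0a5fe0, the sketch below keys a tree on a 16-byte multicast GID compared with memcmp(). It makes two assumptions: a plain unbalanced binary search tree stands in for the kernel red-black tree, and `struct mcg_node`, `tree_insert()` and `tree_lookup()` are hypothetical names rather than the driver's routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GID_LEN 16	/* a GID is 128 bits */

struct mcg_node {
	unsigned char mgid[GID_LEN];
	struct mcg_node *left, *right;
};

/* Insert node, or return the existing node holding the same mgid. */
static struct mcg_node *tree_insert(struct mcg_node **root,
				    struct mcg_node *node)
{
	struct mcg_node **link = root;

	assert(root && node);
	while (*link) {
		int cmp = memcmp((*link)->mgid, node->mgid, GID_LEN);

		if (cmp == 0)
			return *link;
		link = cmp > 0 ? &(*link)->left : &(*link)->right;
	}
	node->left = node->right = NULL;
	*link = node;
	return node;
}

/* Return the node whose key equals mgid, or NULL if there is none. */
static struct mcg_node *tree_lookup(struct mcg_node *root,
				    const unsigned char *mgid)
{
	while (root) {
		int cmp = memcmp(root->mgid, mgid, GID_LEN);

		if (cmp == 0)
			return root;
		root = cmp > 0 ? root->left : root->right;
	}
	return NULL;
}
```

A real red-black tree adds rebalancing on insert and erase; the key comparison and the descent shown here are the part that replaces the pool key calls.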


# 8a99c81f 08-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Replace int num_qp by atomic_t qp_num

Replace int num_qp in struct rxe_mcg by atomic_t qp_num.

Link: https://lore.kernel.org/r/20220208211644.123457-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 5bc15d1f 08-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Replace grp by mcg, mce by mca

Replace 'grp' by 'mcg', 'mce' by 'mca'. Shorten subroutine names in
rxe_mcast.c. These name uses are more in line with other object names
used.

Link: https://lore.kernel.org/r/20220208211644.123457-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# d5724055 08-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Use kzalloc/kfree for mca

Remove rxe_mca (was rxe_mc_elem) from rxe pools and use kzalloc and kfree
to allocate and free in rxe_mcast.c. Call kzalloc outside of spinlocks to
avoid having to use GFP_ATOMIC.

Link: https://lore.kernel.org/r/20220208211644.123457-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
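
One common way to keep allocation outside a spinlock, as d5724055 describes, is to allocate speculatively with the lock dropped and re-check under the lock before inserting. The userspace sketch below shows that pattern under stated assumptions: an atomic_flag spinlock stands in for rxe->mcg_lock, a linked list stands in for the real index, and `struct table`, `get_node()` and the helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct node {
	int key;
	struct node *next;
};

struct table {
	atomic_flag lock;	/* minimal spinlock, stand-in for mcg_lock */
	struct node *head;
};

static void table_lock(struct table *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock,
						 memory_order_acquire))
		;	/* spin */
}

static void table_unlock(struct table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Caller must hold the lock. */
static struct node *lookup_locked(struct table *t, int key)
{
	struct node *n;

	for (n = t->head; n; n = n->next)
		if (n->key == key)
			return n;
	return NULL;
}

static struct node *get_node(struct table *t, int key)
{
	struct node *n, *spec;

	assert(t != NULL);

	table_lock(t);
	n = lookup_locked(t, key);
	table_unlock(t);
	if (n)
		return n;

	/* Allocate with the lock dropped, so the allocator may block. */
	spec = calloc(1, sizeof(*spec));
	if (!spec)
		return NULL;
	spec->key = key;

	table_lock(t);
	n = lookup_locked(t, key);	/* another caller may have won */
	if (!n) {
		spec->next = t->head;
		t->head = spec;
		n = spec;
		spec = NULL;
	}
	table_unlock(t);

	free(spec);	/* no-op unless the race was lost */
	return n;
}
```

The second lookup under the lock is what keeps two concurrent callers from both inserting a node for the same key.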


# 9fd0eb7c 08-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Move mcg_lock to rxe

Replace mcg->mcg_lock and mc_grp_pool->pool_lock by rxe->mcg_lock. This
is the first step of several intended to decouple the mc_grp and mc_elem
objects from the rxe pool code.

Link: https://lore.kernel.org/r/20220208211644.123457-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# a099b085 15-Feb-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Revert changes from irqsave to bh locks

A previous patch replaced all irqsave locks in rxe with bh locks. This
ran into problems because rdmacm has a bad habit of calling rdma verbs
APIs with irqs disabled. This is not allowed during spin_unlock_bh(),
causing programs that use rdmacm to fail. This patch reverts the changes
to locks that had this problem or got dragged into the same mess. After
this patch, blktests/check -q srp runs correctly again.

Link: https://lore.kernel.org/r/20220215194448.44369-1-rpearsonhpe@gmail.com
Fixes: 21adfa7a3c4e ("RDMA/rxe: Replace irqsave locks with bh locks")
Reported-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# d3f6899b 27-Jan-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Remove qp->grp_lock and qp->grp_list

Since it is no longer required to clean up attachments to multicast
groups when a QP is destroyed, qp->grp_lock and qp->grp_list are
no longer needed and are removed.

Link: https://lore.kernel.org/r/20220127213755.31697-7-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 8a7fa872 27-Jan-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Remove rxe_drop_all_mcast_groups

With IBA o10-2.2.3 enforced, rxe_drop_all_mcast_groups is completely
unnecessary. Remove it and references to it.

Link: https://lore.kernel.org/r/20220127213755.31697-6-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# f9f48460 27-Jan-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Enforce IBA o10-2.2.3

Add code to check if a QP is attached to one or more multicast groups
when destroy_qp is called and return an error if so.

Link: https://lore.kernel.org/r/20220127213755.31697-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
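
The check described in f9f48460 amounts to refusing destruction while a per-QP attachment count is non-zero. A minimal userspace sketch follows; the `struct qp` layout, the helper names and the choice of -EBUSY as the error value are assumptions made for illustration, not taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

struct qp {
	atomic_int mcg_num;	/* hypothetical count of multicast attachments */
	int destroyed;
};

static void qp_attach(struct qp *qp)
{
	atomic_fetch_add(&qp->mcg_num, 1);
}

static void qp_detach(struct qp *qp)
{
	assert(atomic_load(&qp->mcg_num) > 0);
	atomic_fetch_sub(&qp->mcg_num, 1);
}

/* Refuse to destroy a QP that is still attached to any group. */
static int qp_destroy(struct qp *qp)
{
	if (atomic_load(&qp->mcg_num) > 0)
		return -EBUSY;	/* assumed error code */
	qp->destroyed = 1;
	return 0;
}
```

With this rule in place the caller must detach from every group before destroying the QP, which is why the later commits can drop the per-QP group list and its lock.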


# 02e35244 27-Jan-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Rename rxe_mc_grp and rxe_mc_elem

Rename rxe_mc_grp to rxe_mcg. Rename rxe_mc_elem to rxe_mca.
These can be read 'multicast group' and 'multicast attachment'.
'elem' collided with the use of elem in rxe pools and was a little
confusing.

Link: https://lore.kernel.org/r/20220127213755.31697-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 758c7f1e 27-Jan-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Move rxe_mcast_attach/detach to rxe_mcast.c

Move rxe_mcast_attach and rxe_mcast_detach from rxe_verbs.c to rxe_mcast.c.
Make them non-static and add declarations to rxe_loc.h. Make the subroutines
in rxe_mcast.c referenced by these routines static and remove their
declarations from rxe_loc.h.

Link: https://lore.kernel.org/r/20220127213755.31697-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 7df10239 27-Jan-2022 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c

Move rxe_mcast_add and rxe_mcast_delete from rxe_net.c to rxe_mcast.c,
make static and remove declarations from rxe_loc.h.

Link: https://lore.kernel.org/r/20220127213755.31697-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 02827b67 02-Nov-2021 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Cleanup rxe_pool_entry

Currently three different names are used to describe rxe pool elements.
They are referred to as entries, elems or pelems. This patch chooses one,
'elem', and changes the others to match.

Link: https://lore.kernel.org/r/20211103050241.61293-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 21adfa7a 02-Nov-2021 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Replace irqsave locks with bh locks

Most of the locks in the rxe driver are _irqsave/restore locks but in fact
there are no interrupt threads that run rxe code or share data with
rxe. There are softirq threads and data sharing so the appropriate lock
type is _bh. This patch replaces all irqsave type locks with bh type
locks.

Link: https://lore.kernel.org/r/20211103050241.61293-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 65a81b61 13-Aug-2021 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Fix memory allocation while in a spin lock

rxe_mcast_add_grp_elem() in rxe_mcast.c calls rxe_alloc() while holding
spinlocks. rxe_alloc() in turn calls kzalloc(size, GFP_KERNEL), which is
incorrect in that context. This patch replaces rxe_alloc() by
rxe_alloc_locked(), which uses GFP_ATOMIC. This bug was caused by the
commit mentioned below, which failed to handle the need for an atomic
allocation.

Fixes: 4276fd0dddc9 ("RDMA/rxe: Remove RXE_POOL_ATOMIC")
Link: https://lore.kernel.org/r/20210813210625.4484-1-rpearsonhpe@gmail.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 88cc77eb 25-Jan-2021 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Fix misleading comments and names

The names and comments of the 'unlocked' pool APIs are very misleading and
not what was intended. This patch replaces 'rxe_xxx_nl' with
'rxe_xxx_locked' with comments indicating that the caller is expected to
hold the rxe pool lock.

Link: https://lore.kernel.org/r/20210125211641.2694-3-rpearson@hpe.com
Reported-by: Hillf Danton <hdanton@sina.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 8a48ac7f 16-Dec-2020 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Fix race in rxe_mcast.c

Fix a race in rxe_mcast.c that occurs when two QPs try at the same time to
attach a multicast address. Both QPs look up the mgid address in a pool of
multicast groups and, if they do not find it, each creates a new group elem.

Fix this by locking the lookup/alloc/add key sequence and using the
unlocked APIs added in this patch set.

Link: https://lore.kernel.org/r/20201216231550.27224-8-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 63fa15db 27-Aug-2020 Bob Pearson <rpearsonhpe@gmail.com>

RDMA/rxe: Add SPDX hdrs to rxe source files

Add SPDX headers to all rxe .c and .h files.

Link: https://lore.kernel.org/r/20200827145439.2273-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 839f5ac0 10-Jan-2017 Bart Van Assche <bvanassche@acm.org>

IB/rxe: Remove a pointless indirection layer

Neither rxe->ifc_ops nor any of the function pointers in
struct rxe_ifc_ops ever change. Hence remove the rxe->ifc_ops
indirection mechanism.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 32404fb7 10-Jan-2017 Bart Van Assche <bvanassche@acm.org>

IB/rxe: Let the compiler check the type of the cleanup functions

Change the argument type of these functions from void * into
struct rxe_pool_entry *.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 8700e3e7 16-Jun-2016 Moni Shoua <monis@mellanox.com>

Soft RoCE driver

Soft RoCE (RXE) - The software RoCE driver

ib_rxe implements the RDMA transport and registers with the RDMA core
device as a kernel verbs provider. It also implements the packet IO
layer. On the other hand, ib_rxe registers with the Linux netdev stack
as a UDP encapsulating protocol, in this case RDMA, for sending and
receiving packets over any Ethernet device. This yields an RDMA
transport over the UDP/Ethernet network layer, forming a RoCEv2
compatible device.

The configuration procedure of the Soft RoCE drivers requires
binding to an existing Ethernet network device. This is done with
the /sys interface.

A userspace Soft RoCE library (librxe) provides user applications
the ability to run with Soft RoCE devices. The use of rxe verbs in
user space requires the inclusion of librxe as a device-specific
plug-in to libibverbs. librxe is packaged separately.

Architecture:

     +-----------------------------------------------------------+
     |                        Application                        |
     +-----------------------------------------------------------+
                            +-----------------------------------+
                            |            libibverbs             |
User                        +-----------------------------------+
                        +----------------+ +----------------+
                        | librxe         | | HW RoCE lib    |
                        +----------------+ +----------------+
+---------------------------------------------------------------+
    +--------------+                           +------------+
    | Sockets      |                           | RDMA ULP   |
    +--------------+                           +------------+
    +--------------+                  +---------------------+
    | TCP/IP       |                  | ib_core             |
    +--------------+                  +---------------------+
                             +------------+ +----------------+
Kernel                       | ib_rxe     | | HW RoCE driver |
                             +------------+ +----------------+
    +------------------------------------+
    | NIC driver                         |
    +------------------------------------+

Soft RoCE resources:

[1] https://github.com/SoftRoCE/librxe-dev librxe - source code in
GitHub
[2] https://github.com/SoftRoCE/rxe-dev/wiki/rxe-dev:-Home - Soft RoCE
Wiki page
[3] https://github.com/SoftRoCE/librxe-dev - Soft RoCE userspace library

Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>