History log of /freebsd-current/sys/ofed/include/rdma/ib_addr.h
Revision Date Author Comments
# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 12e99b63 19-Apr-2023 Justin Hibbits <jhibbits@FreeBSD.org>

ofed: Fix a logic inversion from IfAPI conversion

Reported by: bartosz.sobczak_intel.com
Fixes: 3e142e07675b ("ofed: Mechanically convert to IfAPI")
Sponsored by: Juniper Networks, Inc.


# 3e142e07 08-Feb-2023 Justin Hibbits <jhibbits@FreeBSD.org>

ofed: Mechanically convert to IfAPI

Summary:
Because of the intricacies of this code it wasn't purely scripted, but
instead hand-mechanical.

Reviewed by: hselasky
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38560


# 1411f52f 04-Jun-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

mlx4/OFED: replace the struct net_device with struct ifnet

Given all the code does operate on struct ifnet, the last step in this
longer series of changes now is to rename struct net_device to
struct ifnet (that is what it was defined to in the LinuxKPi code).
While mlx4 and OFED are "shared" code the decision was made years ago
to not write it based on the netdevice KPI but the native ifnet KPI
for most of it. This commit simply spells this out and with that
frees "struct netdevice" to be re-done on LinuxKPI to become a more
native/mixed implementation over time as needed by, e.g., wireless
drivers.

Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30515


# 825b7d4c 26-May-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

OFED: migrate LinuxKPI net_device/ifnet macros into ofed

The LinuxKPI net_device actually is an ifnet; in order to further
clean that up so we can extend "net_device", migrate the few macros
left into ofed and make sure the header is included in all files
which need access to the macros.

Sponsored by: The FreeBSD Foundation
MFC after: 12 days
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D30477


# c35034b3 25-May-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI/OFED/mlx4: cleanup netdevice.h some more

This removes all unused bits from linux/netdevice.h and migrates two
inline functions into the mlx4 and ofed code respectively.

This gets the mlx4/ofed (struct ifnet) specific bits down to 7 lines
in netdevice.h.

Sponsored by: The FreeBSD Foundation
MFC after: 13 days
Reviewed by: hselasky, kib
Differential Revision: https://reviews.freebsd.org/D30461


# 7069b4c6 26-Mar-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI/OFED: (re)move inetdevice.h implementation

The two functions in linux/inetdevice.h are highly FreeBSD/ifnet
specific. This is a result of struct net_device being mapped to
struct ifnet.

The only known consumer of these functions are two files in the
ofed/infiniband code.

As a first step of cleaning up, copy linux/inetdevice.h to
rdma/ib_addr_freebsd.h. (It stayed a separate file to preserve
copyright and license of the original file; otherwise it could be
merged into ib_addr.h where more EPOCH/vnet/.. are already used).

Slightly rename the function to not conflict with LinuxKPI
in the future.

Remove the three last, now unneeded includes of inetdevice.h and
zap linux/inetdevice.h to an empty header file with only the forward
include to netdevice.h remaining.

Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
Reviewed-by: hselasky, kib
X-D-R: D29366 (extracted as further cleanup)
Differential Revision: https://reviews.freebsd.org/D29434


# 536457e1 31-Aug-2020 Eric van Gyzen <vangyzen@FreeBSD.org>

infiniband: Appease Coverity

Coverity claims the call to rdma_gid2ip in cma_igmp_send overwrites addr.
Use a consistent definition of sockaddr to prevent detections and code
changes in the future.

Submitted by: bret_ketchum@dell.com
Reported by: Coverity
Reviewed by: hselasky, kib
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D26229


# 758a35d0 16-Oct-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

VLAN_TRUNKDEV() requires epochification in ibcore after r353292.

Sponsored by: Mellanox Technologies


# 7877f593 09-Sep-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Introduce and use sgid_index in CM requests in ibcore.

For RoCE, when CM requests are received for RC and UD connections,
netdevice of the incoming request is unavailable. Because of that CM
requests are always forwarded to init_net namespace.

Now that we have the GID index available, introduce SGID index in
incoming CM requests and refer to the netdevice of it.

While at it fix some incorrect uses of init_net and make sure
the rdma_create_id() function stores the VNET it is passed.

Based on linux commit:
cee104334c98dd04e9dd4d9a4fa4784f7f6aada9

MFC after: 3 days
Approved by: re (gjb)
Sponsored by: Mellanox Technologies


# b4df6efb 05-Sep-2018 Slava Shwartsman <slavash@FreeBSD.org>

ibcore: Fix endless loop in searching for matching VLAN device

In r337943 ifnet's if_pcp was set to the PCP value in use
instead of IFNET_PCP_NONE.
Current ibcore code assumes that if_pcp is IFNET_PCP_NONE with
VLAN interfaces so it can identify prio-tagged traffic.
Fix that by explicitly verifying that the if_type is IFT_ETHER
and not IFT_L2VLAN.

MFC after: 3 days
Approved by: re (Marius), hselasky (mentor), kib (mentor)
Sponsored by: Mellanox Technologies


# 8ac29525 17-Jul-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Remove blank line.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 855ad7cf 17-Jul-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Check AF family prior to resolving address and introduce safer rdma_addr_size() variants in ibcore.

Garbage supplied by the user can cause the UCMA module to pass a zero
size to memcpy(). Because that size was not checked, it produces
unpredictable results in rdma_resolve_addr().

There are several places in the ucma ABI where userspace can pass in a
sockaddr but set the address family to AF_IB. When that happens,
rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6,
and the ucma kernel code might end up copying past the end of a buffer
not sized for a struct sockaddr_ib.

Fix this by introducing new variants
int rdma_addr_size_in6(struct sockaddr_in6 *addr);
int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr);

that are type-safe for the types used in the ucma ABI and return 0 if the
size computed is bigger than the size of the type passed in. We can use
these new variants to check what size userspace has passed in before
copying any addresses.
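
The size check described above can be sketched roughly as follows. This is an illustration only, not the ibcore code: every sketch_ name, the 48-byte stand-in for struct sockaddr_ib and the SKETCH_AF_IB value are assumptions made up for the example.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in for struct sockaddr_ib; only its size matters. */
struct sketch_sockaddr_ib {
	unsigned char	sib_bytes[48];
};

/* Hypothetical family value; real systems define their own AF_IB. */
#define SKETCH_AF_IB	250

/* Size implied by the address family alone, or 0 if it is unknown. */
static size_t
sketch_addr_size(int family)
{
	switch (family) {
	case AF_INET:
		return (sizeof(struct sockaddr_in));
	case AF_INET6:
		return (sizeof(struct sockaddr_in6));
	case SKETCH_AF_IB:
		return (sizeof(struct sketch_sockaddr_ib));
	default:
		return (0);
	}
}

/* Variant for a sockaddr_in6-sized buffer: 0 if the family needs more. */
static size_t
sketch_addr_size_in6(const struct sockaddr_in6 *addr)
{
	size_t len = sketch_addr_size(addr->sin6_family);

	return (len <= sizeof(*addr) ? len : 0);
}

/* Variant for a sockaddr_storage-sized buffer, same rule. */
static size_t
sketch_addr_size_storage(const struct sockaddr_storage *addr)
{
	size_t len = sketch_addr_size(addr->ss_family);

	return (len <= sizeof(*addr) ? len : 0);
}
```

A caller would treat a return value of 0 as "reject the request" before copying anything out of the user-supplied buffer.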

Linux commit:
2975d5de6428ff6d9317e9948f0968f7d42e5d74
09abfe7b5b2f442a85f4c4d59ecf582ad76088d7
84652aefb347297aa08e91e283adf7b18f77c2d5

MFC after: 1 week
Sponsored by: Mellanox Technologies


# f4546fa3 17-Jul-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Add support for prio-tagged traffic for RDMA in ibcore.

When receiving a PCP change all GID entries are reloaded.
This ensures the relevant GID entries use prio tagging,
by setting VLAN present and VLAN ID to zero.

The priority for prio tagged traffic is set using the regular
rdma_set_service_type() function.

Fake the real network device to have a VLAN ID of zero
when prio tagging is enabled. This logic is hidden inside
the rdma_vlan_dev_vlan_id() function which must always be used
to retrieve the VLAN ID throughout all of ibcore and the
infiniband network drivers.

The VLAN presence information then propagates through all
of ibcore and so incoming connections will have the VLAN
bit set. The incoming VLAN ID is then checked against the
return value of rdma_vlan_dev_vlan_id().
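
A minimal sketch of the VLAN ID rule described above, under stated assumptions: the struct, the enum and the two marker constants are hypothetical and do not come from ibcore. The interface-type test mirrors the later fix recorded in b4df6efb in this log.

```c
#include <stdint.h>

#define SKETCH_PCP_NONE	0xff	/* hypothetical "no PCP configured" marker */
#define SKETCH_NO_VLAN	0xffff	/* hypothetical "no VLAN" return value */

enum sketch_if_type { SKETCH_ETHER, SKETCH_L2VLAN };

/* Hypothetical, heavily reduced view of a network interface. */
struct sketch_ifnet {
	enum sketch_if_type	type;
	uint8_t			pcp;		/* priority, or SKETCH_PCP_NONE */
	uint16_t		vlan_id;	/* meaningful for SKETCH_L2VLAN */
};

/*
 * Return the VLAN ID to use for an interface. A plain Ethernet device
 * with a PCP configured is treated as prio-tagged and reports VLAN ID 0.
 */
static uint16_t
sketch_vlan_dev_vlan_id(const struct sketch_ifnet *ifp)
{
	if (ifp->type == SKETCH_L2VLAN)
		return (ifp->vlan_id);
	if (ifp->type == SKETCH_ETHER && ifp->pcp != SKETCH_PCP_NONE)
		return (0);
	return (SKETCH_NO_VLAN);
}
```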

MFC after: 1 week
Sponsored by: Mellanox Technologies


# b70db327 17-Jul-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Set RoCEv2 MGID according to spec in ibcore.

RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding
IPv4 address is encoded into the GID according to the following rule:
GID = ::ffff:<IPv4 address>

Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it
zeroed and change rdma_is_multicast_addr() to consider the new logic.
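
The encoding and the adjusted multicast test can be sketched as below. This is an assumption-laden illustration of the rule stated above, not the body of rdma_is_multicast_addr(); all sketch_ names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* A GID is 16 bytes; this struct is a stand-in, not the ibcore type. */
struct sketch_gid {
	uint8_t	raw[16];
};

/* Encode an IPv4 address (four bytes, network order) as ::ffff:a.b.c.d. */
static void
sketch_ipv4_to_gid(const uint8_t ip[4], struct sketch_gid *gid)
{
	memset(gid->raw, 0, sizeof(gid->raw));
	gid->raw[10] = 0xff;
	gid->raw[11] = 0xff;
	memcpy(&gid->raw[12], ip, 4);
}

/* True if the first 12 bytes are the IPv4-mapped prefix ::ffff:0:0/96. */
static int
sketch_gid_is_v4mapped(const struct sketch_gid *gid)
{
	static const uint8_t prefix[12] =
	    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

	return (memcmp(gid->raw, prefix, sizeof(prefix)) == 0);
}

/*
 * Multicast if the GID is native IPv6 multicast (ff00::/8) or an
 * IPv4-mapped address whose IPv4 part lies in 224.0.0.0/4.
 */
static int
sketch_gid_is_multicast(const struct sketch_gid *gid)
{
	if (gid->raw[0] == 0xff)
		return (1);
	return (sketch_gid_is_v4mapped(gid) &&
	    (gid->raw[12] & 0xf0) == 0xe0);
}
```

With the 0xff0e prefix gone, checking the first byte alone no longer identifies an IPv4 multicast GID, which is why the second test is needed.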

Linux commit:
be1d325a335840a86c133a56c6a911c368bac0fd
1c3aea2bc8f0b2e5b57375ead40457ff75a3a2ec

MFC after: 1 week
Sponsored by: Mellanox Technologies


# d7c5a620 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

ifnet: Replace if_addr_lock rwlock with epoch + mutex

Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps  OMpps  InGbs  OGbs      err  TCP Est   %CPU  syscalls      csw   irq  GBfree
4.98     0.00   4.42  0.00  4235592       33  83.80   4720653  2149771  1235  247.32
4.73     0.00   4.20  0.00  4025260       33  82.99   4724900  2139833  1204  247.32
4.72     0.00   4.20  0.00  4035252       33  82.14   4719162  2132023  1264  247.32
4.71     0.00   4.21  0.00  4073206       33  83.68   4744973  2123317  1347  247.32
4.72     0.00   4.21  0.00  4061118       33  80.82   4713615  2188091  1490  247.32
4.72     0.00   4.21  0.00  4051675       33  85.29   4727399  2109011  1205  247.32
4.73     0.00   4.21  0.00  4039056       33  84.65   4724735  2102603  1053  247.32

After the patch

InMpps  OMpps  InGbs  OGbs      err  TCP Est   %CPU  syscalls      csw   irq  GBfree
5.43     0.00   4.20  0.00  3313143       33  84.96   5434214  1900162  2656  245.51
5.43     0.00   4.20  0.00  3308527       33  85.24   5439695  1809382  2521  245.51
5.42     0.00   4.19  0.00  3316778       33  87.54   5416028  1805835  2256  245.51
5.42     0.00   4.19  0.00  3317673       33  90.44   5426044  1763056  2332  245.51
5.42     0.00   4.19  0.00  3314839       33  88.11   5435732  1792218  2499  245.52
5.44     0.00   4.19  0.00  3293228       33  91.84   5426301  1668597  2121  245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366


# bf8641fe 05-Mar-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Get correct network device when accepting incoming RDMA connections in ibcore.

This patch ensures the GID index is always used as a basis of resolving
incoming RDMA connections, as compared to the GID value itself.

Background:
On a per infiniband port basis, the GID identifier is not a unique identifier!
This assumption falls apart when VLAN ID, IPv6 scope ID and RoCE type,
as supported by RoCE v2, is taken into account. This additional
information is stored in the so-called GID attributes and is needed to
correctly identify the destination network interface for an incoming
connection.

Different VLANs are allowed to define the same IPv4 addresses and especially
for the default IPv6 link-local addresses or when using so-called containers
or jails, this is true.

The VNET information for the destination network interface is needed in
order to perform the L2 address lookup in the right Virtual Network Stack
context.

Consequently old functions previously used by RoCE v1, like
rdma_addr_find_smac_by_sgid() are impossible to support, because
there can be multiple identical GIDs associated with the same
infiniband port, and the answer to such a request becomes undefined.
This function has been removed.
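
Why a lookup by GID value is ambiguous while a lookup by GID index is not can be sketched as follows. The table layout and all sketch_ names are hypothetical and only illustrate the argument above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical GID table entry: the GID value plus its attributes. */
struct sketch_gid_entry {
	uint8_t		gid[16];
	uint16_t	vlan_id;
	int		ifindex;	/* stands in for the network interface */
};

/* Count entries carrying a given GID value; more than 1 is ambiguous. */
static int
sketch_count_by_gid(const struct sketch_gid_entry *tbl, size_t n,
    const uint8_t gid[16])
{
	size_t i;
	int hits = 0;

	for (i = 0; i < n; i++)
		if (memcmp(tbl[i].gid, gid, 16) == 0)
			hits++;
	return (hits);
}

/* Resolve by table index: at most one entry can match. */
static const struct sketch_gid_entry *
sketch_lookup_by_index(const struct sketch_gid_entry *tbl, size_t n,
    size_t idx)
{
	return (idx < n ? &tbl[idx] : NULL);
}
```

Two VLANs that reuse the same link-local address give two table entries with identical GID bytes, so only the index selects the intended interface.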

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 09938b21 05-Mar-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Add missing FreeBSD tags and SVN properties to ibcore.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

No functional change intended.


# 860bbba0 09-Nov-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Multiple fixes for using IPv6 link-local addresses with RDMA in ibcore.

1) Fail to resolve RDMA address if rtalloc1() returns the loopback
device, lo0, as the gateway interface. Currently RDMA loopback is
not supported.

2) Use ip_dev_find() and ip6_dev_find() to lookup network interfaces
with matching IPv4 and IPv6 addresses, respectively.

3) In addr_resolve() make sure the "ifa" pointer is always set, also when
the "ifp" is NULL. Else a NULL pointer access might happen trying to
read from the "ifa" pointer later on.

4) In rdma_addr_find_dmac_by_grh() make sure the "bound_dev_if" field
gets set properly instead of passing the scope ID through the IPv6
socket address structure. This is more in line with upstream OFED
in Linux.

5) In rdma_addr_find_smac_by_sgid() there is no need to pass the
scope ID for IPv6. Either it is stored in the "bound_dev_if" field
or ip6_dev_find() will find the correct network device regardless
of the scope ID.

Sponsored by: Mellanox Technologies
MFC after: 1 week


# aacb0377 09-Oct-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Make sure the IPv6 scope ID gets zeroed inside the GID. Else searching for a
valid GID entry based on IPv6 addresses can fail.
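
A rough sketch of the zeroing step, assuming the scope ID is embedded in bytes 2 and 3 of a link-local address, as the KAME-derived kernel representation does. The function names are hypothetical and this is not the ibcore code.

```c
#include <stdint.h>

/* True for fe80::/10, the IPv6 link-local prefix. */
static int
sketch_is_linklocal(const uint8_t addr[16])
{
	return (addr[0] == 0xfe && (addr[1] & 0xc0) == 0x80);
}

/*
 * Assumption: an embedded scope ID occupies bytes 2 and 3 of a
 * link-local address. Clearing it makes the GID compare equal to the
 * plain address stored in the GID table.
 */
static void
sketch_clear_embedded_scope(uint8_t gid[16])
{
	if (sketch_is_linklocal(gid)) {
		gid[2] = 0;
		gid[3] = 0;
	}
}
```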

Sponsored by: Mellanox Technologies
MFC after: 1 week


# 478d3005 14-Jun-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Initial RoCE/infiniband kernel update to Linux v4.9.

This patch currently supports:
- ibcore as a kernel module only
- krping as a kernel module only
- ipoib as a kernel module only

Sponsored by: Mellanox Technologies


# 0bab509b 22-Apr-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

More fixes for using IPv6 addresses with RDMA:

- Added a check that the SCOPE ID is only restored for IPv6 link-local
addresses.

- Changes made by r237263 in the "cma_bind_addr()" function did not
check if the socket address was of type IPv6 and used the IPv4
socket address for IPv6 addresses. This caused the function to
fail. Fixed this.

- In the "rdma_gid2ip()" function and some other places the "sin6_len"
and "sin6_scope_id" fields were not set for IPv6 socket
addresses. Fixed this.

- The scope ID is not stored as part of the GID entries and must be
passed as an argument to "rdma_gid2ip()".

- Added new method to "struct ib_device" which returns a pointer to
the network interface which belongs to the given infiniband
device. This is needed to be able to get the scope ID for IPv6
addresses via the associated ethernet interface.

- Added convenience function, "rdma_get_ipv6_scope_id()", to get the
scope ID for IPv6 addresses.

- Implemented new "get_netdev" method for mlx4ib. Other IB controller
drivers which want to support IPv6 addresses need to implement this
as well.

- Bumped the FreeBSD version due to changing "struct ib_device".

Sponsored by: Mellanox Technologies
MFC after: 1 week


# b5c1e0cb 17-Feb-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Update the infiniband stack to Mellanox's OFED version 2.1.

Highlights:
- Multiple verbs API updates
- Support for RoCE, RDMA over ethernet

All hardware drivers depending on the common infiniband stack have been
updated as well.

Discussed with: np @
Sponsored by: Mellanox Technologies
MFC after: 1 month


# 2c6eb461 15-Oct-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Update the OFED Linux compatibility layer and
Mellanox hardware driver(s):

- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
cancelling and starting a timeout.
- Implement more Linux scatterlist
functions.

MFC after: 3 days
Sponsored by: Mellanox Technologies


# b245f96c 12-Mar-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Since a 32-bit if_baudrate isn't enough to describe the baud rate of a 10 Gbit
interface, a crutch was provided in r241616. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data a fixed, machine-independent size. The
data it carries (packet counters, etc.) is by no means MD, and it is a
bug that on amd64 we have 64-bit counters while on i386 they are 32-bit,
which at modern speeds overflow within a second.

This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bits to the ifi_datalen field. This field was provided to
make future changes to if_data less ABI breaking. Unfortunately the
8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.
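
The arithmetic behind the 32-bit limit can be checked with a small sketch. The helper names are hypothetical; only the numbers matter: a 32-bit unsigned field tops out at 4294967295, below the 10000000000 bits per second of a 10 Gbit interface.

```c
#include <stdint.h>

#define SKETCH_10GBIT_BPS	UINT64_C(10000000000)

/* What a 32-bit field would hold after storing the given rate. */
static uint32_t
sketch_store_u32(uint64_t bps)
{
	return ((uint32_t)bps);
}

/* True if the rate can be represented in a 32-bit field unchanged. */
static int
sketch_fits_u32(uint64_t bps)
{
	return (bps <= UINT32_MAX);
}
```

Storing SKETCH_10GBIT_BPS in 32 bits silently wraps to 1410065408, which is why the field had to grow to 64 bits.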

__FreeBSD_version bumped.

Discussed with: emax
Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 345a7955 18-Oct-2012 John Baldwin <jhb@FreeBSD.org>

Take advantage of if_baudrate_pf and calculate an effective baud rate on
all platforms (not just amd64) to compute an equivalent IB rate.


# f394ce6e 21-Mar-2011 Konstantin Belousov <kib@FreeBSD.org>

Allow the ofed modules to be compiled on i386.

Reviewed by: jeff


# aa0a1e58 21-Mar-2011 Jeff Roberson <jeff@FreeBSD.org>

- Merge in OFED 1.5.3 from projects/ofed/head