History log of /linux-master/drivers/infiniband/core/roce_gid_mgmt.c
Revision Date Author Comments
# ca325edb 21-Jul-2022 Slark Xiao <slark_xiao@163.com>

IB: Fix repeated words 'the the' comments

Replace 'the the' with 'the' in the comments.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# bf194997 16-Jun-2021 Leon Romanovsky <leon@kernel.org>

RDMA: Fix kernel-doc warnings about wrong comment

Compilation with W=1 produces warnings similar to the below.

drivers/infiniband/ulp/ipoib/ipoib_main.c:320: warning: This comment
starts with '/**', but isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst

All such occurrences were found with the following one line
git grep -A 1 "\/\*\*" drivers/infiniband/

Link: https://lore.kernel.org/r/e57d5f4ddd08b7a19934635b44d6d632841b9ba7.1623823612.git.leonro@nvidia.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com> #rtrs
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 1fb7f897 01-Mar-2021 Mark Bloch <mbloch@nvidia.com>

RDMA: Support more than 255 rdma ports

Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs. HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.

While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# d71f5fa2 18-Jan-2021 Lee Jones <lee.jones@linaro.org>

RDMA/core/roce_gid_mgmt: Fix misnaming of 'rdma_roce_rescan_device()'s param 'ib_dev'

Fixes the following W=1 kernel build warning(s):

drivers/infiniband/core/roce_gid_mgmt.c:511: warning: Function parameter or member 'ib_dev' not described in 'rdma_roce_rescan_device'
drivers/infiniband/core/roce_gid_mgmt.c:511: warning: Excess function parameter 'device' description in 'rdma_roce_rescan_device'

Link: https://lore.kernel.org/r/20210118223929.512175-9-lee.jones@linaro.org
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# eff74233 25-Sep-2020 Taehee Yoo <ap420073@gmail.com>

net: core: introduce struct netdev_nested_priv for nested interface infrastructure

Functions related to nested interface infrastructure such as
netdev_walk_all_{ upper | lower }_dev() pass both private functions
and "data" pointer to handle their own things.
At this point, the data pointer type is void *.
In order to make it easier to expand common variables and functions,
this new netdev_nested_priv structure is added.

In the following patch, a new member variable will be added into this
struct to fix the lockdep issue.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# cb8f1478 31-May-2019 Florian Westphal <fw@strlen.de>

drivers: use in_dev_for_each_ifa_rtnl/rcu

Like previous patches, use the new iterator macros to avoid sparse
warnings once proper __rcu annotations are added.

Compile tested only.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 37fbd834 05-Dec-2018 Mark Zhang <markz@mellanox.com>

IB/core: Fix oops in netdev_next_upper_dev_rcu()

When support for bonding of RoCE devices was added, there was
necessarily a link between the RoCE device and the paired netdevice that
was part of the bond. If you remove the mlx4_en module, that paired
association is broken (the RoCE device is still present but the paired
netdevice has been released). We need to account for this in
is_upper_ndev_bond_master_filter() and filter out those links with a
broken pairing or else we later oops in netdev_next_upper_dev_rcu().

Fixes: 408f1242d940 ("IB/core: Delete lower netdevice default GID entries in bonding scenario")
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# d52ef88a 19-Nov-2018 Parav Pandit <parav@mellanox.com>

RDMA/core: Add GIDs while changing MAC addr only for registered ndev

Currently when MAC address is changed, regardless of the netdev reg_state,
GID entries are removed and added to reflect the new MAC address and new
default GID entries.

When a bonding device is used and the underlying PCI device is removed
several netdevice events are generated. Two events of the interest are
CHANGEADDR and UNREGISTER event on lower(slave) netdevice of the bond
netdevice.

Sometimes CHANGEADDR event is generated when netdev state is
UNREGISTERING (after UNREGISTER event is generated). In this scenario, GID
entries for default GIDs are added and never deleted because GID entries
are deleted only when netdev state is < UNREGISTERED.

This leads to non zero reference count on the netdevice. Due to this, PCI
device unbind operation is getting stuck.

To avoid it, when changing mac address, add GID entries only if netdev is
in REGISTERED state.

Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# dd81b2c8 14-Aug-2018 Parav Pandit <parav@mellanox.com>

IB/core: Change filter function return type from int to bool

Filter functions returns either 0 or 1, therefore better change their
return type from int to bool to reflect the same. Additionally some
filter functions have suffix of _filter some doesn't. Make all filter
function consistent to have __filter suffix to improve code readability.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# d12e2eed 14-Aug-2018 Parav Pandit <parav@mellanox.com>

IB/core: Update GID entries for netdevice whose mac address changes

Update all GID table entries of the netdevice whose MAC address changed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 464b79b4 14-Aug-2018 Parav Pandit <parav@mellanox.com>

IB/core: Add default GIDs of the bond master netdev

Currently following issues exist:
1. Default GIDs of the lower (slave) netdevice if the bond netdevice is
added. Rather default GID should be of bond master netdevice.
2. Due to this, when failover event occurs FAILOVER event handler attempts
to delete the GID of the upper device and tries to add the default GID
of the lower device. This is incorrect behavior.

To have simple and correct code:
(a) Split default GIDs addition out of add_netdev_ips(). This allows
easier removal in future if RoCE default GIDs are removed.
(b) Add default GIDs of the bond master device by using right filter and
callback function.
(c) Remove unused function enum_netdev_default_gids().

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# a03d4d277 14-Aug-2018 Parav Pandit <parav@mellanox.com>

IB/core: Consider adding default GIDs of bond device

Now that we correctly delete the default GIDs of lower devices during
CHANGEUPPER event, add default GIDs of the bonding master device.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 408f1242 14-Aug-2018 Parav Pandit <parav@mellanox.com>

IB/core: Delete lower netdevice default GID entries in bonding scenario

When NETDEV_CHANGEUPPER event occurs, lower device is not yet established
as slave of the master, and when upper device is bond device, default GID
entries not deleted.

Due to this, when bond device is fully configured, default GID entries of
bond device cannot be added as default GID entries are occupied by the
lower netdevice. This is incorrect.

Default GID entries should really be of bond netdevice because in all RoCE
GIDs (default or IP), MAC address of the bond device will be used. It is
confusing to have default GID of netdevice which is not really used for
any purpose.

Therefore, as first step, implement
(a) filter function which filters if a CHANGEUPPER event netdevice and
associated upper device is master device or not.
(b) callback function which deletes the default GIDs of lower (event
netdevice).

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# b9f09866 14-Aug-2018 Parav Pandit <parav@mellanox.com>

IB/core: Avoid confusing del_netdev_default_ips

Currently bond_delete_netdev_default_gids() is called by two callers.
(a) del_netdev_default_ips_join()
(b) del_netdev_default_ips()

Both above functions changes the argument order while calling
bond_delete_netdev_default_gids(). This required silly
del_netdev_default_ips() wrapper.

Additionally, del_netdev_default_ips() deletes default GIDs not IP based
GIDs. del_netdev_default_ips() having _ips suffix is confusing.

Therefore, get rid of confusing del_netdev_default_ips() and simplify
bond_delete_netdev_default_gids() to follow same argument order as its
caller.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 666e7099 14-Aug-2018 Parav Pandit <parav@mellanox.com>

IB/core: Add comment for change upper netevent handling

Add comment for handling CHANGEUPPER netevent handling.
To improve code readability,
(a) move cmd definitions to its respective if-else branches,
(b) avoid single line structure definitions.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 9906224f 21-May-2018 Parav Pandit <parav@mellanox.com>

IB/core: Remove duplicate declaration of gid_cache_wq

Remove duplicate declaration of gid_cache_wq.

Fixes: d41861942 ("IB/core: Add generic function to extract IB speed from netdev")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# dc5640f2 23-Apr-2018 Parav Pandit <parav@mellanox.com>

IB/core: Fix deleting default GIDs when changing mac adddress

Before [1], When MAC address of the netdevice is changed, default GID is
supposed to get deleted and added back which affects the node and/or port
GUID in below sequence.

netdevice_event()
-> NETDEV_CHANGEADDR
default_del_cmd()
del_netdev_default_ips()
bond_delete_netdev_default_gids()
ib_cache_gid_set_default_gid()
ib_cache_gid_del()
add_cmd()
[..]

However, ib_cache_gid_del() was not getting invoked in non bonding
scenarios because event_ndev and rdma_ndev are same.
Therefore, fix such condition to ignore checking upper device when event
ndev and rdma_dev are same; similar to bond_set_netdev_default_gids().

Which this fix ib_cache_gid_del() is invoked correctly; however
ib_cache_gid_del() doesn't find the default GID for deletion because
find_gid() was given default_gid = false with
GID_ATTR_FIND_MASK_DEFAULT set.
But it was getting overwritten by ib_cache_gid_set_default_gid() later
on as part of add_cmd().
Therefore, mac address change used to work for default GID.

With refactor series [1], this incorrect behavior is detected.

Therefore,
when deleting default GID, set default_gid and set MASK flag.
when deleting IP based GID, clear default_gid and set MASK flag.

[1] https://patchwork.kernel.org/patch/10319151/

Fixes: 238fdf48f2b5 ("IB/core: Add RoCE table bonding support")
Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# f0b07bb1 29-Mar-2018 Kirill Tkhai <ktkhai@virtuozzo.com>

net: Introduce net_rwsem to protect net_namespace_list

rtnl_lock() is used everywhere, and contention is very high.
When someone wants to iterate over alive net namespaces,
he/she has no a possibility to do that without exclusive lock.
But the exclusive rtnl_lock() in such places is overkill,
and it just increases the contention. Yes, there is already
for_each_net_rcu() in kernel, but it requires rcu_read_lock(),
and this can't be sleepable. Also, sometimes it may be need
really prevent net_namespace_list growth, so for_each_net_rcu()
is not fit there.

This patch introduces new rw_semaphore, which will be used
instead of rtnl_mutex to protect net_namespace_list. It is
sleepable and allows not-exclusive iterations over net
namespaces list. It allows to stop using rtnl_lock()
in several places (what is made in next patches) and makes
less the time, we keep rtnl_mutex. Here we just add new lock,
while the explanation of we can remove rtnl_lock() there are
in next patches.

Fine grained locks generally are better, then one big lock,
so let's do that with net_namespace_list, while the situation
allows that.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 32f69e4b 04-Jan-2018 Daniel Jurgens <danielj@mellanox.com>

{net, IB}/mlx5: Manage port association for multiport RoCE

When mlx5_ib_add is called determine if the mlx5 core device being
added is capable of dual port RoCE operation. If it is, determine
whether it is a master device or a slave device using the
num_vhca_ports and affiliate_nic_vport_criteria capabilities.

If the device is a slave, attempt to find a master device to affiliate it
with. Devices that can be affiliated will share a system image guid. If
none are found place it on a list of unaffiliated ports. If a master is
found bind the port to it by configuring the port affiliation in the NIC
vport context.

Similarly when mlx5_ib_remove is called determine the port type. If it's
a slave port, unaffiliate it from the master device, otherwise just
remove it from the unaffiliated port list.

The IB device is registered as a multiport device, even if a 2nd port is
not available for affiliation. When the 2nd port is affiliated later the
GID cache must be refreshed in order to get the default GIDs for the 2nd
port in the cache. Export roce_rescan_device to provide a mechanism to
refresh the cache after a new port is bound.

In a multiport configuration all IB object (QP, MR, PD, etc) related
commands should flow through the master mlx5_core_dev, other commands
must be sent to the slave port mlx5_core_mdev, an interface is provide
to get the correct mdev for non IB object commands.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# 908d6460 04-Jan-2018 Daniel Jurgens <danielj@mellanox.com>

IB/core: Change roce_rescan_device to return void

It always returns 0. Change return type to void.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# d4186194 14-Jun-2017 Yuval Shaia <yuval.shaia@oracle.com>

IB/core: Add generic function to extract IB speed from netdev

Logic of retrieving netdev speed from net_device and translating it to
IB speed is implemented in rxe, in usnic and in bnxt drivers.

Define new function which merges all.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Christian Benvenuti <benve@cisco.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 8fe8bacb 30-May-2017 Majd Dibbiny <majd@mellanox.com>

IB/core: Add ordered workqueue for RoCE GID management

Currently the RoCE GID management uses the ib_wq to do add and delete new GIDs
according to the netdev events.

The ib_wq isn't an ordered workqueue and thus two work elements can be executed
concurrently which will result in unexpected behavior and inconsistency of the
GIDs cache content.

Example:
ifconfig eth1 11.11.11.11/16 up

This command will invoke the following netdev events in the following order:
1. NETDEV_UP
2. NETDEV_DOWN
3. NETDEV_UP

If (2) and (3) will be executed concurrently or in reverse order, instead of
having a new GID with 11.11.11.11 IP, we will end up without any new GIDs.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 59039608 01-Feb-2017 Parav Pandit <parav@mellanox.com>

IB/core: Remove pointer casting from void to net_device

This patch avoids unnecessary type casting from void to net_device.

CC: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# a0b3455f 03-Nov-2016 Leon Romanovsky <leon@kernel.org>

IB/core: Remove debug prints after allocation failure

The prints after [k|v][m|z|c]alloc() functions are not needed,
because in case of failure, allocator will print their internal
error prints anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 453d3932 17-Oct-2016 David Ahern <dsa@cumulusnetworks.com>

IB/core: Flip to the new dev walk API

Convert rdma_is_upper_dev_rcu, handle_netdev_upper and
ipoib_get_net_dev_match_addr to the new upper device walk API.
This is just a code conversion; no functional change is intended.

v2
- removed typecast of data

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6020d7e5 23-Dec-2015 Matan Barak <matanb@mellanox.com>

IB/core: Move rdma_is_upper_dev_rcu to header file

In order to validate the route, we need an easy way to check if a
net-device belongs to our RDMA device. Move this helper function
to a header file in order to make this check easier.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 7766a99f 23-Dec-2015 Matan Barak <matanb@mellanox.com>

IB/core: Add ROCE_UDP_ENCAP (RoCE V2) type

Adding RoCE v2 GID type and port type. Vendors
which support this type will get their GID table
populated with RoCE v2 GIDs automatically.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# b39ffa1d 23-Dec-2015 Matan Barak <matanb@mellanox.com>

IB/core: Add gid_type to gid attribute

In order to support multiple GID types, we need to store the gid_type
with each GID. This is also aligned with the RoCE v2 annex "RoCEv2 PORT
GID table entries shall have a "GID type" attribute that denotes the L3
Address type". The currently supported GID is IB_GID_TYPE_IB which is
also RoCE v1 GID type.

This implies that gid_type should be added to roce_gid_table meta-data.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 39096420 15-Oct-2015 Matan Barak <matanb@mellanox.com>

IB/core: Fix use after free of ifa

When using ifup/ifdown while executing enum_netdev_ipv4_ips,
ifa could become invalid and cause use after free error.
Fixing it by protecting with RCU lock.

Fixes: 03db3a2d81e6 ('IB/core: Add RoCE GID table management')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 238fdf48 30-Jul-2015 Matan Barak <matanb@mellanox.com>

IB/core: Add RoCE table bonding support

Handling bonding and other devices require us to all all GIDs of the
net-devices which are upper-devices of the RoCE port related
net-device.

Active-backup configurations imposes even more challenges as the
default GID should only be set on the active devices (this is
necessary as otherwise the same MAC could be used for several
slaves and thus several slaves will have identical GIDs).

Managing these configurations are done by listening to:
(a) NETDEV_CHANGEUPPER event
(1) if a related net-device is linked, delete all inactive
slaves default GIDs and add the upper device GIDs.
(2) if a related net-device is unlinked, delete all upper GIDs
and add the default GIDs.
(b) NETDEV_BONDING_FAILOVER:
(1) delete the bond GIDs from inactive slaves
(2) delete the inactive slave's default GIDs
(3) Add the bond GIDs to the active slave.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 03db3a2d 30-Jul-2015 Matan Barak <matanb@mellanox.com>

IB/core: Add RoCE GID table management

RoCE GIDs are based on IP addresses configured on Ethernet net-devices
which relate to the RDMA (RoCE) device port.

Currently, each of the low-level drivers that support RoCE (ocrdma,
mlx4) manages its own RoCE port GID table. As there's nothing which is
essentially vendor specific, we generalize that, and enhance the RDMA
core GID cache to do this job.

In order to populate the GID table, we listen for events:

(a) netdev up/down/change_addr events - if a netdev is built onto
our RoCE device, we need to add/delete its IPs. This involves
adding all GIDs related to this ndev, add default GIDs, etc.

(b) inet events - add new GIDs (according to the IP addresses)
to the table.

For programming the port RoCE GID table, providers must implement
the add_gid and del_gid callbacks.

RoCE GID management requires us to state the associated net_device
alongside the GID. This information is necessary in order to manage
the GID table. For example, when a net_device is removed, its
associated GIDs need to be removed as well.

RoCE mandates generating a default GID for each port, based on the
related net-device's IPv6 link local. In contrast to the GID based on
the regular IPv6 link-local (as we generate GID per IP address),
the default GID is also available when the net device is down (in
order to support loopback).

Locking is done as follows:
The patch modify the GID table code both for new RoCE drivers
implementing the add_gid/del_gid callbacks and for current RoCE and
IB drivers that do not. The flows for updating the table are
different, so the locking requirements are too.

While updating RoCE GID table, protection against multiple writers is
achieved via mutex_lock(&table->lock). Since writing to a table
requires us to find an entry (possible a free entry) in the table and
then modify it, this mutex protects both the find_gid and write_gid
ensuring the atomicity of the action.
Each entry in the GID cache is protected by rwlock. In RoCE, writing
(usually results from netdev notifier) involves invoking the vendor's
add_gid and del_gid callbacks, which could sleep.
Therefore, an invalid flag is added for each entry. Updates for RoCE are
done via a workqueue, thus sleeping is permitted.

In IB, updates are done in write_lock_irq(&device->cache.lock), thus
write_gid isn't allowed to sleep and add_gid/del_gid are not called.

When passing net-device into/out-of the GID cache, the device
is always passed held (dev_hold).

The code uses a single work item for updating all RDMA devices,
following a netdev or inet notifier.

The patch moves the cache from being a client (which was incorrect,
as the cache is part of the IB infrastructure) to being explicitly
initialized/freed when a device is registered/removed.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>