History log of /linux-master/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
Revision Date Author Comments
# b25bd37c 06-Aug-2023 Tariq Toukan <tariqt@nvidia.com>

net/mlx5: Move TISes from priv to mdev HW resources

The transport interface send (TIS) object is responsible for performing
all transport related operations of the transmit side. Messages from
Send Queues get segmented and transmitted by the TIS including all
transport required implications, e.g. in the case of large send offload,
the TIS is responsible for the segmentation.

These are stateless objects and can be used by multiple netdevs (e.g.
representors) who share the same core device.

Providing the TISes as a service from the core layer to the netdev layer
reduces the number of replecated TIS objects (in case of multiple
netdevs), and will ease the transition to netdev with multiple mdevs.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 31c70bfe 25-Nov-2022 Dragos Tatulea <dtatulea@nvidia.com>

net/mlx5e: IPoIB, Block PKEY interfaces with less rx queues than parent

A user is able to configure an arbitrary number of rx queues when
creating an interface via netlink. This doesn't work for child PKEY
interfaces because the child interface uses the parent receive channels.

Although the child shares the parent's receive channels, the number of
rx queues is important for the channel_stats array: the parent's rx
channel index is used to access the child's channel_stats. So the array
has to be at least as large as the parent's rx queue size for the
counting to work correctly and to prevent out of bound accesses.

This patch checks for the mentioned scenario and returns an error when
trying to create the interface. The error is propagated to the user.

Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 806a8df7 15-Dec-2022 Dragos Tatulea <dtatulea@nvidia.com>

net/mlx5e: IPoIB, Block queue count configuration when sub interfaces are present

PKEY sub interfaces share the receive queues with the parent interface.
While setting the sub interface queue count is not supported, it is
currently possible to change the number of queues of the parent interface.
Thus we can end up with inconsistent queue sizes between the parent and its
sub interfaces.

This change disallows setting the queue count on the parent interface when
sub interfaces are present.

This is achieved by introducing an explicit reference to the parent netdev
in the mlx5i_priv of the child interface. An additional counter is also
required on the parent side to detect when sub interfaces are attached and
for proper cleanup.

The rtnl lock is taken during the ethtool op and the sub interface
ndo_init/uninit ops. There is no race here around counting the sub
interfaces, reading the sub interfaces and setting the number of
channels. The ASSERT_RTNL was added to document that.

Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# e74ae1fa 29-Sep-2022 Guy Truzman <gtruzman@nvidia.com>

net/mlx5e: Add error flow when failing update_rx

Up until now, return value of update_rx was ignored. Therefore, flow
continues even if it fails. Add error flow in case of update_rx fails in
mlx5e_open_locked, mlx5i_open and mlx5i_pkey_open.

Signed-off-by: Guy Truzman <gtruzman@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 3db4c85c 30-Sep-2022 Maxim Mikityanskiy <maximmi@nvidia.com>

net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues

In the initial implementation of XSK in mlx5e, XSK RQs coexisted with
regular RQs in the same channel. The main idea was to allow RSS work the
same for regular traffic, without need to reconfigure RSS to exclude XSK
queues.

However, this scheme didn't prove to be beneficial, mainly because of
incompatibility with other vendors. Some tools don't properly support
using higher indices for XSK queues, some tools get confused with the
double amount of RQs exposed in sysfs. Some use cases are purely XSK,
and allocating the same amount of unused regular RQs is a waste of
resources.

This commit changes the queuing scheme to the standard one, where XSK
RQs replace regular RQs on the channels where XSK sockets are open. Two
RQs still exist in the channel to allow failsafe disable of XSK, but
only one is exposed at a time. The next commit will achieve the desired
memory save by flushing the buffers when the regular RQ is unused.

As the result of this transition:

1. It's possible to use RSS contexts over XSK RQs.

2. It's possible to dedicate all queues to XSK.

3. When XSK RQs coexist with regular RQs, the admin should make sure no
unwanted traffic goes into XSK RQs by either excluding them from RSS or
settings up the XDP program to return XDP_PASS for non-XSK traffic.

4. When using a mixed fleet of mlx5e devices and other netdevs, the same
configuration can be applied. If the application supports the fallback
to copy mode on unsupported drivers, it will work too.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 6c72cb05 04-Dec-2021 Tariq Toukan <tariqt@nvidia.com>

net/mlx5e: Use bitmap field for profile features

Use a features bitmap field in mlx5e_profile to declare profile support
state of the different features. Let it replace the existing
rx_ptp_support boolean. It will be extended to cover more features in a
downstream patch.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# a7605370 27-Jul-2021 Arnd Bergmann <arnd@arndb.de>

dev_ioctl: split out ndo_eth_ioctl

Most users of ndo_do_ioctl are ethernet drivers that implement
the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware
timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP.

Separate these from the few drivers that use ndo_do_ioctl to
implement SIOCBOND, SIOCBR and SIOCWANDEV commands.

This is a purely cosmetic change intended to help readers find
their way through the implementation.

Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3adb60b6 25-Feb-2021 Aya Levin <ayal@nvidia.com>

net:mlx5e: Add PTP-TIR and PTP-RQT

Add PTP-TIR and initiate its RQT to allow PTP-RQ to integrate into the
safe-reopen flow on configuration change. Add rx_ptp_support flag on a
profile and turn it on for ETH driver. With this flag set, create a
redirect-RQT for PTP-RQ.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


# 3ef14e46 25-Feb-2020 Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Separate between netdev objects and mlx5e profiles initialization

1) Initialize netdevice features and structures on netdevice allocation
and outside of the mlx5e profile.

2) As now mlx5e netdevice private params will be setup on profile init only
after netdevice features are already set, we add a call to
netde_update_features() to resolve any conflict.
This is nice since we reuse the fix_features ndo code if a profile
wants different default features, instead of duplicating features
conflict resolution code on profile initialization.

3) With this we achieve total separation between mlx5e profiles and
netdevices, and will allow replacing mlx5e profiles on the fly to reuse
the same netdevice for multiple profiles.
e.g. for uplink representor profile as shown in the following patch

4) Profile callbacks are not allowed to touch netdev->features directly
anymore, since in downstream patch we will detach/attach netdev
dynamically to profile, hence we move the code dealing with
netdev->features from profile->init() to fix_features ndo, and we
will call netdev_update_features() on
mlx5e_attach_netdev(profile, netdev);

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>


# 5adf4c475 30-Apr-2020 Tariq Toukan <tariqt@mellanox.com>

net/mlx5e: RX, Re-work initializaiton of RX function pointers

Instead of exposing the RQ datapath handlers (from en_rx.c) so that
they are set in the control path (in en_main.c), wrap this logic
in a single function in en_rx.c and expose it alone.

Every profile will now have a pointer to the new mlx5e_rx_handlers
structure, instead of directly pointing to the previously-exposed
RQ handlers.

This significantly improves locality and modularity of the driver,
and allows many functions in en_rx.c to become static.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 80639b19 03-May-2020 Erez Shitrit <erezsh@mellanox.com>

net/mlx5e: IPoIB, Enable loopback packets for IPoIB interfaces

Enable loopback of unicast and multicast traffic for IPoIB enhanced
mode.
This will allow interfaces with the same pkey to communicate between
them e.g cloned interfaces that located in different namespaces.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# f93f4f4f 06-Apr-2020 Leon Romanovsky <leon@kernel.org>

net/mlx5: Remove extra indirection while storing QPN

The FPGA, SW steering and IPoIB need to have only QPN from the
mlx5_core_qp struct, so reduce memory footprint by storing QPN
directly.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>


# 45f171b1 07-Aug-2019 Maxim Mikityanskiy <maximmi@mellanox.com>

net/mlx5e: Support LAG TX port affinity distribution

When the VF LAG is in use, round-robin the TX affinity of channels among
the different ports, if supported by the firmware. Create a set of TISes
per port, while doing round-robin of the channels over the different
sets. Let all SQs of a channel share the same set of TISes.

If lag_tx_port_affinity HCA cap bit is supported, num_lag_ports > 1 and
we aren't the LACP owner (PF in the regular use), assign the affinities,
otherwise use tx_affinity == 0 in TIS context to let the FW assign the
affinities itself. The TISes of the LACP owner are mapped only to the
native physical port.

For VFs, the starting port for round-robin is determined by its vhca_id,
because a VF may have only one channel if attached to a single-core VM.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 694826e3 14-Jul-2019 Tariq Toukan <tariqt@mellanox.com>

net/mlx5e: Fix wrong max num channels indication

No XSK support in the enhanced IPoIB driver and representors.
Add a profile property to specify this, and enhance the logic
that calculates the max number of channels to take it into
account.

Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 2b257a6e 05-Jul-2019 Tariq Toukan <tariqt@mellanox.com>

net/mlx5e: Re-work TIS creation functions

Let the EN TIS creation function (mlx5e_create_tis) be responsible
for applying common mdev related fields.
Other specific fields must be set by the caller and passed within
the inbox.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a90f88fe 23-May-2019 Gavi Teitz <gavi@mellanox.com>

net/mlx5e: Don't refresh TIRs when updating representor SQs

Refreshing TIRs is done in order to update the TIRs with the current
state of SQs in the transport domain, so that the TIRs can filter out
undesired self-loopback packets based on the source SQ of the packet.

Representor TIRs will only receive packets that originate from their
associated vport, due to dedicated steering, and therefore will never
receive self-loopback packets, whose source vport will be the vport of
the E-Switch manager, and therefore not the vport associated with the
representor. As such, it is not necessary to refresh the representors'
TIRs, since self-loopback packets can't reach them.

Since representors only exist in switchdev mode, and there is no
scenario in which a representor will exist in the transport domain
alongside a non-representor, it is not necessary to refresh the
transport domain's TIRs upon changing the state of a representor's
queues. Therefore, do not refresh TIRs upon such a change. Achieve
this by adding an update_rx callback to the mlx5e_profile, which
refreshes TIRs for non-representors and does nothing for representors,
and replace instances of mlx5e_refresh_tirs() upon changing the state
of the queues with update_rx().

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 779d986d 06-Sep-2018 Feras Daoud <ferasda@mellanox.com>

net/mlx5e: Do not ignore netdevice TX/RX queues number

The current design of mlx5e driver ignores the netdevice TX/RX queues
number for netdevices that RDMA IPoIB ULP creates. Instead, the queue
number is initialized to the maximum number that mlx5 thinks best for
performance. As a result, ULP drivers that choose to create a netdevice
with queue number that is less than the maximum channels mlx5 creates,
will get a memory corruption.

This fix changes the mlx5e netdev logic to respect ULP netdevices TX/RX
queue number and use it when creating resources instead of the maximum
channel number.

Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 182570b2 02-Oct-2018 Feras Daoud <ferasda@mellanox.com>

net/mlx5e: Gather common netdev init/cleanup functionality in one place

Introduce a helper init/cleanup function that initializes mlx5e generic
netdev private structure, and use them from all profiles init/cleanup
callbacks.

This patch will also be helpful to initialize/cleanup netdevs that are
not created by mlx5 driver, e.g: accelerated ipoib child netdevs.

Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# b75ba382 02-Sep-2018 Feras Daoud <ferasda@mellanox.com>

net/mlx5e: IPoIB, Add ndo stats support for IPoIB child devices

Expose RX and TX counters by implementing ndo_get_stats64 operation
for child devices.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 73281b78 11-Feb-2018 Tariq Toukan <tariqt@mellanox.com>

net/mlx5e: Derive Striding RQ size from MTU

In Striding RQ, each WQE serves multiple packets
(hence called Multi-Packet WQE, MPWQE).
The size of a MPWQE is constant (currently 256KB).

Upon a ringparam set operation, we calculate the number of
MPWQEs per RQ. For this, first it is needed to determine the
number of packets that can reside within a single MPWQE.
In this patch we use the actual MTU size instead of ETH_DATA_LEN
for this calculation.

This implies that a change in MTU might require a change
in Striding RQ ring size.

In addition, this obsoletes some WQEs-to-packets translation
functions and helps delete ~60 LOC.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 08437c57 31-Oct-2017 Feras Daoud <ferasda@mellanox.com>

net/mlx5e: IPoIB, Add PTP ioctl support for child interface

Add support to control precision time protocol on child interfaces
using ioctl.

This commit changes the following:
- Change parent ioctl function to be non static
- Reuse the parent ioctl function in child devices

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


# 6a910233 14-Sep-2017 Alex Vesker <valex@mellanox.com>

net/mlx5e: IPoIB, Add PKEY child interface ethtool ops

Similar to VLAN interfaces child interfaces have limited ethtool
support. In current code the main limitation that does not
allow child interface ethtool configuration is due to shared
resources which are managed by the parent.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>


# af98cebc 14-Sep-2017 Alex Vesker <valex@mellanox.com>

net/mlx5e: IPoIB, Add PKEY child interface ndos

Child interface ndos will be called to support child interface
specific behaviour.

ndo_init flow:
-Acquire shared QPN to net-device HT from parent
-Continue with the same flow as parent interface

ndo_open flow:
-Initialize child underlay QP and connect to shared FT
-Create child send TIS
-Open child send channels

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>


# 4c6c615e 14-Sep-2017 Alex Vesker <valex@mellanox.com>

net/mlx5e: IPoIB, Add PKEY child interface nic profile

Child interface profile will be called to support child interface
specific behaviour. The child code is sparse compared to the parent
since the RX channels are shared between the interfaces.
Creating a septate profile for child and parent will make a smother
code with a better ability for future expansion.
The profile stuct is exposed to the parent using a getter function.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>


# 7e7f4780 14-Sep-2017 Alex Vesker <valex@mellanox.com>

net/mlx5e: IPoIB, Use hash-table to map between QPN to child netdev

This change is needed for PKEY support, since the RQs are shared
between the child interface and the parent. The parent is responsible
for NAPI and the precessing of RX completions. Using the dqpn in the
completion descriptor we set the corresponding child IPoIB netdevice
on the SKB.
The mapping between the dqpn and the netdevice is done using a HT,
each mlx5 IPoIB interface registers its mapping on creation.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>