History log of /linux-master/drivers/net/ethernet/microsoft/mana/mana_en.c
Revision Date Author Comments
# c0de6ab9 02-Apr-2024 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Fix Rx DMA datasize and skb_over_panic

mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be a
multiple of 64. So a packet slightly bigger than mtu+14, say 1536
bytes, can be received and cause skb_over_panic.

Sample dmesg:
[ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL>
[ 5325.243689] ------------[ cut here ]------------
[ 5325.245748] kernel BUG at net/core/skbuff.c:192!
[ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60
[ 5325.302941] Call Trace:
[ 5325.304389] <IRQ>
[ 5325.315794] ? skb_panic+0x4f/0x60
[ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30
[ 5325.319490] ? skb_panic+0x4f/0x60
[ 5325.321161] skb_put+0x4e/0x50
[ 5325.322670] mana_poll+0x6fa/0xb50 [mana]
[ 5325.324578] __napi_poll+0x33/0x1e0
[ 5325.326328] net_rx_action+0x12e/0x280

As discussed internally, this alignment is not necessary. To fix
this bug, remove it from the code. Oversized packets will then be
marked as CQE_RX_TRUNCATED by the NIC and dropped.
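
The change is essentially dropping the rounding step. A minimal sketch of
the idea (illustrative only; the variable name and exact expression in
mana_get_rxbuf_cfg() are assumptions, not the literal patch):

	/* Before: datasize rounded up, so the NIC could DMA more bytes
	 * than the skb buffer was sized for.
	 */
	*datasize = ALIGN(mtu + ETH_HLEN, 64);

	/* After: use the exact size; anything larger is reported by the
	 * NIC as CQE_RX_TRUNCATED and dropped.
	 */
	*datasize = mtu + ETH_HLEN;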

Cc: stable@vger.kernel.org
Fixes: 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1712087316-20886-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 02fed6d9 13-Dec-2023 Konstantin Taranov <kotaranov@microsoft.com>

net: mana: add msix index sharing between EQs

This patch allows assigning and polling more than one EQ on the same
msix index.
It is achieved by introducing a list of attached EQs in each IRQ context.
It also removes the existing msix_index map that tried to ensure there
is only one EQ at each msix_index.
This patch exports symbols for creating EQs from other MANA kernel modules.
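
A rough sketch of the per-IRQ bookkeeping this implies (struct and field
names are assumptions, not the literal patch):

	struct mana_irq_context {		/* one per msix index */
		struct list_head eq_list;	/* EQs attached to this IRQ */
		spinlock_t	 lock;		/* protects eq_list */
	};

	/* The interrupt handler then walks eq_list instead of assuming a
	 * 1:1 msix-to-EQ mapping.
	 */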

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f4225441 28-Nov-2023 Colin Ian King <colin.i.king@gmail.com>

net: mana: Fix spelling mistake "enforecement" -> "enforcement"

There is a spelling mistake in struct field hc_tx_err_sqpdid_enforecement.
Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Link: https://lore.kernel.org/r/20231128095304.515492-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 7cc9e6d7 26-Nov-2023 Jakub Kicinski <kuba@kernel.org>

eth: link netdev to page_pools in drivers

Link page pool instances to netdev for the drivers which
already link to NAPI. Unless the driver is doing something
very weird, per-NAPI should imply per-netdev.

Add netsec as well, Ilias indicates that it fits the mold.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# e1df5202 24-Nov-2023 Shradha Gupta <shradhagupta@linux.microsoft.com>

net: mana: Add remaining GDMA stats for MANA to ethtool

Extend the performance counter stats in 'ethtool -S <interface>'
for MANA VF to include all GDMA stat counters.

Tested-on: Ubuntu22
Testcases:
1. LISA testcase:
PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-Synthetic
2. LISA testcase:
PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-SRIOV

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Link: https://lore.kernel.org/r/1700830950-803-1-git-send-email-shradhagupta@linux.microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# f5247a6e 27-Oct-2023 Konstantin Taranov <kotaranov@microsoft.com>

net: mana: Use xdp_set_features_flag instead of direct assignment

This patch uses a helper function for assignment of xdp_features.
This change simplifies backports.
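
Illustrative use of the helper (the exact feature flags advertised by the
driver are an assumption here):

	xdp_set_features_flag(ndev, NETDEV_XDP_ACT_BASIC |
				    NETDEV_XDP_ACT_REDIRECT |
				    NETDEV_XDP_ACT_NDO_XMIT);

instead of assigning ndev->xdp_features directly.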

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1698430011-21562-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# a43e8e9f 29-Sep-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Fix oversized sge0 for GSO packets

Handle the case where the GSO SKB linear length is too large.

The MANA NIC requires GSO packets to put only the header part in SGE0,
otherwise the TX queue may stop at the HW level.

So, use 2 SGEs when the skb linear part contains more than the
packet header.
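
A sketch of the split (variable names are illustrative; gso_hs stands for
the GSO header size already computed for TSO):

	if (skb_headlen(skb) > gso_hs) {
		/* SGE0 carries only the headers ... */
		sge0_len = gso_hs;
		/* ... and SGE1 carries the rest of the linear data. */
		sge1_len = skb_headlen(skb) - gso_hs;
	} else {
		sge0_len = skb_headlen(skb);
	}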

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 7a54de92 29-Sep-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Fix the tso_bytes calculation

sizeof(struct hop_jumbo_hdr) is not part of tso_bytes, so remove
the subtraction from the header size.

Cc: stable@vger.kernel.org
Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# b2b00006 29-Sep-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Fix TX CQE error handling

For an unknown TX CQE error type (probably from newer hardware),
still free the SKB, update the queue tail, etc.; otherwise the
accounting will be wrong.

Also, TX errors can be triggered by injecting corrupted packets, so
replace the WARN_ONCE with rate-limited error logging.
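
Conceptually, the unknown-type case becomes something like the following
(sketch only; variable names are assumptions):

	default:
		/* Unknown CQE error type, possibly from newer hardware.
		 * Log it rate-limited (corrupted packets can trigger TX
		 * errors), but do not return early: the code below still
		 * frees the SKB and advances the queue tail.
		 */
		if (net_ratelimit())
			netdev_err(ndev, "TX: unknown CQE type %d\n",
				   cqe_type);
		break;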

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# ac3899c6 09-Aug-2023 Shradha Gupta <shradhagupta@linux.microsoft.com>

net: mana: Add gdma stats to ethtool output for mana

Extended the performance counter stats in 'ethtool -S <interface>'
for MANA VF to include GDMA TX LSO packets and bytes counters.

Tested-on: Ubuntu22
Testcases:
1. LISA testcase:
PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-Synthetic
2. LISA testcase:
PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-SRIOV
3. Validated the GDMA stat packets and byte counters
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a9ca9f9c 04-Aug-2023 Yunsheng Lin <linyunsheng@huawei.com>

page_pool: split types and declarations from page_pool.h

Split types and pure function declarations from page_pool.h
and add them in page_pool/types.h, so that C sources can
include page_pool.h and headers should generally only include
page_pool/types.h, as suggested by Jakub.
Rename page_pool.h to page_pool/helpers.h to have both in
one place.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-2-aleksander.lobakin@intel.com
[Jakub: change microsoft/mana, fix kdoc paths in Documentation]
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# b1d13f7a 04-Aug-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add page pool for RX buffers

Add a page pool for RX buffers for faster buffer cycling and reduced
CPU usage.

The standard page pool API is used.

With an iperf test using 128 threads, this patch improved the throughput
by 12-15%, and decreased the IRQ-associated CPU's usage from 99-100% to
10-50%.
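
A rough sketch of the standard page pool API as used per RX queue
(parameter values and field names here are illustrative):

	struct page_pool_params pprm = {
		.pool_size = rx_queue_size,
		.nid	   = NUMA_NO_NODE,
	};

	rxq->page_pool = page_pool_create(&pprm);

	/* RX buffer refill */
	page = page_pool_dev_alloc_pages(rxq->page_pool);

	/* recycle from the same NAPI context on drop/reuse */
	page_pool_recycle_direct(rxq->page_pool, page);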

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 92272ec4 02-Aug-2023 Jakub Kicinski <kuba@kernel.org>

eth: add missing xdp.h includes in drivers

A handful of drivers currently expect to get xdp.h by virtue
of including netdevice.h. This will soon no longer be the case,
so add explicit includes.

Reviewed-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230803010230.1755386-2-kuba@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>


# da4e8648 17-Jul-2023 Long Li <longli@microsoft.com>

net: mana: Batch ringing RX queue doorbell on receiving packets

It's inefficient to ring the doorbell page every time a WQE is posted to
the receive queue. Excessive MMIO writes result in the CPU spending more
time waiting on LOCK instructions (atomic operations), resulting in
poor scaling performance.

Move the code that rings the doorbell page to after all WQEs have been
posted to the receive queue during a callback from napi_poll().

With this change, tests showed an improvement from 120G/s to 160G/s on a
200G physical link, with 16 or 32 hardware queues.

Tests showed no regression in network latency benchmarks on single
connection.
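
In pseudocode, the change is (helper names are assumptions, not the
actual driver functions):

	/* post all refill WQEs for this NAPI cycle first ... */
	while (more_buffers_to_post(rxq))
		post_rx_wqe_no_doorbell(rxq);

	/* ... then issue a single MMIO doorbell write for the batch */
	ring_rx_doorbell(rxq);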

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1689622539-5334-2-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# a7dfeda6 09-Aug-2023 Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>

net: mana: Fix MANA VF unload when hardware is unresponsive

When unloading the MANA driver, mana_dealloc_queues() waits for the MANA
hardware to complete any inflight packets and set the pending send count
to zero. But if the hardware has failed, mana_dealloc_queues()
could wait forever.

Fix this by adding a timeout to the wait. Set the timeout to 120 seconds,
which is a somewhat arbitrary value that is more than long enough for
functional hardware to complete any sends.

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Link: https://lore.kernel.org/r/1691576525-24271-1-git-send-email-schakrabarti@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# b803d1fd 09-Jun-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add support for vlan tagging

To support vlan, use MANA_LONG_PKT_FMT if a vlan tag is present in the
TX skb, then extract the vlan tag from the skb struct and save it to
tx_oob for the NIC to transmit. Vlan tags in the payload are accepted
by the NIC too.

For RX, extract the vlan tag from CQE and put it into skb.
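
Illustrative sketch of both directions (the tx_oob/CQE field names are
assumptions; the kernel helpers are real):

	/* TX */
	if (skb_vlan_tag_present(skb)) {
		pkt_fmt = MANA_LONG_PKT_FMT;
		tx_oob->vlan_tag = skb_vlan_tag_get(skb);   /* illustrative */
	}

	/* RX */
	if (cqe->vlan_present)                              /* illustrative */
		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q),
				       cqe->vlan_tci);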

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 21453285 14-May-2023 Long Li <longli@microsoft.com>

RDMA/mana_ib: Use v2 version of cfg_rx_steer_req to enable RX coalescing

With RX coalescing, one CQE entry can be used to indicate multiple packets
on the receive queue. This saves processing time and PCI bandwidth over
the CQ.

The MANA Ethernet driver also uses the v2 version of the protocol. It
doesn't use RX coalescing and its behavior is not changed.

Link: https://lore.kernel.org/r/1684045095-31228-1-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


# 1919b39f 26-May-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Fix perf regression: remove rx_cqes, tx_cqes counters

The apc->eth_stats.rx_cqes is one per NIC (vport), and it's on the
frequent and parallel code path of all queues. So, r/w into this
single shared variable by many threads on different CPUs creates a
lot caching and memory overhead, hence perf regression. And, it's
not accurate due to the high volume concurrent r/w.

For example, with a workload of iperf with 128 threads and RPS
enabled, we saw a perf regression of 25% with the previous patch
that added the counters. This patch eliminates that regression.

Since the error path of mana_poll_rx_cq() already has warnings,
keeping the counter and converting it to a per-queue variable is not
necessary. So, just remove this counter from this high-frequency
code path.

Also, remove the tx_cqes counter for the same reason. We have
warnings & other counters for errors on that path, and don't need
to count every normal CQE processed.

Cc: stable@vger.kernel.org
Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/1685115537-31675-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# df18f2da 21-Apr-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Check if netdev/napi_alloc_frag returns single page

netdev/napi_alloc_frag() may fall back to a single page, which can be
smaller than the requested size.
Add error checking to avoid memory being overwritten.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 5c74064f 21-Apr-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Rename mana_refill_rxoob and remove some empty lines

Rename mana_refill_rxoob for naming consistency, and remove some
empty lines between function calls and error checking.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 80f6215b 12-Apr-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add support for jumbo frame

During probe, get the hardware-allowed max MTU by querying the device
configuration. Users can select MTU up to the device limit.
When XDP is in use, limit MTU settings so the buffer size is within
one page. And when the MTU is set to a value that is too large, XDP is
not allowed to run.
Also, to prevent an MTU change from failing and leaving the NIC in a
bad state, pre-allocate all buffers before starting the change. In a
low-memory condition, the change returns an error without affecting
the NIC.
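
A sketch of the resulting ndo_change_mtu flow (the mana_pre_* helper
names are assumptions used for illustration):

	static int mana_change_mtu(struct net_device *ndev, int new_mtu)
	{
		struct mana_port_context *apc = netdev_priv(ndev);
		int err;

		/* Allocate buffers for the new size first, so an -ENOMEM
		 * here leaves the running NIC untouched.
		 */
		err = mana_pre_alloc_rxbufs(apc, new_mtu);
		if (err)
			return err;

		err = mana_detach(ndev, false);
		if (err)
			goto out;

		ndev->mtu = new_mtu;
		err = mana_attach(ndev);
	out:
		mana_pre_dealloc_rxbufs(apc);	/* free any leftovers */
		return err;
	}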

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2fbbd712 12-Apr-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Enable RX path to handle various MTU sizes

Update RX data path to allocate and use RX queue DMA buffers with
proper size based on potentially various MTU sizes.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a2917b23 12-Apr-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Refactor RX buffer allocation code to prepare for various MTU

Move out common buffer allocation code from mana_process_rx_cqe() and
mana_alloc_rx_wqe() to helper functions.
Refactor related variables so they can be changed in one place, and buffer
sizes are in sync.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ce518bc3 12-Apr-2023 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Use napi_build_skb in RX path

Use napi_build_skb() instead of build_skb() to take advantage of the
NAPI percpu caches to obtain skbuff_head.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# bd7fc6e1 15-Mar-2023 Shradha Gupta <shradhagupta@linux.microsoft.com>

net: mana: Add new MANA VF performance counters for easier troubleshooting

Extended performance counter stats in 'ethtool -S <interface>' output
for MANA VF to facilitate troubleshooting.

Tested-on: Ubuntu22
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 66c0e13a 01-Feb-2023 Marek Majtyka <alardam@gmail.com>

drivers: net: turn on XDP features

A summary of the flags being set for various drivers is given below.
Note that XDP_F_REDIRECT_TARGET and XDP_F_FRAG_TARGET are features
that can be turned off and on at runtime. This means that these flags
may be set and unset under RTNL lock protection by the driver. Hence,
READ_ONCE must be used by code loading the flag value.

Also, these flags are not used for synchronization against the availability
of XDP resources on a device. It is merely a hint, and hence the read
may race with the actual teardown of XDP resources on the device. This
may change in the future, e.g. operations taking a reference on the XDP
resources of the driver, and in turn inhibiting turning off this flag.
However, for now, it can only be used as a hint to check whether device
supports becoming a redirection target.

Turn 'hw-offload' feature flag on for:
- netronome (nfp)
- netdevsim.

Turn 'native' and 'zerocopy' features flags on for:
- intel (i40e, ice, ixgbe, igc)
- mellanox (mlx5).
- stmmac
- netronome (nfp)

Turn 'native' features flags on for:
- amazon (ena)
- broadcom (bnxt)
- freescale (dpaa, dpaa2, enetc)
- funeth
- intel (igb)
- marvell (mvneta, mvpp2, octeontx2)
- mellanox (mlx4)
- mtk_eth_soc
- qlogic (qede)
- sfc
- socionext (netsec)
- ti (cpsw)
- tap
- tsnep
- veth
- xen
- virtio_net.

Turn 'basic' (tx, pass, aborted and drop) features flags on for:
- netronome (nfp)
- cavium (thunder)
- hyperv.

Turn 'redirect_target' feature flag on for:
- amazon (ena)
- broadcom (bnxt)
- freescale (dpaa, dpaa2)
- intel (i40e, ice, igb, ixgbe)
- ti (cpsw)
- marvell (mvneta, mvpp2)
- sfc
- socionext (netsec)
- qlogic (qede)
- mellanox (mlx5)
- tap
- veth
- virtio_net
- xen

Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Link: https://lore.kernel.org/r/3eca9fafb308462f7edb1f58e451d59209aa07eb.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 0c9ef08a 08-Nov-2022 Nathan Huckleberry <nhuck@google.com>

net: mana: Fix return type of mana_start_xmit()

The ndo_start_xmit field in net_device_ops is expected to be of type
netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev).

The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition. A new
warning in clang will catch this at compile time:

drivers/net/ethernet/microsoft/mana/mana_en.c:382:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = mana_start_xmit,
^~~~~~~~~~~~~~~
1 error generated.

The return type of mana_start_xmit should be changed from int to
netdev_tx_t.
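
The fix is just the prototype change, e.g.:

	static netdev_tx_t mana_start_xmit(struct sk_buff *skb,
					   struct net_device *ndev)
	{
		/* ... unchanged body ... */
		return NETDEV_TX_OK;	/* or NETDEV_TX_BUSY */
	}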

Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
[nathan: Rebase on net-next and resolve conflicts
Add note about new clang warning]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20221109002629.1446680-1-nathan@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 28c66cfa 03-Nov-2022 Ajay Sharma <sharmaajay@microsoft.com>

net: mana: Define data structures for protection domain and memory registration

The MANA hardware supports protection domains and memory registration for use
in an RDMA environment. Add those definitions and expose them for use by the
RDMA driver.

Signed-off-by: Ajay Sharma <sharmaajay@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-12-git-send-email-longli@linuxonhyperv.com
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>


# aa565497 03-Nov-2022 Long Li <longli@microsoft.com>

net: mana: Define max values for SGL entries

The maximum number of SGL entries should be computed from the maximum
WQE size for the intended queue type and the corresponding OOB data
size. This guarantees the hardware queue can successfully queue requests
up to the queue depth exposed to the upper layer.

Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-9-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>


# fd325cd6 03-Nov-2022 Long Li <longli@microsoft.com>

net: mana: Move header files to a common location

In preparation to add MANA RDMA driver, move all the required header files
to a common location for use by both Ethernet and RDMA drivers.

Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-8-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>


# d44089e5 03-Nov-2022 Long Li <longli@microsoft.com>

net: mana: Record port number in netdev

The port number is useful for a user-mode application to identify this
net device based on the port index. Set it to the correct value in ndev.

Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-7-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>


# 4c0ff7a1 03-Nov-2022 Long Li <longli@microsoft.com>

net: mana: Export Work Queue functions for use by RDMA driver

The RDMA device may need to create Ethernet device queues for use by Queue
Pair type RAW. This allows a user-mode context to access Ethernet hardware
queues. Export the supporting functions for use by the RDMA driver.

Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-6-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>


# b5c1c985 03-Nov-2022 Long Li <longli@microsoft.com>

net: mana: Handle vport sharing between devices

For outgoing packets, the PF requires the VF to configure the vport with
the corresponding protection domain and doorbell ID for the kernel or user
context. The vport can't be shared between different contexts.

Implement the logic to exclusively take over the vport by either the
Ethernet device or the RDMA device.

Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-4-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>


# a69839d4 03-Nov-2022 Long Li <longli@microsoft.com>

net: mana: Add support for auxiliary device

In preparation for supporting the MANA RDMA driver, add support for an
auxiliary device in the Ethernet driver. The RDMA device is modeled as an
auxiliary device to the Ethernet device.
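
Roughly, the Ethernet driver registers the RDMA side like this (the
device name/id and surrounding structures are illustrative):

	adev->id = 1;
	adev->name = "rdma";
	adev->dev.parent = gc->dev;
	adev->dev.release = adev_release;

	err = auxiliary_device_init(adev);
	if (!err) {
		err = auxiliary_device_add(adev);
		if (err)
			auxiliary_device_uninit(adev);
	}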

Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-2-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>


# 068c38ad 26-Oct-2022 Thomas Gleixner <tglx@linutronix.de>

net: Remove the obsolete u64_stats_fetch_*_irq() users (drivers).

Now that the 32bit UP oddity is gone and 32bit always uses a sequence
count, there is no need for the fetch_irq() variants anymore.

Convert to the regular interface.
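
The conversion is mechanical, e.g. (struct/field names illustrative):

	do {
		start = u64_stats_fetch_begin(&rx_stats->syncp);
		packets = rx_stats->packets;
		bytes = rx_stats->bytes;
	} while (u64_stats_fetch_retry(&rx_stats->syncp, start));

replacing the u64_stats_fetch_begin_irq()/u64_stats_fetch_retry_irq()
pair.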

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 18010ff7 02-Dec-2022 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Fix race on per-CQ variable napi work_done

After calling napi_complete_done(), the NAPIF_STATE_SCHED bit may be
cleared, and another CPU can start the napi thread and access the per-CQ
variable cq->work_done. If the other thread (for example, from busy_poll)
sets it to a value >= budget, this thread will continue to run when it
should stop, causing memory corruption and a panic.

To fix this issue, save the per-CQ work_done variable in a local variable
before napi_complete_done(), so it won't be corrupted by a possible
concurrent thread after napi_complete_done().

Also, add a flag bit to advertise to the NIC firmware: the NAPI work_done
variable race is fixed, so the driver is able to reliably support features
like busy_poll.
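
The core of the fix, as a sketch (the CQ re-arm helper is a placeholder,
not the driver's actual function name):

	int w = cq->work_done;	/* local snapshot */

	if (w < budget && napi_complete_done(napi, w)) {
		/* Re-arm the CQ interrupt. Only the local copy 'w' is
		 * used from here on, so a concurrent poller updating
		 * cq->work_done can no longer make this thread overrun
		 * its budget.
		 */
		arm_cq(cq);	/* placeholder */
	}

	return w;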

Cc: stable@vger.kernel.org
Fixes: e1b5683ff62e ("net: mana: Move NAPI from EQ to CQ")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1670010190-28595-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 7a8938cd 14-Jun-2022 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add support of XDP_REDIRECT action

Add a handler for the XDP_REDIRECT return code from an XDP program. The
packets are flushed at the end of each RX/CQ NAPI poll cycle.
ndo_xdp_xmit() is implemented by sharing the code in mana_xdp_tx().
Ethtool per-queue counters are added for XDP redirect and xmit operations.
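
In outline (sketch; the per-queue flush flag is an assumption):

	case XDP_REDIRECT:
		if (xdp_do_redirect(ndev, &xdp, prog))
			break;			/* failed: drop the buffer */
		rxq->xdp_flush = true;		/* flush later, once */
		return;				/* buffer now owned by target */

	/* ... and once at the end of the RX/CQ NAPI poll cycle: */
	if (rxq->xdp_flush)
		xdp_do_flush();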

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 1566e7d6 14-Jun-2022 Dexuan Cui <decui@microsoft.com>

net: mana: Add the Linux MANA PF driver

This minimal PF driver runs on bare metal.
Currently Ethernet TX/RX works. SR-IOV management is not supported yet.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Co-developed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# b707b89f 06-May-2022 Jakub Kicinski <kuba@kernel.org>

eth: switch to netif_napi_add_weight()

Switch all Ethernet drivers which use custom napi weights
to the new API.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 16d083e2 04-May-2022 Jakub Kicinski <kuba@kernel.org>

net: switch to netif_napi_add_tx()

Switch net callers to the new API not requiring
the NAPI_POLL_WEIGHT argument.
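
For a TX completion queue this is simply, e.g.:

	netif_napi_add_tx(ndev, &cq->napi, mana_poll);

which drops the explicit NAPI_POLL_WEIGHT argument of the older helper
(the queue/field names above are illustrative).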

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20220504163725.550782-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 68f83135 04-Feb-2022 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Remove unnecessary check of cqe_type in mana_process_rx_cqe()

The switch statement already ensures cqe_type == CQE_RX_OKAY at that
point.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e4b76219 04-Feb-2022 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add handling of CQE_RX_TRUNCATED

The proper way to drop this kind of CQE is to advance the rxq tail
without indicating the packet to the upper network layer.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a6bf5703 28-Jan-2022 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Reuse XDP dropped page

Reuse the dropped page in the RX path to save page allocation
overhead.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d356abb9 28-Jan-2022 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add counter for XDP_TX

This counter will show up in the ethtool stats. It is the
number of packets received and forwarded by the XDP program.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f90f8420 28-Jan-2022 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add counter for packet dropped by XDP

This counter will show up in the ethtool stats data.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3b80b73a 29-Dec-2021 Jakub Kicinski <kuba@kernel.org>

net: Add includes masked by netdevice.h including uapi/bpf.h

Add missing includes unmasked by the subsequent change.

Mostly network drivers missing an include for XDP_PACKET_HEADROOM.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211230012742.770642-2-kuba@kernel.org


# 6cc74443 15-Dec-2021 Dexuan Cui <decui@microsoft.com>

net: mana: Add RX fencing

RX fencing allows the driver to know that any prior change to the RQs has
finished, e.g. when the RQs are disabled/enabled or the hashkey/indirection
table is changed. In those cases, RX fencing is required.

Remove the previous workaround "ssleep(1)" and add the real support for
RX fencing as the PF driver supports the MANA_FENCE_RQ request now (any
old PF driver not supporting the request won't be used in production).

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/20211216001748.8751-1-decui@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# ed5356b5 19-Nov-2021 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add XDP support

Add support for XDP in the MANA driver.

Supported XDP actions:
XDP_PASS, XDP_TX, XDP_DROP, XDP_ABORTED

XDP actions not yet supported:
XDP_REDIRECT

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 635096a8 29-Oct-2021 Dexuan Cui <decui@microsoft.com>

net: mana: Support hibernation and kexec

Implement the suspend/resume/shutdown callbacks for hibernation/kexec.

Add mana_gd_setup() and mana_gd_cleanup() for some common code, and
use them in the mana_gd_* callbacks.

Reuse mana_probe/remove() for the hibernation path.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6c7ea696 29-Oct-2021 Dexuan Cui <decui@microsoft.com>

net: mana: Fix the netdev_err()'s vPort argument in mana_init_port()

Use the correct port index rather than 0.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a137c069 25-Oct-2021 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Allow setting the number of queues while the NIC is down

The existing code doesn't allow setting the number of queues while the
NIC is down.

Update the ethtool handler functions to support setting the number of
queues while the NIC is in the down state.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f3956ebb 01-Oct-2021 Jakub Kicinski <kuba@kernel.org>

ethernet: use eth_hw_addr_set() instead of ether_addr_copy()

Convert Ethernet from ether_addr_copy() to eth_hw_addr_set():

@@
expression dev, np;
@@
- ether_addr_copy(dev->dev_addr, np)
+ eth_hw_addr_set(dev, np)

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# be049936 08-Oct-2021 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Fix error handling in mana_create_rxq()

Fix error handling in mana_create_rxq() when
cq->gdma_id >= gc->max_num_cqs.

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1633698691-31721-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 1e2d0824 24-Aug-2021 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Add support for EQ sharing

The existing code uses (1 + #vPorts * #Queues) MSIXs, which may exceed
the device limit.

Support EQ sharing, so that multiple vPorts (NICs) can share the same
set of MSIXs.

And, report the EQ-sharing capability bit to the host, which means the
host can potentially offer more vPorts and queues to the VM.

Also update the resource limit checking and error handling for better
robustness.

Now, we support up to 256 virtual ports per VF (it was 16/VF), and
support up to 64 queues per vPort (it was 16).

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e1b5683f 24-Aug-2021 Haiyang Zhang <haiyangz@microsoft.com>

net: mana: Move NAPI from EQ to CQ

The existing code has NAPI threads polling on EQ directly. To prepare
for EQ sharing among vPorts, move NAPI from EQ to CQ so that one EQ
can serve multiple CQs from different vPorts.

The "arm bit" is only set when CQ processing is completed to reduce
the number of EQ entries, which in turn reduce the number of interrupts
on EQ.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b9078845 20-Jun-2021 Christophe JAILLET <christophe.jaillet@wanadoo.fr>

net: mana: Fix a memory leak in an error handling path in 'mana_create_txq()'

If this test fails, we must free some resources, as in all the other
error handling paths of this function.

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ea89c862 13-May-2021 Gustavo A. R. Silva <gustavoars@kernel.org>

net: mana: Use struct_size() in kzalloc()

Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows
that, in the worst scenario, could lead to heap overflows.
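
For example (struct and member names are illustrative):

	/* open-coded, overflow-prone: */
	req = kzalloc(sizeof(*req) + num_entries * sizeof(req->entries[0]),
		      GFP_KERNEL);

	/* with the helper, which saturates instead of overflowing: */
	req = kzalloc(struct_size(req, entries, num_entries), GFP_KERNEL);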

This code was detected with the help of Coccinelle, and audited and
fixed manually.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d90a9468 22-Apr-2021 Dexuan Cui <decui@microsoft.com>

net: mana: Use int to check the return value of mana_gd_poll_cq()

mana_gd_poll_cq() may return -1 if an overflow error is detected (this
should never happen unless there is a bug in the driver or the hardware).

Fix the type of the variable "comp_read" by using int rather than u32.
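
Sketch of the corrected usage (argument names are illustrative):

	int comp_read;	/* was u32; must be able to hold -1 */

	comp_read = mana_gd_poll_cq(cq, comps, num_comps);
	if (comp_read < 1)
		return;	/* nothing polled, or overflow error (-1) */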

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ca9c54d2 16-Apr-2021 Dexuan Cui <decui@microsoft.com>

net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)

Add a VF driver for Microsoft Azure Network Adapter (MANA) that will be
available in the future.

Co-developed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Co-developed-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>