History log of /linux-master/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c
Revision Date Author Comments
# ffb33221 24-May-2023 Wei Fang <wei.fang@nxp.com>

net: stmmac: fix call trace when stmmac_xdp_xmit() is invoked

We encountered a kernel call trace issue which was related to
ndo_xdp_xmit callback on our i.MX8MP platform. The reproduce
steps show as follows.
1. The FEC port (eth0) connects to a PC port, and the PC uses
pktgen_sample03_burst_single_flow.sh to generate packets and
send these packets to the FEC port. Notice that the script must
be executed before step 2.
2. Run the "./xdp_redirect eth0 eth1" command on i.MX8MP, the
eth1 interface is the dwmac. Then there will be a call trace
issue soon. Please see the log for more details.
The root cause is that the NETDEV_XDP_ACT_NDO_XMIT feature is
enabled by default, so when the step 2 command is exexcuted
and packets have already been sent to eth0, the stmmac_xdp_xmit()
starts running before the stmmac_xdp_set_prog() finishes. To
resolve this issue, we disable the NETDEV_XDP_ACT_NDO_XMIT
feature by default and turn on/off this feature when the bpf
program is installed/uninstalled which just like the other
ethernet drivers.

Call Trace log:
[ 306.311271] ------------[ cut here ]------------
[ 306.315910] WARNING: CPU: 0 PID: 15 at lib/timerqueue.c:55 timerqueue_del+0x68/0x70
[ 306.323590] Modules linked in:
[ 306.326654] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 6.4.0-rc1+ #37
[ 306.333277] Hardware name: NXP i.MX8MPlus EVK board (DT)
[ 306.338591] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 306.345561] pc : timerqueue_del+0x68/0x70
[ 306.349577] lr : __remove_hrtimer+0x5c/0xa0
[ 306.353777] sp : ffff80000b7c3920
[ 306.357094] x29: ffff80000b7c3920 x28: 0000000000000000 x27: 0000000000000001
[ 306.364244] x26: ffff80000a763a40 x25: ffff0000d0285a00 x24: 0000000000000001
[ 306.371390] x23: 0000000000000001 x22: ffff000179389a40 x21: 0000000000000000
[ 306.378537] x20: ffff000179389aa0 x19: ffff0000d2951308 x18: 0000000000001000
[ 306.385686] x17: f1d3000000000000 x16: 00000000c39c1000 x15: 55e99bbe00001a00
[ 306.392835] x14: 09000900120aa8c0 x13: e49af1d300000000 x12: 000000000000c39c
[ 306.399987] x11: 100055e99bbe0000 x10: ffff8000090b1048 x9 : ffff8000081603fc
[ 306.407133] x8 : 000000000000003c x7 : 000000000000003c x6 : 0000000000000001
[ 306.414284] x5 : ffff0000d2950980 x4 : 0000000000000000 x3 : 0000000000000000
[ 306.421432] x2 : 0000000000000001 x1 : ffff0000d2951308 x0 : ffff0000d2951308
[ 306.428585] Call trace:
[ 306.431035] timerqueue_del+0x68/0x70
[ 306.434706] __remove_hrtimer+0x5c/0xa0
[ 306.438549] hrtimer_start_range_ns+0x2bc/0x370
[ 306.443089] stmmac_xdp_xmit+0x174/0x1b0
[ 306.447021] bq_xmit_all+0x194/0x4b0
[ 306.450612] __dev_flush+0x4c/0x98
[ 306.454024] xdp_do_flush+0x18/0x38
[ 306.457522] fec_enet_rx_napi+0x6c8/0xc68
[ 306.461539] __napi_poll+0x40/0x220
[ 306.465038] net_rx_action+0xf8/0x240
[ 306.468707] __do_softirq+0x128/0x3a8
[ 306.472378] run_ksoftirqd+0x40/0x58
[ 306.475961] smpboot_thread_fn+0x1c4/0x288
[ 306.480068] kthread+0x124/0x138
[ 306.483305] ret_from_fork+0x10/0x20
[ 306.486889] ---[ end trace 0000000000000000 ]---

Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230524125714.357337-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# ac746c85 11-Nov-2021 Ong Boon Leong <boon.leong.ong@intel.com>

net: stmmac: enhance XDP ZC driver level switching performance

The previous stmmac_xdp_set_prog() implementation uses stmmac_release()
and stmmac_open() which tear down the PHY device and causes undesirable
autonegotiation which causes a delay whenever AFXDP ZC is setup.

This patch introduces two new functions that just sufficiently tear
down DMA descriptors, buffer, NAPI process, and IRQs and reestablish
them accordingly in both stmmac_xdp_release() and stammac_xdp_open().

As the results of this enhancement, we get rid of transient state
introduced by the link auto-negotiation:

$ ./xdpsock -i eth0 -t -z

sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 634444 634560

sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 632330 1267072

sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 632438 1899584

sock0@eth0:0 txonly xdp-drv
pps pkts 1.00
rx 0 0
tx 632502 2532160

Reported-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a6451192 24-Aug-2021 Song Yoong Siang <yoong.siang.song@intel.com>

net: stmmac: fix kernel panic due to NULL pointer dereference of xsk_pool

After free xsk_pool, there is possibility that napi polling is still
running in the middle, thus causes a kernel crash due to kernel NULL
pointer dereference of rx_q->xsk_pool and tx_q->xsk_pool.

Fix this by changing the XDP pool setup sequence to:
1. disable napi before free xsk_pool
2. enable napi after init xsk_pool

The following kernel panic is observed without this patch:

RIP: 0010:xsk_uses_need_wakeup+0x5/0x10
Call Trace:
stmmac_napi_poll_rxtx+0x3a9/0xae0 [stmmac]
__napi_poll+0x27/0x130
net_rx_action+0x233/0x280
__do_softirq+0xe2/0x2b6
run_ksoftirqd+0x1a/0x20
smpboot_thread_fn+0xac/0x140
? sort_range+0x20/0x20
kthread+0x124/0x150
? set_kthread_struct+0x40/0x40
ret_from_fork+0x1f/0x30
---[ end trace a77c8956b79ac107 ]---

Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 132c32ee 13-Apr-2021 Ong Boon Leong <boon.leong.ong@intel.com>

net: stmmac: Add TX via XDP zero-copy socket

We add the support of XDP ZC TX submission and cleaning into
stmmac_tx_clean(). The function is made to clean as many TX complete
frames as possible, i.e. limit by priv->dma_tx_size instead of NAPI
budget. For TX ring that is associated with XSK pool, the function
stmmac_xdp_xmit_zc() is introduced to TX frame buffers from XSK pool by
using xsk_tx_peek_desc(). To make stmmac_tx_clean() support the cleaning
of XSK TX frames, STMMAC_TXBUF_T_XSK_TX TX buffer type is introduced.

As stmmac_tx_clean() uses the return value to cue whether NAPI function
should continue to poll, we augment the caller of stmmac_tx_clean() to
pass NAPI budget instead of priv->dma_tx_size through 'budget' input and
made stmmac_tx_clean() to always clean up-to the TX ring size instead.
This allows us to use the return boolean status of stmmac_xdp_xmit_zc()
to decide if XSK TX work is done or not: If true, set 'xmits' to return
'budget - 1' so that NAPI poll may exit. Else, set 'xmits' to return
'budget' to make NAPI poll continue to poll since XSK TX work is not
done. Finally, at the end of stmmac_tx_clean(), the function now take
a maximum value between 'count' and 'xmits' so that status from both
TX cleaning and XSK TX (only for XDP ZC) is considered.

This patch adds a new NAPI poll called stmmac_napi_poll_rxtx() that is
meant to be enabled/disabled for RX and TX ring that are bound to XSK
pool. This NAPI poll function starts with cleaning TX ring, then submits
XSK TX frames to TX ring before proceed to perform RX operations, i.e.
, receiving RX frames and replenishing RX ring with RX free buffers
obtained from XSK pool. Therefore, during XSK RX and TX setup, the driver
enables stmmac_napi_poll_rxtx() for RX and TX operations, then during
XSK RX and TX pool tear-down, the driver reenables the exisiting
independent NAPI poll functions accordingly: stmmac_napi_poll_rx() and
stmmac_napi_poll_tx().

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# bba2556e 13-Apr-2021 Ong Boon Leong <boon.leong.ong@intel.com>

net: stmmac: Enable RX via AF_XDP zero-copy

This patch adds the support for receiving packet via AF_XDP zero-copy
mechanism.

XDP ZC uses 1:1 mapping of XDP buffer to receive packet, therefore the
use of split header is not used currently. The 'xdp_buff' is declared as
union together with a struct that contains 'page', 'addr' and
'page_offset' that are associated with primary buffer.

RX buffers are now allocated either via page_pool or xsk pool. For RX
buffers from xsk_pool they are allocated and deallocated using below
functions:

* stmmac_alloc_rx_buffers_zc(struct stmmac_priv *priv, u32 queue)
* dma_free_rx_xskbufs(struct stmmac_priv *priv, u32 queue)

With above functions now available, we then extend the following driver
functions to support XDP ZC:
* stmmac_reinit_rx_buffers()
* __init_dma_rx_desc_rings()
* init_dma_rx_desc_rings()
* __free_dma_rx_desc_resources()

Note: stmmac_alloc_rx_buffers_zc() may return -ENOMEM due to RX XDP
buffer pool is not allocated (e.g. samples/bpf/xdpsock TX-only). But,
it is still ok to let TX XDP ZC to continue, therefore, the -ENOMEM
is silently ignored to let the driver succcessfully transition to XDP
ZC mode for the said RX and TX queue.

As XDP ZC buffer size is different, the DMA buffer size is required
to be reprogrammed accordingly for RX DMA/Queue that is populated with
XDP buffer from XSK pool.

Next, to add or remove per-queue XSK pool, stmmac_xdp_setup_pool()
will call stmmac_xdp_enable_pool() or stmmac_xdp_disable_pool()
that in-turn coordinates the tearing down and setting up RX ring via
RX buffers and descriptors removal and reallocation through
stmmac_disable_rx_queue() and stmmac_enable_rx_queue(). In addition,
stmmac_xsk_wakeup() is added to initiate XDP RX buffer replenishing
by signalling user application to add available XDP frames back to
FILL queue.

For RX processing using XDP zero-copy buffer, stmmac_rx_zc() is
introduced which is implemented with the assumption that RX split
header is disabled. For XDP verdict is XDP_PASS, the XDP buffer is
copied into a sk_buff allocated through stmmac_construct_skb_zc()
and sent to Linux network GRO inside stmmac_dispatch_skb_zc(). Free RX
buffers are then replenished using stmmac_rx_refill_zc()

v2: introduce __stmmac_disable_all_queues() to contain the original code
that does napi_disable() and then make stmmac_setup_tc_block_cb()
to use it. Move synchronize_rcu() into stmmac_disable_all_queues()
that eventually calls __stmmac_disable_all_queues(). Then,
make both stmmac_release() and stmmac_suspend() to use
stmmac_disable_all_queues(). Thanks David Miller for spotting the
synchronize_rcu() issue in v1 patch.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5fabb012 31-Mar-2021 Ong Boon Leong <boon.leong.ong@intel.com>

net: stmmac: Add initial XDP support

This patch adds the initial XDP support to stmmac driver. It supports
XDP_PASS, XDP_DROP and XDP_ABORTED actions. Upcoming patches will add
support for XDP_TX and XDP_REDIRECT.

To support XDP headroom, this patch adds page_offset into RX buffer and
change the dma_sync_single_for_device|cpu(). The DMA address used for
RX operation are changed to take into page_offset too. As page_pool
can handle dma_sync_single_for_device() on behalf of driver with
PP_FLAG_DMA_SYNC_DEV flag, we skip doing that in stmmac driver.

Current stmmac driver supports split header support (SPH) in RX but
the flexibility of splitting header and payload at different position
makes it very complex to be supported for XDP processing. In addition,
jumbo frame is not supported in XDP to keep the initial codes simple.

This patch has been tested with the sample app "xdp1" located in
samples/bpf directory for both SKB and Native (XDP) mode. The burst
traffic generated using pktgen_sample03_burst_single_flow.sh in
samples/pktgen directory.

Changes in v3:
- factor in xdp header and tail adjustment done by XDP program.
Thanks to Jakub Kicinski for pointing out the gap in v2.

Changes in v2:
- fix for "warning: variable 'len' set but not used" reported by lkp.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>