History log of /linux-master/drivers/net/ethernet/google/gve/gve_tx_dqo.c
Revision Date Author Comments
# f13697cc 22-Jan-2024 Shailend Chand <shailend@google.com>

gve: Switch to config-aware queue allocation

The new config-aware functions will help achieve the goal of being able
to allocate resources for new queues while there already are active
queues serving traffic.

These new functions work off of arbitrary queue allocation configs
rather than just the currently active config in priv, and they return
the newly allocated resources instead of writing them into priv.

Signed-off-by: Shailend Chand <shailend@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240122182632.1102721-4-shailend@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 18de1e51 16-Nov-2023 Eric Dumazet <edumazet@google.com>

gve: add gve_features_check()

It is suboptimal to attempt skb linearization from ndo_start_xmit()
if a gso skb has pathological layout, or if host stack does not have
access to the payload (TCP direct). Linearization of large skbs
can also fail under memory pressure.

We should instead have an ndo_features_check() so that we can
fallback to GSO, which is supported even for TCP direct,
and generally much more efficient (no payload copy).

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bailey Forrest <bcf@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Praveen Kaligineedi <pkaligineedi@google.com>
Cc: Shailend Chand <shailend@google.com>
Cc: Ziwei Xiao <ziweixiao@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a6fb8d5a 04-Aug-2023 Rushil Gupta <rushilg@google.com>

gve: Tx path for DQO-QPL

Each QPL page is divided into GVE_TX_BUFS_PER_PAGE_DQO buffers.
When a packet needs to be transmitted, we break the packet into max
GVE_TX_BUF_SIZE_DQO sized chunks and transmit each chunk using a TX
descriptor.
We allocate the TX buffers from the free list in dqo_tx.
We store these TX buffer indices in an array in the pending_packet
structure.

The TX buffers are returned to the free list in dqo_compl after
receiving packet completion or when removing packets from miss
completions list.

Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a695641c 22-May-2023 Coco Li <lixiaoyan@google.com>

gve: Support IPv6 Big TCP on DQ

Add support for using IPv6 Big TCP on DQ which can handle large TSO/GRO
packets. See https://lwn.net/Articles/895398/. This can improve the
throughput and CPU usage.

Perf test result:
ip -d link show $DEV
gso_max_size 185000 gso_max_segs 65535 tso_max_size 262143 tso_max_segs 65535 gro_max_size 185000

For performance, tested with neper using 9k MTU on hardware that supports 200Gb/s line rate.

In single streams when line rate is not saturated, we expect throughput improvements.
When the networking is performing at line rate, we expect cpu usage improvements.

Tcp_stream (unidirectional stream test, T=thread, F=flow):
skb=180kb, T=1, F=1, no zerocopy: throughput average=64576.88 Mb/s, sender stime=8.3, receiver stime=10.68
skb=64kb, T=1, F=1, no zerocopy: throughput average=64862.54 Mb/s, sender stime=9.96, receiver stime=12.67
skb=180kb, T=1, F=1, yes zerocopy: throughput average=146604.97 Mb/s, sender stime=10.61, receiver stime=5.52
skb=64kb, T=1, F=1, yes zerocopy: throughput average=131357.78 Mb/s, sender stime=12.11, receiver stime=12.25

skb=180kb, T=20, F=100, no zerocopy: throughput average=182411.37 Mb/s, sender stime=41.62, receiver stime=79.4
skb=64kb, T=20, F=100, no zerocopy: throughput average=182892.02 Mb/s, sender stime=57.39, receiver stime=72.69
skb=180kb, T=20, F=100, yes zerocopy: throughput average=182337.65 Mb/s, sender stime=27.94, receiver stime=39.7
skb=64kb, T=20, F=100, yes zerocopy: throughput average=182144.20 Mb/s, sender stime=47.06, receiver stime=39.01

Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230522201552.3585421-1-ziweixiao@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# a5affbd8 17-Nov-2022 Jeroen de Borst <jeroendb@google.com>

gve: Handle alternate miss completions

The virtual NIC has 2 ways of indicating a miss-path
completion. This handles the alternate.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 504148fe 30-Jun-2022 Eric Dumazet <edumazet@google.com>

net: add skb_[inner_]tcp_all_headers helpers

Most drivers use "skb_transport_offset(skb) + tcp_hdrlen(skb)"
to compute headers length for a TCP packet, but others
use more convoluted (but equivalent) ways.

Add skb_tcp_all_headers() and skb_inner_tcp_all_headers()
helpers to harmonize this a bit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 577d7685 29-Jun-2022 Jilin Yuan <yuanjilin@cdjrlc.com>

google/gve:fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629130013.3273-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 1e0083bd 28-Sep-2021 Arnd Bergmann <arnd@arndb.de>

gve: DQO: avoid unused variable warnings

The use of dma_unmap_addr()/dma_unmap_len() in the driver causes
multiple warnings when these macros are defined as empty, e.g.
in an ARCH=i386 allmodconfig build:

drivers/net/ethernet/google/gve/gve_tx_dqo.c: In function 'gve_tx_add_skb_no_copy_dqo':
drivers/net/ethernet/google/gve/gve_tx_dqo.c:494:40: error: unused variable 'buf' [-Werror=unused-variable]
494 | struct gve_tx_dma_buf *buf =

This is not how the NEED_DMA_MAP_STATE macros are meant to work,
as they rely on never using local variables or a temporary structure
like gve_tx_dma_buf.

Remote the gve_tx_dma_buf definition and open-code the contents
in all places to avoid the warning. This causes some rather long
lines but otherwise ends up making the driver slightly smaller.

Fixes: a57e5de476be ("gve: DQO: Add TX path")
Link: https://lore.kernel.org/netdev/20210723231957.1113800-1-bcf@google.com/
Link: https://lore.kernel.org/netdev/20210721151100.2042139-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e8192476 24-Jun-2021 Bailey Forrest <bcf@google.com>

gve: Fix warnings reported for DQO patchset

https://patchwork.kernel.org/project/netdevbpf/list/?series=506637&state=*

- Remove unused variable
- Use correct integer type for string formatting.
- Remove `inline` in C files

Fixes: 9c1a59a2f4bc ("gve: DQO: Add ring allocation and initialization")
Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a57e5de4 24-Jun-2021 Bailey Forrest <bcf@google.com>

gve: DQO: Add TX path

TX SKBs will have their buffers DMA mapped with the device. Each buffer
will have at least one TX descriptor associated. Each SKB will also have
a metadata descriptor.

Each TX queue maintains an array of `gve_tx_pending_packet_dqo` objects.
Every TX SKB will have an associated pending_packet object. A TX SKB's
descriptors will use its pending_packet's index as the completion tag,
which will be returned on the TX completion queue.

The device implements a "flow-miss model". Most packets will simply
receive a packet completion. The flow-miss system may choose to process
a packet based on its contents. A TX packet which experiences a flow
miss would receive a miss completion followed by a later reinjection
completion. The miss-completion is received when the packet starts to be
processed by the flow-miss system and the reinjection completion is
received when the flow-miss system completes processing the packet and
sends it on the wire.

Notable mentions:

- Buffers may be freed after receiving the miss-completion, but in order
to avoid packet reordering, we do not complete the SKB until receiving
the reinjection completion.

- The driver must robustly handle the unlikely scenario where a miss
completion does not have an associated reinjection completion. This is
accomplished by maintaining a list of packets which have a pending
reinjection completion. After a short timeout (5 seconds), the
SKB and buffers are released and the pending_packet is moved to a
second list which has a longer timeout (60 seconds), where the
pending_packet will not be reused. When the longer timeout elapses,
the driver may assume the reinjection completion would never be
received and the pending_packet may be reused.

- Completion handling is triggered by an interrupt and is done in the
NAPI poll function. Because the TX path and completion exist in
different threading contexts they maintain their own lists for free
pending_packet objects. The TX path uses a lock-free approach to steal
the list from the completion path.

- Both the TSO context and general context descriptors have metadata
bytes. The device requires that if multiple descriptors contain the
same field, each descriptor must have the same value set for that
field.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9c1a59a2 24-Jun-2021 Bailey Forrest <bcf@google.com>

gve: DQO: Add ring allocation and initialization

Allocate the buffer and completion ring structures. Do not populate the
rings yet. That will happen in the respective rx and tx datapath
follow-on patches

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5e8c5adf 24-Jun-2021 Bailey Forrest <bcf@google.com>

gve: DQO: Add core netdev features

Add napi netdev device registration, interrupt handling and initial tx
and rx polling stubs. The stubs will be filled in follow-on patches.

Also:
- LRO feature advertisement and handling
- Also update ethtool logic

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>