History log of /linux-master/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h
Revision Date Author Comments
# 88e69af0 07-Sep-2023 Ratheesh Kannoth <rkannoth@marvell.com>

octeontx2-pf: Fix page pool cache index corruption.

The access to page pool `cache' array and the `count' variable
is not locked. Page pool cache access is fine as long as there
is only one consumer per pool.

octeontx2 driver fills in rx buffers from page pool in NAPI context.
If system is stressed and could not allocate buffers, refiiling work
will be delegated to a delayed workqueue. This means that there are
two cosumers to the page pool cache.

Either workqueue or IRQ/NAPI can be run on other CPU. This will lead
to lock less access, hence corruption of cache pool indexes.

To fix this issue, NAPI is rescheduled from workqueue context to refill
rx buffers.

Fixes: b2e3406a38f0 ("octeontx2-pf: Add support for page pool")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 49fa4b0d 23-Aug-2023 Ratheesh Kannoth <rkannoth@marvell.com>

octeontx2-pf: fix page_pool creation fail for rings > 32k

octeontx2 driver calls page_pool_create() during driver probe()
and fails if queue size > 32k. Page pool infra uses these buffers
as shock absorbers for burst traffic. These pages are pinned down
over time as working sets varies, due to the recycling nature
of page pool, given page pool (currently) don't have a shrinker
mechanism, the pages remain pinned down in ptr_ring.
Instead of clamping page_pool size to 32k at
most, limit it even more to 2k to avoid wasting memory.

This have been tested on octeontx2 CN10KA hardware.
TCP and UDP tests using iperf shows no performance regressions.

Fixes: b2e3406a38f0 ("octeontx2-pf: Add support for page pool")
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b2e3406a 21-May-2023 Ratheesh Kannoth <rkannoth@marvell.com>

octeontx2-pf: Add support for page pool

Page pool for each rx queue enhance rx side performance
by reclaiming buffers back to each queue specific pool. DMA
mapping is done only for first allocation of buffers.
As subsequent buffers allocation avoid DMA mapping,
it results in performance improvement.

Image | Performance
------------ | ------------
Vannila | 3Mpps
|
with this | 42Mpps
change |
---------------------------

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://lore.kernel.org/r/20230522020404.152020-1-rkannoth@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# ab6dddd2 13-May-2023 Subbaraya Sundeep <sbhatta@marvell.com>

octeontx2-pf: qos send queues management

Current implementation is such that the number of Send queues (SQs)
are decided on the device probe which is equal to the number of online
cpus. These SQs are allocated and deallocated in interface open and c
lose calls respectively.

This patch defines new APIs for initializing and deinitializing Send
queues dynamically and allocates more number of transmit queues for
QOS feature.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f0dfc4c8 06-Nov-2022 Ratheesh Kannoth <rkannoth@marvell.com>

octeontx2-pf: Fix SQE threshold checking

Current way of checking available SQE count which is based on
HW updated SQB count could result in driver submitting an SQE
even before CQE for the previously transmitted SQE at the same
index is processed in NAPI resulting losing SKB pointers,
hence a leak. Fix this by checking a consumer index which
is updated once CQE is processed.

Fixes: 3ca6c4c882a7 ("octeontx2-pf: Add packet transmission support")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Link: https://lore.kernel.org/r/20221107033505.2491464-1-rkannoth@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 53e99496 29-Jul-2022 Subbaraya Sundeep <sbhatta@marvell.com>

octeontx2-pf: Reduce minimum mtu size to 60

PTP messages like SYNC, FOLLOW_UP, DELAY_REQ are of size 58 bytes.
Using a minimum packet length as 64 makes NIX to pad 6 bytes of
zeroes while transmission. This is causing latest ptp4l application to
emit errors since length in PTP header and received packet are not same.
Padding upto 3 bytes is fine but more than that makes ptp4l to assume
the pad bytes as a TLV. Hence reduce the size to 60 from 64.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Link: https://lore.kernel.org/r/20220729092457.3850-1-naveenm@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 6e144b47 16-May-2022 Suman Ghosh <sumang@marvell.com>

octeontx2-pf: Add support for adaptive interrupt coalescing

Added support for adaptive IRQ coalescing. It uses net_dim
algorithm to find the suitable delay/IRQ count based on the
current packet rate.

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Link: https://lore.kernel.org/r/20220517044055.876158-1-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 0182d078 10-Oct-2021 Subbaraya Sundeep <sbhatta@marvell.com>

octeontx2-pf: Simplify the receive buffer size calculation

This patch separates the logic of configuring hardware
maximum transmit frame size and receive frame size.
This simplifies the logic to calculate receive buffer
size and using cqe descriptor of different size.
Also additional size of skb_shared_info structure is
allocated for each receive buffer pointer given to
hardware which is not necessary. Hence change the
size calculation to remove the size of
skb_shared_info. Add a check for array out of
bounds while adding fragments to the network stack.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 06059a1a 29-Sep-2021 Geetha sowjanya <gakula@marvell.com>

octeontx2-pf: Add XDP support to netdev PF

Adds XDP_PASS, XDP_TX, XDP_DROP and XDP_REDIRECT support
for netdev PF.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# af3826db 27-Sep-2021 Geetha sowjanya <gakula@marvell.com>

octeontx2-pf: Use hardware register for CQE count

Current driver uses software CQ head pointer to poll on CQE
header in memory to determine if CQE is valid. Software needs
to make sure, that the reads of the CQE do not get re-ordered
so much that it ends up with an inconsistent view of the CQE.
To ensure that DMB barrier after read to first CQE cacheline
and before reading of the rest of the CQE is needed.
But having barrier for every CQE read will impact the performance,
instead use hardware CQ head and tail pointers to find the
valid number of CQEs.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ef6c8da7 01-Sep-2021 Geetha sowjanya <gakula@marvell.com>

octeontx2-pf: cn10K: Reserve LMTST lines per core

This patch reserves the LMTST lines per cpu instead
of separate LMTST lines for NPA(buffer free) and NIX(sqe flush).
LMTST line of the core on which SQ or RQ is processed is used
for LMTST operation.

This patch also replace STEOR with STEORL release semantics and
updates driver name in ethtool file.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# cb0e3ec4 27-Aug-2021 Sunil Goutham <sgoutham@marvell.com>

octeontx2-pf: Fix inconsistent license text

Fixed inconsistent license text across the netdev
drivers.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5c051207 29-Jun-2021 Geetha sowjanya <gakula@marvell.com>

octeontx2-pf: cn10k: Use runtime allocated LMTLINE region

The current driver uses static LMTST region allocated by firmware.
This memory gets populated as PF/VF BAR2. RVU PF/VF driver ioremap
the memory as device memory for NIX/NPA operation. Since the memory
is mapped as device memory we see performance degration. To address
this issue this patch implements runtime memory allocation.
RVU PF/VF allocates memory during device probe and share the base
address with RVU AF. RVU AF then configure the LMT MAP table
accordingly.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ab58a416 11-Feb-2021 Hariprasad Kelam <hkelam@marvell.com>

octeontx2-pf: cn10k: Get max mtu supported from admin function

CN10K supports max MTU of 16K on LMAC links and 64k on LBK
links and Octeontx2 silicon supports 9K mtu on both links.
Get the same from nix_get_hw_info mbox message in netdev probe.

This patch also calculates receive buffer size required based
on the MTU set.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4c236d5d 11-Feb-2021 Geetha sowjanya <gakula@marvell.com>

octeontx2-pf: cn10k: Use LMTST lines for NPA/NIX operations

This patch adds support to use new LMTST lines for NPA batch free
and burst SQE flush. Adds new dev_hw_ops structure to hold platform
specific functions and create new files cn10k.c and cn10k.h.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c9c12d33 24-Aug-2020 Aleksey Makarov <amakarov@marvell.com>

octeontx2-pf: Add support for PTP clock

This patch adds PTP clock and uses it in Octeontx2
network device. PTP clock uses mailbox calls to
access the hardware counter on the RVU side.

Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7a36e491 09-May-2020 Kevin Hao <haokexin@gmail.com>

octeontx2-pf: Use the napi_alloc_frag() to alloc the pool buffers

In the current codes, the octeontx2 uses its own method to allocate
the pool buffers, but there are some issues in this implementation.
1. We have to run the otx2_get_page() for each allocation cycle and
this is pretty error prone. As I can see there is no invocation
of the otx2_get_page() in otx2_pool_refill_task(), this will leave
the allocated pages have the wrong refcount and may be freed wrongly.
2. It wastes memory. For example, if we only receive one packet in a
NAPI RX cycle, and then allocate a 2K buffer with otx2_alloc_rbuf()
to refill the pool buffers and leave the remain area of the allocated
page wasted. On a kernel with 64K page, 62K area is wasted.

IMHO it is really unnecessary to implement our own method for the
buffers allocate, we can reuse the napi_alloc_frag() to simplify
our code.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# d45d8979 27-Jan-2020 Christina Jacob <cjacob@marvell.com>

octeontx2-pf: Add basic ethtool support

This patch adds ethtool support for
- Driver stats, Tx/Rx perqueue and CGX LMAC stats
- Set/show Rx/Tx queue count
- Set/show Rx/Tx ring sizes
- Set/show IRQ coalescing parameters

Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 86d74760 27-Jan-2020 Sunil Goutham <sgoutham@marvell.com>

octeontx2-pf: TCP segmentation offload support

Adds TCP segmentation offload (TSO) support. First version
of the silicon didn't support TSO offload, for this driver
level TSO support is added.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4ff7d148 27-Jan-2020 Geetha sowjanya <gakula@marvell.com>

octeontx2-pf: Error handling support

HW reports many errors on the receive and transmit paths.
Such as incorrect queue configuration, pkt transmission errors,
LMTST instruction errors, transmit queue full etc. These are reported
via QINT interrupt. Most of the errors are fatal and needs
reinitialization.

Also added support to allocate receive buffers in non-atomic context
when allocation fails in NAPI context.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 34bfe0eb 27-Jan-2020 Sunil Goutham <sgoutham@marvell.com>

octeontx2-pf: MTU, MAC and RX mode config support

This patch addes support to change interface MTU, MAC address
retrieval and config, RX mode ie unicast, multicast and promiscuous.
Also added link loopback support

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3ca6c4c8 27-Jan-2020 Sunil Goutham <sgoutham@marvell.com>

octeontx2-pf: Add packet transmission support

This patch adds the packet transmission support.
For a given skb prepares send queue descriptors (SQEs) and pushes them
to HW. Here driver doesn't maintain it's own SQ rings, SQEs are pushed
to HW using a silicon specific operations called LMTST. From the
instuction HW derives the transmit queue number and queues the SQE to
that queue. These LMTST instructions are designed to avoid queue
maintenance in SW and lockless behavior ie when multiple cores are trying
to add SQEs to same queue then HW will takecare of serialization, no need
for SW to hold locks.

Also supports scatter/gather.

Co-developed-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# abe02543 27-Jan-2020 Sunil Goutham <sgoutham@marvell.com>

octeontx2-pf: Receive packet handling support

Added receive packet handling (NAPI) support, error stats, RX_ALL
capability config option to passon error pkts to stack upon user request.

In subsequent patches these error stats will be added to ethttool.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 04a21ef3 27-Jan-2020 Sunil Goutham <sgoutham@marvell.com>

octeontx2-pf: Setup interrupts and NAPI handler

Completion queue (CQ) is the one with which HW notifies SW on a packet
reception or transmission. Each of the RQ and SQ are mapped to a unique
CQ and again both CQs are mapped to same interrupt ie the CINT. So that
each core has one interrupt source in whose handler both Rx and Tx
notifications are processed.

Also
- Registered a NAPI handler for the CINT.
- Setup coalescing parameters.
- IRQ affinity hints etc

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# caa2da34 27-Jan-2020 Sunil Goutham <sgoutham@marvell.com>

octeontx2-pf: Initialize and config queues

This patch does the initialization of all queues ie the
receive buffer pools, receive and transmit queues, completion
or notification queues etc. Allocates all required resources
(eg transmit schedulers, receive buffers etc) and configures
them for proper functioning of queues. Also sets up receive
queue's RED dropping levels.

Co-developed-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>