History log of /linux-master/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
Revision Date Author Comments
# 0ca6755f 03-Jan-2024 Aniruddha Paul <aniruddha.paul@intel.com>

ice: Add a new counter for Rx EIPE errors

HW incorrectly reports EIPE errors on encapsulated packets
with L2 padding inside inner packet. HW shows outer UDP/IPV4
packet checksum errors as part of the EIPE flags of the
Rx descriptor. These are reported only if checksum offload
is enabled and L3/L4 parsed flag is valid in Rx descriptor.

When that error is reported by HW, we don't act on it
instead of incrementing main Rx errors statistic as it
would normally happen.

Add a new statistic to count these errors since we still want
to print them.

Signed-off-by: Aniruddha Paul <aniruddha.paul@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jan Glaza <jan.glaza@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# b591137c 05-Dec-2023 Larysa Zaremba <larysa.zaremba@intel.com>

ice: use VLAN proto from ring packet context in skb path

VLAN proto, used in ice XDP hints implementation is stored in ring packet
context. Utilize this value in skb VLAN processing too instead of checking
netdev features.

At the same time, use vlan_tci instead of vlan_tag in touched code,
because VLAN tag often refers to VLAN proto and VLAN TCI combined,
while in the code we clearly store only VLAN TCI.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-12-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 714ed949 05-Dec-2023 Larysa Zaremba <larysa.zaremba@intel.com>

ice: Implement VLAN tag hint

Implement .xmo_rx_vlan_tag callback to allow XDP code to read
packet's VLAN tag.

At the same time, use vlan_tci instead of vlan_tag in touched code,
because VLAN tag often refers to VLAN proto and VLAN TCI combined,
while in the code we clearly store only VLAN TCI.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-11-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 0e6a7b09 05-Dec-2023 Larysa Zaremba <larysa.zaremba@intel.com>

ice: Support RX hash XDP hint

RX hash XDP hint requests both hash value and type.
Type is XDP-specific, so we need a separate way to map
these values to the hardware ptypes, so create a lookup table.

Instead of creating a new long list, reuse contents
of ice_decode_rx_desc_ptype[] through preprocessor.

Current hash type enum does not contain ICMP packet type,
but ice devices support it, so also add a new type into core code.

Then use previously refactored code and create a function
that allows XDP code to read RX hash.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-7-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 9031d5f4 05-Dec-2023 Larysa Zaremba <larysa.zaremba@intel.com>

ice: Support HW timestamp hint

Use previously refactored code and create a function
that allows XDP code to read HW timestamp.

Also, introduce packet context, where hints-related data will be stored.
ice_xdp_buff contains only a pointer to this structure, to avoid copying it
in ZC mode later in the series.

HW timestamp is the first supported hint in the driver,
so also add xdp_metadata_ops.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-6-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 6b62a421 05-Dec-2023 Larysa Zaremba <larysa.zaremba@intel.com>

ice: Make ptype internal to descriptor info processing

Currently, rx_ptype variable is used only as an argument
to ice_process_skb_fields() and is computed
just before the function call.

Therefore, there is no reason to pass this value as an argument.
Instead, remove this argument and compute the value directly inside
ice_process_skb_fields() function.

Also, separate its calculation into a short function, so the code
can later be reused in .xmo_() callbacks.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-4-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 3310aad2 05-Dec-2023 Larysa Zaremba <larysa.zaremba@intel.com>

ice: make RX HW timestamp reading code more reusable

Previously, we only needed RX HW timestamp in skb path,
hence all related code was written with skb in mind.
But with the addition of XDP hints via kfuncs to the ice driver,
the same logic will be needed in .xmo_() callbacks.

Put generic process of reading RX HW timestamp from a descriptor
into a separate function.
Move skb-related code into another source file.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-3-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 9244384e 05-Dec-2023 Larysa Zaremba <larysa.zaremba@intel.com>

ice: make RX hash reading code more reusable

Previously, we only needed RX hash in skb path,
hence all related code was written with skb in mind.
But with the addition of XDP hints via kfuncs to the ice driver,
the same logic will be needed in .xmo_() callbacks.

Separate generic process of reading RX hash from a descriptor
into a separate function.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-2-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 7f04bd10 08-Sep-2023 Sebastian Andrzej Siewior <bigeasy@linutronix.de>

net: Tree wide: Replace xdp_do_flush_map() with xdp_do_flush().

xdp_do_flush_map() is deprecated and new code should use xdp_do_flush()
instead.

Replace xdp_do_flush_map() with xdp_do_flush().

Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Clark Wang <xiaoning.wang@nxp.com>
Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
Cc: David Arinzon <darinzon@amazon.com>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Jassi Brar <jaswinder.singh@linaro.org>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: John Crispin <john@phrozen.org>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: Louis Peens <louis.peens@corigine.com>
Cc: Marcin Wojtas <mw@semihalf.com>
Cc: Mark Lee <Mark-MC.Lee@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Noam Dagan <ndagan@amazon.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Saeed Bishara <saeedb@amazon.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Shay Agroskin <shayagr@amazon.com>
Cc: Shenwei Wang <shenwei.wang@nxp.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Arthur Kiyanovski <akiyano@amazon.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230908143215.869913-2-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 66ceaa4c 13-Mar-2023 Jesse Brandeburg <jesse.brandeburg@intel.com>

ice: fix W=1 headers mismatch

make modules W=1 returns:
.../ice/ice_txrx_lib.c:448: warning: Function parameter or member 'first_idx' not described in 'ice_finalize_xdp_rx'
.../ice/ice_txrx.c:948: warning: Function parameter or member 'ntc' not described in 'ice_get_rx_buf'
.../ice/ice_txrx.c:1038: warning: Excess function parameter 'rx_buf' description in 'ice_construct_skb'

Fix these warnings by adding and deleting the deviant arguments.

Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Fixes: d7956d81f150 ("ice: Pull out next_to_clean bump out of ice_put_rx_buf()")
CC: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# ad07f29b 10-Feb-2023 Alexander Lobakin <alexandr.lobakin@intel.com>

ice: Micro-optimize .ndo_xdp_xmit() path

After the recent mbuf changes, ice_xmit_xdp_ring() became a 3-liner.
It makes no sense to keep it global in a different file than its caller.
Move it just next to the sole call site and mark static. Also, it
doesn't need a full xdp_convert_frame_to_buff(). Save several cycles
and fill only the fields used by __ice_xmit_xdp_ring() later on.
Finally, since it doesn't modify @xdpf anyhow, mark the argument const
to save some more (whole -11 bytes of .text! :D).

Thanks to 1 jump less and less calcs as well, this yields as many as
6.7 Mpps per queue. `xdp.data_hard_start = xdpf` is fully intentional
again (see xdp_convert_buff_to_frame()) and just works when there are
no source device's driver issues.

Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230210170618.1973430-7-alexandr.lobakin@intel.com


# 055d0920 10-Feb-2023 Alexander Lobakin <alexandr.lobakin@intel.com>

ice: Fix freeing XDP frames backed by Page Pool

As already mentioned, freeing any &xdp_frame via page_frag_free() is
wrong, as it assumes the frame is backed by either an order-0 page or
a page with no "patrons" behind them, while in fact frames backed by
Page Pool can be redirected to a device, which's driver doesn't use it.
Keep storing a pointer to the raw buffer and then freeing it
unconditionally via page_frag_free() for %XDP_TX frames, but introduce
a separate type in the enum for frames coming through .ndo_xdp_xmit(),
and free them via xdp_return_frame_bulk(). Note that saving xdpf as
xdp_buff->data_hard_start is intentional and is always true when
everything is configured properly.
After this change, %XDP_REDIRECT from a Page Pool based driver to ice
becomes zero-alloc as it should be and horrendous 3.3 Mpps / queue
turn into 6.6, hehe.

Let it go with no "Fixes:" tag as it spans across good 5+ commits and
can't be trivially backported.

Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230210170618.1973430-6-alexandr.lobakin@intel.com


# aa1d3faf 10-Feb-2023 Alexander Lobakin <alexandr.lobakin@intel.com>

ice: Robustify cleaning/completing XDP Tx buffers

When queueing frames from a Page Pool for redirecting to a device backed
by the ice driver, `perf top` shows heavy load on page_alloc() and
page_frag_free(), despite that on a properly working system it must be
fully or at least almost zero-alloc. The problem is in fact a bit deeper
and raises from how ice cleans up completed Tx buffers.

The story so far: when cleaning/freeing the resources related to
a particular completed Tx frame (skbs, DMA mappings etc.), ice uses some
heuristics only without setting any type explicitly (except for dummy
Flow Director packets, which are marked via ice_tx_buf::tx_flags).
This kinda works, but only up to some point. For example, currently ice
assumes that each frame coming to __ice_xmit_xdp_ring(), is backed by
either plain order-0 page or plain page frag, while it may also be
backed by Page Pool or any other possible memory models introduced in
future. This means any &xdp_frame must be freed properly via
xdp_return_frame() family with no assumptions.

In order to do that, the whole heuristics must be replaced with setting
the Tx buffer/frame type explicitly, just how it's always been done via
an enum. Let us reuse 16 bits from ::tx_flags -- 1 bit-and instr won't
hurt much -- especially given that sometimes there was a check for
%ICE_TX_FLAGS_DUMMY_PKT, which is now turned from a flag to an enum
member. The rest of the changes is straightforward and most of it is
just a conversion to rely now on the type set in &ice_tx_buf rather than
to some secondary properties.
For now, no functional changes intended, the change only prepares the
ground for starting freeing XDP frames properly next step. And it must
be done atomically/synchronously to not break stuff.

Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230210170618.1973430-5-alexandr.lobakin@intel.com


# 923096b5 10-Feb-2023 Alexander Lobakin <alexandr.lobakin@intel.com>

ice: Remove two impossible branches on XDP Tx cleaning

The tagged commit started sending %XDP_TX frames from XSk Rx ring
directly without converting it to an &xdp_frame. However, when XSk is
enabled on a queue pair, it has its separate Tx cleaning functions, so
neither ice_clean_xdp_irq() nor ice_unmap_and_free_tx_buf() ever happens
there.
Remove impossible branches in order to reduce the diffstat of the
upcoming change.

Fixes: a24b4c6e9aab ("ice: xsk: Do not convert to buff to frame for XDP_TX")
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230210170618.1973430-4-alexandr.lobakin@intel.com


# 0bd939b6 10-Feb-2023 Alexander Lobakin <alexandr.lobakin@intel.com>

ice: Fix XDP Tx ring overrun

Sometimes, under heavy XDP Tx traffic, e.g. when using XDP traffic
generator (%BPF_F_TEST_XDP_LIVE_FRAMES), the machine can catch OOM due
to the driver not freeing all of the pages passed to it by
.ndo_xdp_xmit().
Turned out that during the development of the tagged commit, the check,
which ensures that we have a free descriptor to queue a frame, moved
into the branch happening only when a buffer has frags. Otherwise, we
only run a cleaning cycle, but don't check anything.
ATST, there can be situations when the driver gets new frames to send,
but there are no buffers that can be cleaned/completed and the ring has
no free slots. It's very rare, but still possible (> 6.5 Mpps per ring).
The driver then fills the next buffer/descriptor, effectively
overwriting the data, which still needs to be freed.

Restore the check after the cleaning routine to make sure there is a
slot to queue a new frame. When there are frags, there still will be a
separate check that we can place all of them, but if the ring is full,
there's no point in wasting any more time.

(minor: make `!ready_frames` unlikely since it happens ~1-2 times per
billion of frames)

Fixes: 3246a10752a7 ("ice: Add support for XDP multi-buffer on Tx side")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230210170618.1973430-3-alexandr.lobakin@intel.com


# bc4db834 10-Feb-2023 Alexander Lobakin <alexandr.lobakin@intel.com>

ice: fix ice_tx_ring:: Xdp_tx_active underflow

xdp_tx_active is used to indicate whether an XDP ring has any %XDP_TX
frames queued to shortcut processing Tx cleaning for XSk-enabled queues.
When !XSk, it simply indicates whether the ring has any queued frames in
general.
It gets increased on each frame placed onto the ring and counts the
whole frame, not each frag. However, currently it gets decremented in
ice_clean_xdp_tx_buf(), which is called per each buffer, i.e. per each
frag. Thus, on completing multi-frag frames, an underflow happens.
Move the decrement to the outer function and do it once per frame, not
buf. Also, do that on the stack and update the ring counter after the
loop is done to save several cycles.
XSk rings are fine since there are no frags at the moment.

Fixes: 3246a10752a7 ("ice: Add support for XDP multi-buffer on Tx side")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230210170618.1973430-2-alexandr.lobakin@intel.com


# a24b4c6e 31-Jan-2023 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: xsk: Do not convert to buff to frame for XDP_TX

Let us store pointer to xdp_buff that came from xsk_buff_pool on tx_buf
so that it will be possible to recycle it via xsk_buff_free() on Tx
cleaning side. This way it is not necessary to do expensive copy to
another xdp_buff backed by a newly allocated page.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Link: https://lore.kernel.org/bpf/20230131204506.219292-14-maciej.fijalkowski@intel.com


# 3246a107 31-Jan-2023 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: Add support for XDP multi-buffer on Tx side

Similarly as for Rx side in previous patch, logic on XDP Tx in ice
driver needs to be adjusted for multi-buffer support. Specifically, the
way how HW Tx descriptors are produced and cleaned.

Currently, XDP_TX works on strict ring boundaries, meaning it sets RS
bit (on producer side) / looks up DD bit (on consumer/cleaning side)
every quarter of the ring. It means that if for example multi buffer
frame would span across the ring quarter boundary (say that frame
consists of 4 frames and we start from 62 descriptor where ring is sized
to 256 entries), RS bit would be produced in the middle of multi buffer
frame, which would be a broken behavior as it needs to be set on the
last descriptor of the frame.

To make it work, set RS bit at the last descriptor from the batch of
frames that XDP_TX action was used on and make the first entry remember
the index of last descriptor with RS bit set. This way, cleaning side
can take the index of descriptor with RS bit, look up DD bit's presence
and clean from first entry to last.

In order to clean up the code base introduce the common ice_set_rs_bit()
which will return index of descriptor that got RS bit produced on so
that standard driver can store this within proper ice_tx_buf and ZC
driver can simply ignore return value.

Co-developed-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com>
Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Link: https://lore.kernel.org/bpf/20230131204506.219292-12-maciej.fijalkowski@intel.com


# 288ecf49 18-Nov-2022 Benjamin Mikailenko <benjamin.mikailenko@intel.com>

ice: Accumulate ring statistics over reset

Resets may occur with or without user interaction. For example, a TX hang
or reconfiguration of parameters will result in a reset. During reset, the
VSI is freed, freeing any statistics structures inside as well. This would
create an issue for the user where a reset happens in the background,
statistics set to zero, and the user checks ring statistics expecting them
to be populated.

To ensure this doesn't happen, accumulate ring statistics over reset.

Define a new ring statistics structure, ice_ring_stats. The new structure
lives in the VSI's parent, preserving ring statistics when VSI is freed.

1. Define a new structure vsi_ring_stats in the PF scope
2. Allocate/free stats only during probe, unload, or change in ring size
3. Replace previous ring statistics functionality with new structure

Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# 0d54d8f7 02-Dec-2021 Brett Creeley <brett.creeley@intel.com>

ice: Add hot path support for 802.1Q and 802.1ad VLAN offloads

Currently the driver only supports 802.1Q VLAN insertion and stripping.
However, once Double VLAN Mode (DVM) is fully supported, then both 802.1Q
and 802.1ad VLAN insertion and stripping will be supported. Unfortunately
the VSI context parameters only allow for one VLAN ethertype at a time
for VLAN offloads so only one or the other VLAN ethertype offload can be
supported at once.

To support this, multiple changes are needed.

Rx path changes:

[1] In DVM, the Rx queue context l2tagsel field needs to be cleared so
the outermost tag shows up in the l2tag2_2nd field of the Rx flex
descriptor. In Single VLAN Mode (SVM), the l2tagsel field should remain
1 to support SVM configurations.

[2] Modify the ice_test_staterr() function to take a __le16 instead of
the ice_32b_rx_flex_desc union pointer so this function can be used for
both rx_desc->wb.status_error0 and rx_desc->wb.status_error1.

[3] Add the new inline function ice_get_vlan_tag_from_rx_desc() that
checks if there is a VLAN tag in l2tag1 or l2tag2_2nd.

[4] In ice_receive_skb(), add a check to see if NETIF_F_HW_VLAN_STAG_RX
is enabled in netdev->features. If it is, then this is the VLAN
ethertype that needs to be added to the stripping VLAN tag. Since
ice_fix_features() prevents CTAG_RX and STAG_RX from being enabled
simultaneously, the VLAN ethertype will only ever be 802.1Q or 802.1ad.

Tx path changes:

[1] In DVM, the VLAN tag needs to be placed in the l2tag2 field of the Tx
context descriptor. The new define ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN was
added to the list of tx_flags to handle this case.

[2] When the stack requests the VLAN tag to be offloaded on Tx, the
driver needs to set either ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN or
ICE_TX_FLAGS_HW_VLAN, so the tag is inserted in l2tag2 or l2tag1
respectively. To determine which location to use, set a bit in the Tx
ring flags field during ring allocation that can be used to determine
which field to use in the Tx descriptor. In DVM, always use l2tag2,
and in SVM, always use l2tag1.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# 59e92bfe 25-Jan-2022 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: xsk: Borrow xdp_tx_active logic from i40e

One of the things that commit 5574ff7b7b3d ("i40e: optimize AF_XDP Tx
completion path") introduced was the @xdp_tx_active field. Its usage
from i40e can be adjusted to ice driver and give us positive performance
results.

If the descriptor that @next_dd points to has been sent by HW (its DD
bit is set), then we are sure that at least quarter of the ring is ready
to be cleaned. If @xdp_tx_active is 0 which means that related xdp_ring
is not used for XDP_{TX, REDIRECT} workloads, then we know how many XSK
entries should placed to completion queue, IOW walking through the ring
can be skipped.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220125160446.78976-9-maciej.fijalkowski@intel.com


# 3dd411ef 25-Jan-2022 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: Make Tx threshold dependent on ring length

XDP_TX workloads use a concept of Tx threshold that indicates the
interval of setting RS bit on descriptors which in turn tells the HW to
generate an interrupt to signal the completion of Tx on HW side. It is
currently based on a constant value of 32 which might not work out well
for various sizes of ring combined with for example batch size that can
be set via SO_BUSY_POLL_BUDGET.

Internal tests based on AF_XDP showed that most convenient setup of
mentioned threshold is when it is equal to quarter of a ring length.

Make use of recently introduced ICE_RING_QUARTER macro and use this
value as a substitute for ICE_TX_THRESH.

Align also ethtool -G callback so that next_dd/next_rs fields are up to
date in terms of the ring size.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220125160446.78976-5-maciej.fijalkowski@intel.com


# c1e5da5d 25-Oct-2021 Wojciech Drewek <wojciech.drewek@intel.com>

ice: improve switchdev's slow-path

In current switchdev implementation, every VF PR is assigned to
individual ring on switchdev ctrl VSI. For slow-path traffic, there
is a mapping VF->ring done in software based on src_vsi value (by
calling ice_eswitch_get_target_netdev function).

With this change, HW solution is introduced which is more
efficient. For each VF, src MAC (VF's MAC) filter will be created,
which forwards packets to the corresponding switchdev ctrl VSI queue
based on src MAC address.

This filter has to be removed and then replayed in case of
resetting one VF. Keep information about this rule in repr->mac_rule,
thanks to that we know which rule has to be removed and replayed
for a given VF.

In case of CORE/GLOBAL all rules are removed
automatically. We have to take care of readding them. This is done
by ice_replay_vsi_adv_rule.

When driver leaves switchdev mode, remove all advanced rules
from switchdev ctrl VSI. This is done by ice_rem_adv_rule_for_vsi.

Flag repr->rule_added is needed because in some cases reset
might be triggered before VF sends request to add MAC.

Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# b6459415 28-Dec-2021 Jakub Kicinski <kuba@kernel.org>

net: Don't include filter.h from net/sock.h

sock.h is pretty heavily used (5k objects rebuilt on x86 after
it's touched). We can drop the include of filter.h from it and
add a forward declaration of struct sk_filter instead.
This decreases the number of rebuilt objects when bpf.h
is touched from ~5k to ~1k.

There's a lot of missing includes this was masking. Primarily
in networking tho, this time.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org


# 22bf877e 19-Aug-2021 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: introduce XDP_TX fallback path

Under rare circumstances there might be a situation where a requirement
of having XDP Tx queue per CPU could not be fulfilled and some of the Tx
resources have to be shared between CPUs. This yields a need for placing
accesses to xdp_ring inside a critical section protected by spinlock.
These accesses happen to be in the hot path, so let's introduce the
static branch that will be triggered from the control plane when driver
could not provide Tx queue dedicated for XDP on each CPU.

Currently, the design that has been picked is to allow any number of XDP
Tx queues that is at least half of a count of CPUs that platform has.
For lower number driver will bail out with a response to user that there
were not enough Tx resources that would allow configuring XDP. The
sharing of rings is signalled via static branch enablement which in turn
indicates that lock for xdp_ring accesses needs to be taken in hot path.

Approach based on static branch has no impact on performance of a
non-fallback path. One thing that is needed to be mentioned is a fact
that the static branch will act as a global driver switch, meaning that
if one PF got out of Tx resources, then other PFs that ice driver is
servicing will suffer. However, given the fact that HW that ice driver
is handling has 1024 Tx queues per each PF, this is currently an
unlikely scenario.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# 9610bd98 19-Aug-2021 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: optimize XDP_TX workloads

Optimize Tx descriptor cleaning for XDP. Current approach doesn't
really scale and chokes when multiple flows are handled.

Introduce two ring fields, @next_dd and @next_rs that will keep track of
descriptor that should be looked at when the need for cleaning arise and
the descriptor that should have the RS bit set, respectively.

Note that at this point the threshold is a constant (32), but it is
something that we could make configurable.

First thing is to get away from setting RS bit on each descriptor. Let's
do this only once NTU is higher than the currently @next_rs value. In
such case, grab the tx_desc[next_rs], set the RS bit in descriptor and
advance the @next_rs by a 32.

Second thing is to clean the Tx ring only when there are less than 32
free entries. For that case, look up the tx_desc[next_dd] for a DD bit.
This bit is written back by HW to let the driver know that xmit was
successful. It will happen only for those descriptors that had RS bit
set. Clean only 32 descriptors and advance the DD bit.

Actual cleaning routine is moved from ice_napi_poll() down to the
ice_xmit_xdp_ring(). It is safe to do so as XDP ring will not get any
SKBs in there that would rely on interrupts for the cleaning. Nice side
effect is that for rare case of Tx fallback path (that next patch is
going to introduce) we don't have to trigger the SW irq to clean the
ring.

With those two concepts, ring is kept at being almost full, but it is
guaranteed that driver will be able to produce Tx descriptors.

This approach seems to work out well even though the Tx descriptors are
produced in one-by-one manner. Test was conducted with the ice HW
bombarded with packets from HW generator, configured to generate 30
flows.

Xdp2 sample yields the following results:
<snip>
proto 17: 79973066 pkt/s
proto 17: 80018911 pkt/s
proto 17: 80004654 pkt/s
proto 17: 79992395 pkt/s
proto 17: 79975162 pkt/s
proto 17: 79955054 pkt/s
proto 17: 79869168 pkt/s
proto 17: 79823947 pkt/s
proto 17: 79636971 pkt/s
</snip>

As that sample reports the Rx'ed frames, let's look at sar output.
It says that what we Rx'ed we do actually Tx, no noticeable drops.
Average: IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
Average: ens4f1 79842324.00 79842310.40 4678261.17 4678260.38 0.00 0.00 0.00 38.32

with tx_busy staying calm.

When compared to a state before:
Average: IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
Average: ens4f1 90919711.60 42233822.60 5327326.85 2474638.04 0.00 0.00 0.00 43.64

it can be observed that the amount of txpck/s is almost doubled, meaning
that the performance is improved by around 90%. All of this due to the
drops in the driver, previously the tx_busy stat was bumped at a 7mpps
rate.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# eb087cd8 19-Aug-2021 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: propagate xdp_ring onto rx_ring

With rings being split, it is now convenient to introduce a pointer to
XDP ring within the Rx ring. For XDP_TX workloads this means that
xdp_rings array access will be skipped, which was executed per each
processed frame.

Also, read the XDP prog once per NAPI and if prog is present, set up the
local xdp_ring pointer. Reading prog a single time was discussed in [1]
with some concern raised by Toke around dispatcher handling and having
the need for going through the RCU grace period in the ndo_bpf driver
callback, but ice currently is torning down NAPI instances regardless of
the prog presence on VSI.

Although the pointer to XDP ring introduced to Rx ring makes things a
lot slimmer/simpler, I still feel that single prog read per NAPI
lifetime is beneficial.

Further patch that will introduce the fallback path will also get a
profit from that as xdp_ring pointer will be set during the XDP rings
setup.

[1]: https://lore.kernel.org/bpf/87k0oseo6e.fsf@toke.dk/

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# 0bb4f9ec 19-Aug-2021 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: unify xdp_rings accesses

There has been a long lasting issue of improper xdp_rings indexing for
XDP_TX and XDP_REDIRECT actions. Given that currently rx_ring->q_index
is mixed with smp_processor_id(), there could be a situation where Tx
descriptors are produced onto XDP Tx ring, but tail is never bumped -
for example pin a particular queue id to non-matching IRQ line.

Address this problem by ignoring the user ring count setting and always
initialize the xdp_rings array to be of num_possible_cpus() size. Then,
always use the smp_processor_id() as an index to xdp_rings array. This
provides serialization as at given time only a single softirq can run on
a particular CPU.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# e72bba21 19-Aug-2021 Maciej Fijalkowski <maciej.fijalkowski@intel.com>

ice: split ice_ring onto Tx/Rx separate structs

While it was convenient to have a generic ring structure that served
both Tx and Rx sides, next commits are going to introduce several
Tx-specific fields, so in order to avoid hurting the Rx side, let's
pull out the Tx ring onto new ice_tx_ring and ice_rx_ring structs.

Rx ring could be handled by the old ice_ring which would reduce the code
churn within this patch, but this would make things asymmetric.

Make the union out of the ring container within ice_q_vector so that it
is possible to iterate over newly introduced ice_tx_ring.

Remove the @size as it's only accessed from control path and it can be
calculated pretty easily.

Change definitions of ice_update_ring_stats and
ice_fetch_u64_stats_per_ring so that they are ring agnostic and can be
used for both Rx and Tx rings.

Sizes of Rx and Tx ring structs are 256 and 192 bytes, respectively. In
Rx ring xdp_rxq_info occupies its own cacheline, so it's the major
difference now.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# f5396b8a 19-Aug-2021 Grzegorz Nitka <grzegorz.nitka@intel.com>

ice: switchdev slow path

Slow path means allowing packet to go from uplink to representor
and from representor to correct VF on Rx site and from VF to
representor and to uplink on Tx site.

To accomplish this driver, has to set correct Tx descriptor. When
packet is sent from representor to VF, destination should be
set to VF VSI. When packet is sent from uplink port destination
should be uplink to bypass switch infrastructure and send packet
outside.

On Rx site driver should check source VSI field from Rx descriptor
and based on that forward packed to correct netdev. To allow
this there is a target netdevs table in control plane VSI
struct.

Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# dda90cb9 23-Feb-2021 Jesse Brandeburg <jesse.brandeburg@intel.com>

ice: report hash type such as L2/L3/L4

The hardware is reporting the type of the hash used for RSS
as a PTYPE field in the receive descriptor. Use this value to set
the skb packet hash type by extending the hash type table to
cover all 10-bits of possible values (requiring some variables
to be changed from u8 to u16), and then use that table to convert
to one of the possible values in enum pkt_hash_types.

While we're here, remove the unused ptype struct value, which
makes table init easier for the zero entries, and use ranged
initializer to remove a bunch of code (works with gcc and clang).

Without this change, the kernel will recalculate the hash in software,
which can consume extra CPU cycles.

Co-developed-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# 77a78115 09-Jun-2021 Jacob Keller <jacob.e.keller@intel.com>

ice: enable receive hardware timestamping

Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to
requests to enable timestamping support. If the request is for enabling
Rx timestamps, set a bit in the Rx descriptors to indicate that receive
timestamps should be reported.

Hardware captures receive timestamps in the PHY which only captures part
of the timer, and reports only 40 bits into the Rx descriptor. The upper
32 bits represent the contents of GLTSYN_TIME_L at the point of packet
reception, while the lower 8 bits represent the upper 8 bits of
GLTSYN_TIME_0.

The networking and PTP stack expect 64 bit timestamps in nanoseconds. To
support this, implement some logic to extend the timestamps by using the
full PHC time.

If the Rx timestamp was captured prior to the PHC time, then the real
timestamp is

PHC - (lower_32_bits(PHC) - timestamp)

If the Rx timestamp was captured after the PHC time, then the real
timestamp is

PHC + (timestamp - lower_32_bits(PHC))

These calculations are correct as long as neither the PHC timestamp nor
the Rx timestamps are more than 2^32-1 nanseconds old. Further, we can
detect when the Rx timestamp is before or after the PHC as long as the
PHC timestamp is no more than 2^31-1 nanoseconds old.

In that case, we calculate the delta between the lower 32 bits of the
PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx
timestamp must have been captured in the past. If it's smaller, then the
Rx timestamp must have been captured after PHC time.

Add an ice_ptp_extend_32b_ts function that relies on a cached copy of
the PHC time and implements this algorithm to calculate the proper upper
32bits of the Rx timestamps.

Cache the PHC time periodically in all of the Rx rings. This enables
each Rx ring to simply call the extension function with a recent copy of
the PHC time. By ensuring that the PHC time is kept up to date
periodically, we ensure this algorithm doesn't use stale data and
produce incorrect results.

To cache the time, introduce a kworker and a kwork item to periodically
store the Rx time. It might seem like we should use the .do_aux_work
interface of the PTP clock. This doesn't work because all PFs must cache
this time, but only one PF owns the PTP clock device.

Thus, the ice driver will manage its own kthread instead of relying on
the PTP do_aux_work handler.

With this change, the driver can now report Rx timestamps on all
incoming packets.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# 9ded647a 05-Mar-2021 Gustavo A. R. Silva <gustavoars@kernel.org>

ice: Fix fall-through warnings for Clang

In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of just letting the code
fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# f73fc403 08-Jan-2021 Eric Dumazet <edumazet@google.com>

ice: drop dead code in ice_receive_skb()

napi_gro_receive() can never return GRO_DROP

GRO_DROP can only be returned from napi_gro_frags()
which is the other NAPI GRO entry point.

Followup patch will remove GRO_DROP, because drivers
are not supposed to call napi_gro_frags() if prior
napi_get_frags() has failed.

Note that I have left the gro_dropped variable. I leave to ice
maintainers the decision to further remove it from ethtool -S results.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# a8fffd7a 29-Jul-2020 Jesse Brandeburg <jesse.brandeburg@intel.com>

ice: add useful statistics

Display and count some useful hot-path statistics. The usefulness is as
follows:

- tx_restart: use to determine if the transmit ring size is too small or
if the transmit interrupt rate is too low.
- rx_gro_dropped: use to count drops from GRO layer, which previously were
completely uncounted when occurring.
- tx_busy: use to determine when the driver is miscounting number of
descriptors needed for an skb.
- tx_timeout: as our other drivers, count the number of times we've reset
due to timeout because the kernel only prints a warning once per netdev.

Several of these were already counted but not displayed.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>


# 1b698fa5 28-May-2020 Lorenzo Bianconi <lorenzo@kernel.org>

xdp: Rename convert_to_xdp_frame in xdp_convert_buff_to_frame

In order to use standard 'xdp' prefix, rename convert_to_xdp_frame
utility routine in xdp_convert_buff_to_frame and replace all the
occurrences

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/6344f739be0d1a08ab2b9607584c4d5478c8c083.1590698295.git.lorenzo@kernel.org


# 13f90b39 15-May-2020 Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>

ice: Refactor Rx checksum checks

We don't need both rx_status and rx_error parameters, as the latter is
a subset of the former. Remove rx_error completely and check the right bit
in rx_status.

Rename rx_status to rx_status0, and rx_status_err1 to
rx_status1. This naming more closely reflects the specification.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>


# 5757cc7c 07-May-2020 Tony Nguyen <anthony.l.nguyen@intel.com>

ice: Rename build_ctob to ice_build_ctob

To make the function easier to identify as being part of the ice driver,
prepend ice to the function name.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>


# 88865fc4 07-May-2020 Karol Kolacinski <karol.kolacinski@intel.com>

ice: Fix casting issues

Change min() macros to min_t() which has compare type specified and it
helps avoid precision loss.

In some cases there was precision loss during calls or assignments.
Some fields in structs were unnecessarily large and gave multiple
warnings.

There were also some minor type differences which are now fixed as well as
some cases where a simple cast was needed.

Callers were were passing data that is a u16 to
ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8.
Fix that by changing the function to take a u16.

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>


# a4e82a81 06-May-2020 Tony Nguyen <anthony.l.nguyen@intel.com>

ice: Add support for tunnel offloads

Create a boost TCAM entry for each tunnel port in order to get a tunnel
PTYPE. Update netdev feature flags and implement the appropriate logic to
get and set values for hardware offloads.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>


# 168983a8 06-Feb-2020 Brett Creeley <brett.creeley@intel.com>

ice: Don't allow same value for Rx tail to be written twice

Currently we compare the value we are about to write to the Rx tail
register with the previous value of next_to_use. The problem with this
is we only write tail on 8 descriptor boundaries, but next_to_use is
updated whenever we clean Rx descriptors. Fix this by comparing the
value we are about to write to tail with the previously written tail
value. This will prevent duplicate Rx tail bumps.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>


# 0891d6d4 04-Nov-2019 Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>

ice: Move common functions to ice_txrx_lib.c

In preparation of AF XDP, move functions that will be used both by skb and
zero-copy paths to a new file called ice_txrx_lib.c. This allows us to
avoid using ifdefs to control the staticness of said functions.

Move other functions (ice_rx_csum, ice_rx_hash and ice_ptype_to_htype)
called only by the moved ones to the new file as well.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>