History log of /linux-master/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
Revision	Date	Author	Comments
# 7f525acb	14-Feb-2024	Tariq Toukan <tariqt@nvidia.com>	net/mlx5e: Support per-mdev queue counter Each queue counter object counts some events (in hardware) for the RQs that are attached to it, like events of packet drops due to no receive WQE (rx_out_of_buffer). Each RQ can be attached to a queue counter only within the same vhca. To still cover all RQs with these counters, we create multiple instances, one per vhca. The result that's shown to the user is now the sum of all instances. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# f25e7b82	09-Feb-2024	Joe Damato <jdamato@fastly.com>	net/mlx5e: link NAPI instances to queues and IRQs Make mlx5 compatible with the newly added netlink queue GET APIs. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240209202312.30181-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 90502d43	08-Feb-2024	Rahul Rameshbabu <rrameshbabu@nvidia.com>	net/mlx5e: Switch to using _bh variant of spinlock API in port timestamping NAPI poll context The NAPI poll context is a softirq context. Do not use normal spinlock API in this context to prevent concurrency issues. Fixes: 3178308ad4ca ("net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> CC: Vadim Fedorenko <vadfed@meta.com>
# 3876638b	22-Nov-2023	Rahul Rameshbabu <rrameshbabu@nvidia.com>	net/mlx5e: Fix operation precedence bug in port timestamping napi_poll context Indirection (*) is of lower precedence than postfix increment (++). Logic in napi_poll context would cause an out-of-bound read by first incrementing the pointer address by byte address space and then dereferencing the value. Rather, the intended logic was to dereference first and then increment the underlying value. Fixes: 92214be5979c ("net/mlx5e: Update doorbell for port timestamping CQ before the software counter") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
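The precedence issue described in 3876638b can be reproduced in a few lines of plain C. This is a minimal userspace sketch, not the driver code: `bump_buggy` and `bump_fixed` are hypothetical names, and the counter is passed by pointer only to mirror the shape of the bug.

```c
#include <stdint.h>

/* Buggy form: postfix ++ binds tighter than unary *, so the expression is
 * parsed as *(counter++). The local pointer is advanced and the value it
 * pointed at is left unchanged. */
void bump_buggy(uint16_t *counter)
{
	(void)*counter++;
}

/* Intended form: dereference first, then increment the pointed-to value. */
void bump_fixed(uint16_t *counter)
{
	(*counter)++;
}
```

Calling `bump_buggy(&c)` leaves `c` untouched, while `bump_fixed(&c)` increments it; the parentheses are the entire fix.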
# 3fbf6120	07-Jan-2024	Jakub Kicinski <kuba@kernel.org>	Revert "mlx5 updates 2023-12-20" Revert "net/mlx5: Implement management PF Ethernet profile" This reverts commit 22c4640698a1d47606b5a4264a584e8046641784. Revert "net/mlx5: Enable SD feature" This reverts commit c88c49ac9c18fb7c3fa431126de1d8f8f555e912. Revert "net/mlx5e: Block TLS device offload on combined SD netdev" This reverts commit 83a59ce0057b7753d7fbece194b89622c663b2a6. Revert "net/mlx5e: Support per-mdev queue counter" This reverts commit d72baceb92539a178d2610b0e9ceb75706a75b55. Revert "net/mlx5e: Support cross-vhca RSS" This reverts commit c73a3ab8fa6e93a783bd563938d7cf00d62d5d34. Revert "net/mlx5e: Let channels be SD-aware" This reverts commit e4f9686bdee7b4dd89e0ed63cd03606e4bda4ced. Revert "net/mlx5e: Create EN core HW resources for all secondary devices" This reverts commit c4fb94aa822d6c9d05fc3c5aee35c7e339061dc1. Revert "net/mlx5e: Create single netdev per SD group" This reverts commit e2578b4f983cfcd47837bbe3bcdbf5920e50b2ad. Revert "net/mlx5: SD, Add informative prints in kernel log" This reverts commit c82d360325112ccc512fc11a3b68cdcdf04a1478. Revert "net/mlx5: SD, Implement steering for primary and secondaries" This reverts commit 605fcce33b2d1beb0139b6e5913fa0b2062116b2. Revert "net/mlx5: SD, Implement devcom communication and primary election" This reverts commit a45af9a96740873db9a4b5bb493ce2ad81ccb4d5. Revert "net/mlx5: SD, Implement basic query and instantiation" This reverts commit 63b9ce944c0e26c44c42cdd5095c2e9851c1a8ff. Revert "net/mlx5: SD, Introduce SD lib" This reverts commit 4a04a31f49320d078b8078e1da4b0e2faca5dfa3. Revert "net/mlx5: Fix query of sd_group field" This reverts commit e04984a37398b3f4f5a79c993b94c6b1224184cc. Revert "net/mlx5e: Use the correct lag ports number when creating TISes" This reverts commit a7e7b40c4bc115dbf2a2bb453d7bbb2e0ea99703. There are some unanswered questions on the list, and we don't have any docs. 
Given the lack of replies so far and the fact that v6.8 merge window has started - let's revert this and revisit for v6.9. Link: https://lore.kernel.org/all/20231221005721.186607-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# d72baceb	08-Aug-2023	Tariq Toukan <tariqt@nvidia.com>	net/mlx5e: Support per-mdev queue counter Each queue counter object counts some events (in hardware) for the RQs that are attached to it, like events of packet drops due to no receive WQE (rx_out_of_buffer). Each RQ can be attached to a queue counter only within the same vhca. To still cover all RQs with these counters, we create multiple instances, one per vhca. The result that's shown to the user is now the sum of all instances. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# db52aa6d	04-Aug-2023	Tariq Toukan <tariqt@nvidia.com>	net/mlx5e: Decouple CQ from priv Make CQ struct and methods independent of "priv", use more basic arguments instead. This will ease the transition to netdev with multiple mdevs. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# b25bd37c	06-Aug-2023	Tariq Toukan <tariqt@nvidia.com>	net/mlx5: Move TISes from priv to mdev HW resources The transport interface send (TIS) object is responsible for performing all transport related operations of the transmit side. Messages from Send Queues get segmented and transmitted by the TIS including all transport required implications, e.g. in the case of large send offload, the TIS is responsible for the segmentation. These are stateless objects and can be used by multiple netdevs (e.g. representors) who share the same core device. Providing the TISes as a service from the core layer to the netdev layer reduces the number of replicated TIS objects (in case of multiple netdevs), and will ease the transition to netdev with multiple mdevs. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 92214be5	14-Nov-2023	Rahul Rameshbabu <rrameshbabu@nvidia.com>	net/mlx5e: Update doorbell for port timestamping CQ before the software counter Previously, mlx5e_ptp_poll_ts_cq would update the device doorbell with the incremented consumer index after the relevant software counters in the kernel were updated. In the mlx5e_sq_xmit_wqe context, this would lead to either overrunning the device CQ or exceeding the expected software buffer size in the device CQ if the device CQ size was greater than the software buffer size. Update the relevant software counter only after updating the device CQ consumer index in the port timestamping napi_poll context. Log: mlx5_core 0000:08:00.0: cq_err_event_notifier:517:(pid 0): CQ error on CQN 0x487, syndrome 0x1 mlx5_core 0000:08:00.0 eth2: mlx5e_cq_error_event: cqn=0x000487 event=0x04 Fixes: 1880bc4e4a96 ("net/mlx5e: Add TX port timestamp support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20231114215846.5902-12-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 53b836a4	08-Aug-2023	Rahul Rameshbabu <rrameshbabu@nvidia.com>	net/mlx5e: Add recovery flow for tx devlink health reporter for unhealthy PTP SQ A new check for the tx devlink health reporter is introduced for determining when the PTP port timestamping SQ is considered unhealthy. If there are enough CQEs considered never to be delivered, the space that can be utilized on the SQ decreases significantly, impacting performance and usability of the SQ. The health reporter is triggered when the number of likely never delivered port timestamping CQEs that utilize the space of the PTP SQ is greater than 93.75% of the total capacity of the SQ. A devlink health reporter recover method is also provided for this specific TX error context that restarts the PTP SQ. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
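The 93.75% figure in 53b836a4 is 15/16 of the SQ capacity, so the check can be done with a shift instead of a division. The sketch below is an assumption about how such a threshold could be computed; the function name and parameters are hypothetical, and the log does not say how the driver itself computes it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: report the queue as unhealthy once the number of
 * "likely never delivered" timestamp CQEs exceeds 15/16 (93.75%) of the
 * queue capacity. */
bool ptpsq_looks_unhealthy(uint32_t undelivered, uint32_t capacity)
{
	/* capacity - capacity/16 == 15/16 of capacity for power-of-two sizes */
	uint32_t threshold = capacity - (capacity >> 4);

	return undelivered > threshold;
}
```

For a capacity of 1024 entries the threshold is 960, so 960 undelivered CQEs is still considered healthy and 961 is not.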
# 3178308a	02-May-2023	Rahul Rameshbabu <rrameshbabu@nvidia.com>	net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs Use a map structure for associating CQEs containing port timestamping information with the appropriate skb. Track order of WQEs submitted using a FIFO. Check if the corresponding port timestamping CQEs from the lookup values in the FIFO are considered dropped due to time elapsed. Return the lookup value to a freelist after consuming the skb. Reuse the freed lookup in future WQE submission iterations. The map structure uses an integer identifier for the key and returns an skb corresponding to that identifier. Embed the integer identifier in the WQE submitted to the WQ for the transmit path when the SQ is a PTP (port timestamping) SQ. The embedded identifier can then be queried using a field in the CQE of the corresponding port timestamping CQ. In the port timestamping napi_poll context, the identifier is queried from the CQE polled from CQ and used to lookup the corresponding skb from the WQE submit path. The skb reference is removed from map and then embedded with the port HW timestamp information from the CQE and eventually consumed. The metadata freelist FIFO is an array containing integer identifiers that can be pushed and popped in the FIFO. The purpose of this structure is bookkeeping what identifier values can safely be used in a subsequent WQE submission and should not contain identifiers that have still not been reaped by processing a corresponding CQE completion on the port timestamping CQ. The ts_cqe_pending_list structure is a combination of an array and linked list. The array is pre-populated with the nodes that will be added and removed from the head of the linked list. Each node contains the unique identifier value associated with the values submitted in the WQEs and retrieved in the port timestamping CQEs. 
When a WQE is submitted, the node in the array corresponding to the identifier popped from the metadata freelist is added to the end of the CQE pending list and is marked as "in-use". The node is removed from the linked list under two conditions. The first condition is that the corresponding port timestamping CQE is polled in the PTP napi_poll context. The second condition is that more than a second has elapsed since the DMA timestamp value corresponding to the WQE submission. When the first condition occurs, the "in-use" bit in the linked list node is cleared, and the resources corresponding to the WQE submission are then released. The second condition, however, indicates that the port timestamping CQE will likely never be delivered. It's not impossible for the device to post a CQE after an infinite amount of time though highly improbable. In order to be resilient to this improbable case, resources related to the corresponding WQE submission are still kept, the identifier value is not returned to the freelist, and the "in-use" bit is cleared on the node to indicate that it's no longer part of the linked list of "likely to be delivered" port timestamping CQE identifiers. A count for the number of port timestamping CQEs considered highly likely to never be delivered by the device is maintained. This count gets decremented in the unlikely event a port timestamping CQE considered unlikely to ever be delivered is polled in the PTP napi_poll context. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
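The identifier map and metadata freelist described in 3178308a can be sketched in userspace C as follows. This is a simplified illustration under stated assumptions: all type and function names are hypothetical, the capacity is fixed at four identifiers, packets are opaque pointers rather than skbs, and the pending list with its one-second timeout is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_IDS 4u /* hypothetical capacity, must be a power of two */

/* FIFO of identifiers that are safe to embed in a new WQE. */
struct id_freelist {
	uint16_t data[NUM_IDS];
	uint16_t pc; /* free-running producer counter */
	uint16_t cc; /* free-running consumer counter */
};

/* Map from identifier to the packet awaiting its port timestamp. */
struct ts_map {
	const void *slot[NUM_IDS]; /* NULL means the identifier is not in flight */
};

void freelist_init(struct id_freelist *f)
{
	f->pc = 0;
	f->cc = 0;
	for (uint16_t i = 0; i < NUM_IDS; i++)
		f->data[f->pc++ & (NUM_IDS - 1)] = i;
}

bool freelist_empty(const struct id_freelist *f)
{
	return f->pc == f->cc;
}

/* Submit path: pop an identifier, remember the packet under it. */
bool ts_submit(struct id_freelist *f, struct ts_map *m, const void *pkt,
	       uint16_t *id_out)
{
	if (freelist_empty(f))
		return false;
	uint16_t id = f->data[f->cc++ & (NUM_IDS - 1)];
	m->slot[id] = pkt;
	*id_out = id;
	return true;
}

/* Completion path: look up the packet by the identifier carried in the
 * CQE, remove it from the map and return the identifier to the freelist.
 * Completions may arrive in any order. */
const void *ts_complete(struct id_freelist *f, struct ts_map *m, uint16_t id)
{
	if (id >= NUM_IDS || !m->slot[id])
		return NULL;
	const void *pkt = m->slot[id];
	m->slot[id] = NULL;
	f->data[f->pc++ & (NUM_IDS - 1)] = id;
	return pkt;
}
```

Because the lookup is keyed by identifier rather than by queue position, a completion for the second submission can be handled before the first without disturbing either packet.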
# d543b649	29-Jun-2023	Zhengchao Shao <shaozhengchao@huawei.com>	net/mlx5e: fix memory leak in mlx5e_ptp_open When kvzalloc_node or kvzalloc failed in mlx5e_ptp_open, the memory pointed by "c" or "cparams" is not freed, which can lead to a memory leak. Fix by freeing the array in the error path. Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 7aa50380	21-Feb-2023	Rahul Rameshbabu <rrameshbabu@nvidia.com>	net/mlx5e: Fix SQ wake logic in ptp napi_poll context Check in the mlx5e_ptp_poll_ts_cq context if the ptp tx sq should be woken up. Before change, the ptp tx sq may never wake up if the ptp tx ts skb fifo is full when mlx5e_poll_tx_cq checks if the queue should be woken up. Fixes: 1880bc4e4a96 ("net/mlx5e: Add TX port timestamp support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# cf1cccae	30-Jan-2023	Gal Pressman <gal@nvidia.com>	net/mlx5e: Rename misleading skb_pc/cc references in ptp code The 'skb_pc/cc' naming is misleading as the values hold the producer/consumer indices (masked values), not the counters. Rename to 'skb_pi/ci'. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Adham Faris <afaris@nvidia.com>
# 3a50cf1e	02-Feb-2023	Vadim Fedorenko <vadfed@meta.com>	mlx5: fix possible ptp queue fifo use-after-free Fifo indexes are not checked during pop operations and it leads to potential use-after-free when popping from empty queue. Such case was possible during re-sync action. WARN_ON_ONCE covers future cases. There were out-of-order cqe spotted which led to drain of the queue and use-after-free because of lack of fifo pointers check. Special check and counter are added to avoid resync operation if SKB could not exist in the fifo because of OOO cqe (skb_id must be between consumer and producer index). Fixes: 58a518948f60 ("net/mlx5e: Add resiliency for PTP TX port timestamp") Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# e435941b	02-Feb-2023	Vadim Fedorenko <vadfed@meta.com>	mlx5: fix skb leak while fifo resync and push During ptp resync operation SKBs were popped from the fifo but were never freed, either by napi_consume or by dev_kfree_skb_any. Add a call to napi_consume_skb to properly free SKBs. Another leak was happening because mlx5e_skb_fifo_has_room() had an error in the check. Comparing free running counters works well unless C promotes the types to something wider than the counter. In this case the counters are u16 but the result of the subtraction is promoted to int, which causes a wrong result (negative value) of the check when the producer has already wrapped around but the consumer hasn't yet. An explicit cast to u16 fixes the issue. Fixes: 58a518948f60 ("net/mlx5e: Add resiliency for PTP TX port timestamp") Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
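The integer-promotion pitfall described in e435941b can be shown with two small functions. This is a userspace sketch, assuming `int` is wider than 16 bits (true on the platforms Linux supports); the function names and the `limit` parameter are hypothetical and are not the driver's `mlx5e_skb_fifo_has_room()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy check: both uint16_t operands are promoted to int before the
 * subtraction, so once the producer counter has wrapped past 65535 and
 * the consumer has not, pc - cc is a large negative int and the
 * comparison is always true. */
bool has_room_buggy(uint16_t pc, uint16_t cc, uint16_t limit)
{
	return pc - cc < limit;
}

/* Fixed check: casting the difference back to uint16_t restores the
 * modulo-65536 arithmetic that free-running counters rely on. */
bool has_room_fixed(uint16_t pc, uint16_t cc, uint16_t limit)
{
	return (uint16_t)(pc - cc) < limit;
}
```

With `pc = 3`, `cc = 65532` and `limit = 7` there are seven entries in flight: the buggy check evaluates `-65529 < 7` and reports room, while the fixed check evaluates `7 < 7` and reports none.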
# 79efecb4	30-Aug-2022	Maxim Mikityanskiy <maximmi@nvidia.com>	net/mlx5e: Trigger NAPI after activating an SQ If an SQ is deactivated and reactivated again, some packets could be sent after MLX5E_SQ_STATE_ENABLED is cleared, but before netif_tx_stop_queue, meaning that NAPI might miss some completions. In order to handle them, make sure to trigger NAPI after SQ activation in all cases where it can be relevant. Regular SQs, XDP SQs and XSK SQs are good. Missing cases added: after recovery, after activating HTB SQs and after activating PTP SQs. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# b48b89f9	27-Sep-2022	Jakub Kicinski <kuba@kernel.org>	net: drop the weight argument from netif_napi_add We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 81a0b241	31-Jan-2022	Lama Kayal <lkayal@nvidia.com>	net/mlx5e: Drop priv argument of ptp function in en_fs Both mlx5e_ptp_alloc_rx_fs and mlx5e_ptp_free_rx_fs only make use of two priv members; pass them directly instead. This will help in dropping priv from the entire en_fs file. Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 4e0ecc17	25-Jan-2022	Lama Kayal <lkayal@nvidia.com>	net/mlx5e: Decouple fs_tt_redirect from en.h Make flow steering files fs_tt_redirect.c/h independent of en.h such that it goes through the flow steering API only. Make error reports be via mlx5_core API instead of netdev_err API, this to ensure a safe decoupling from en.h, and prevent redundant argument passing. Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# f52f2fae	10-Jan-2022	Lama Kayal <lkayal@nvidia.com>	net/mlx5e: Introduce flow steering API Move mlx5e_flow_steering struct to fs_en.c to make it private. Introduce flow_steering API and let other files go through it. Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# af8bbf73	09-Jan-2022	Lama Kayal <lkayal@nvidia.com>	net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer Make mlx5e_flow_steering member of mlx5e_priv a pointer. Add dynamic allocation respectively. Allocate fs for all profiles when initializing profile, symmetrically deallocate at profile cleanup. Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 58a51894	04-Jul-2022	Aya Levin <ayal@nvidia.com>	net/mlx5e: Add resiliency for PTP TX port timestamp PTP TX port timestamp relies on receiving 2 CQEs for each outgoing packet (WQE). The regular CQE has a less accurate timestamp than the wire CQE. On link change, the wire CQE may get lost. Let the driver detect and restore the relation between the CQEs, and re-sync after timeout. Add resiliency for this as follows: add id (producer counter) into the WQE's metadata. This id will be received in the wire CQE (in wqe_counter field). On handling the wire CQE, if there is no match, replay the PTP application with the time-stamp from the regular CQE and restore the sync between the CQEs and their SKBs. This patch adds 2 ptp counters: 1) ptp_cq0_resync_event: number of times a mismatch was detected between the regular CQE and the wire CQE. 2) ptp_cq0_resync_cqe: total amount of missing wire CQEs. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 2e642afb	15-Apr-2022	Maxim Mikityanskiy <maximmi@nvidia.com>	net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition When the driver activates the channels, it assumes NAPI isn't running yet. mlx5e_activate_rq posts a NOP WQE to ICOSQ to trigger a hardware interrupt and start NAPI, which will run mlx5e_alloc_rx_mpwqe and post UMR WQEs to ICOSQ to be able to receive packets with striding RQ. Unfortunately, a race condition is possible if NAPI is triggered by something else (for example, TX) at a bad timing, before mlx5e_activate_rq finishes. In this case, mlx5e_alloc_rx_mpwqe may post UMR WQEs to ICOSQ, and with the bad timing, the wqe_info of the first UMR may be overwritten by the wqe_info of the NOP posted by mlx5e_activate_rq. The consequence is that icosq->db.wqe_info[0].num_wqebbs will be changed from MLX5E_UMR_WQEBBS to 1, disrupting the integrity of the array-based linked list in wqe_info[]. mlx5e_poll_ico_cq will hang in an infinite loop after processing wqe_info[0], because after the corruption, the next item to be processed will be wqe_info[1], which is filled with zeros, and `sqcc += wi->num_wqebbs` will never move further. This commit fixes this race condition by using async_icosq to post the NOP and trigger the interrupt. async_icosq is always protected with a spinlock, eliminating the race condition. Fixes: bc77b240b3c5 ("net/mlx5e: Add fragmented memory support for RX multi packet WQE") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reported-by: Karsten Nielsen <karsten@foo-bar.dk> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# c27bd171	17-Jan-2022	Aya Levin <ayal@nvidia.com>	net/mlx5e: Read max WQEBBs on the SQ from firmware Prior to this patch the maximal value for max WQEBBs (WQE Basic Blocks, where WQE is a Work Queue Element) on the TX side was assumed to be 16 (fixed value). All firmware versions till today comply with this. In order to be more flexible and resilient, read from FW the corresponding: max_wqe_sz_sq. This value describes the maximum WQE size given in bytes, thus max WQEBBs is given by dividing it by the WQEBB size in bytes. The driver uses the top between 16 and the division result. This ensures synchronization between driver and firmware and avoids unexpected behavior. Store this value on the different SQs (Send Queues) for easy access. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 9536923d	19-May-2021	Tariq Toukan <tariqt@nvidia.com>	net/mlx5e: Remove unused tstamp SQ field Remove tstamp pointer in mlx5e_txqsq as it's no longer used after commit 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section"). Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 6c72cb05	04-Dec-2021	Tariq Toukan <tariqt@nvidia.com>	net/mlx5e: Use bitmap field for profile features Use a features bitmap field in mlx5e_profile to declare profile support state of the different features. Let it replace the existing rx_ptp_support boolean. It will be extended to cover more features in a downstream patch. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 83fec3f1	12-Oct-2021	Aharon Landau <aharonl@nvidia.com>	RDMA/mlx5: Replace struct mlx5_core_mkey by u32 key In mlx5_core and vdpa there is no use of mlx5_core_mkey members except for the key itself. As preparation for moving mlx5_core_mkey to mlx5_ib, the occurrences of struct mlx5_core_mkey in all modules except for mlx5_ib are replaced by a u32 key. Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
# dd1979cf	29-Aug-2021	Lama Kayal <lkayal@nvidia.com>	net/mlx5e: Fix the presented RQ index in PTP stats PTP-RQ counters title format contains PTP-RQ identifier, which is mistakenly not passed to sprintf(). This leads to unexpected garbage values instead. This patch fixes it. Before applying the patch: ethtool -S eth3 | grep ptp_rq ptp_rq15_packets: 0 ptp_rq8_bytes: 0 ptp_rq6_csum_complete: 0 ptp_rq14_csum_complete_tail: 0 ptp_rq3_csum_complete_tail_slow : 0 ptp_rq9_csum_unnecessary: 0 ptp_rq1_csum_unnecessary_inner: 0 ptp_rq7_csum_none: 0 ptp_rq10_xdp_drop: 0 ptp_rq9_xdp_redirect: 0 ptp_rq13_lro_packets: 0 ptp_rq12_lro_bytes: 0 ptp_rq10_ecn_mark: 0 ptp_rq9_removed_vlan_packets: 0 ptp_rq5_wqe_err: 0 ptp_rq8_mpwqe_filler_cqes: 0 ptp_rq2_mpwqe_filler_strides: 0 ptp_rq5_oversize_pkts_sw_drop: 0 ptp_rq6_buff_alloc_err: 0 ptp_rq15_cqe_compress_blks: 0 ptp_rq2_cqe_compress_pkts: 0 ptp_rq2_cache_reuse: 0 ptp_rq12_cache_full: 0 ptp_rq11_cache_empty: 256 ptp_rq12_cache_busy: 0 ptp_rq11_cache_waive: 0 ptp_rq12_congst_umr: 0 ptp_rq11_arfs_err: 0 ptp_rq9_recover: 0 After applying the patch: ethtool -S eth3 | grep ptp_rq ptp_rq0_packets: 0 ptp_rq0_bytes: 0 ptp_rq0_csum_complete: 0 ptp_rq0_csum_complete_tail: 0 ptp_rq0_csum_complete_tail_slow : 0 ptp_rq0_csum_unnecessary: 0 ptp_rq0_csum_unnecessary_inner: 0 ptp_rq0_csum_none: 0 ptp_rq0_xdp_drop: 0 ptp_rq0_xdp_redirect: 0 ptp_rq0_lro_packets: 0 ptp_rq0_lro_bytes: 0 ptp_rq0_ecn_mark: 0 ptp_rq0_removed_vlan_packets: 0 ptp_rq0_wqe_err: 0 ptp_rq0_mpwqe_filler_cqes: 0 ptp_rq0_mpwqe_filler_strides: 0 ptp_rq0_oversize_pkts_sw_drop: 0 ptp_rq0_buff_alloc_err: 0 ptp_rq0_cqe_compress_blks: 0 ptp_rq0_cqe_compress_pkts: 0 ptp_rq0_cache_reuse: 0 ptp_rq0_cache_full: 0 ptp_rq0_cache_empty: 256 ptp_rq0_cache_busy: 0 ptp_rq0_cache_waive: 0 ptp_rq0_congst_umr: 0 ptp_rq0_arfs_err: 0 ptp_rq0_recover: 0 Fixes: a28359e922c6 ("net/mlx5e: Add PTP-RX statistics") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 86d747a3	06-Jul-2021	Tariq Toukan <tariqt@nvidia.com>	net/mlx5e: Abstract MQPRIO params Abstract the MQPRIO params into a struct. Use a getter for DCB mode num_tcs. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# d443c6f6	02-Jul-2021	Maor Gottlieb <maorg@nvidia.com>	net/mlx5e: Rename traffic type enums Rename traffic type enums as part of the preparation for moving the traffic type logic to a separate file. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 43ec0f41	09-Apr-2021	Maxim Mikityanskiy <maximmi@nvidia.com>	net/mlx5e: Hide all implementation details of mlx5e_rx_res This commit moves all implementation details of struct mlx5e_rx_res under en/rx_res.c. All access to RX resources is now done using methods. Encapsulating RX resources into an object allows for better manageability, because all the implementation details are now in a single place, and external code can use only a limited set of API methods to init/teardown the whole thing, reconfigure RSS and LRO parameters, connect TIRs to flow steering and activate/deactivate TIRs. mlx5e_rx_res is self-contained and doesn't depend on struct mlx5e_priv or include en.h. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 0570c1c9	05-Apr-2021	Maxim Mikityanskiy <maximmi@nvidia.com>	net/mlx5e: Take RQT out of TIR and group RX resources RQT is not part of TIR, as multiple TIRs may point to the same RQT, as it happens with indir_tir and inner_indir_tir. These instances of a TIR don't use the embedded RQT. This commit takes RQT out of TIR, making them independent. The RQTs are placed into struct mlx5e_rx_res, and items in that struct are regrouped by functionality: RSS, channels and PTP. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 3f22d6c7	05-Apr-2021	Maxim Mikityanskiy <maximmi@nvidia.com>	net/mlx5e: Move RX resources to a separate struct This commit moves RQTs and TIRs to a separate struct that is allocated dynamically in profiles that support these RX resources (all profiles, except IPoIB PKey). It also allows to remove rqt_enabled flags, as RQTs are always enabled in profiles that support RX resources. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 678b1ae1	22-Jun-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Fix page allocation failure for ptp-RQ over SF Set the correct pci-device pointer to the ptp-RQ. This allows access to dma_mask and avoids allocation request with wrong pci-device. Fixes: a099da8ffcf6 ("net/mlx5e: Add RQ to PTP channel") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# a759f845	30-Jun-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Consider PTP-RQ when setting RX VLAN stripping Add PTP-RQ to the loop when setting rx-vlan-offload feature via ethtool. On PTP-RQ's creation, set rx-vlan-offload into its parameters. Fixes: a099da8ffcf6 ("net/mlx5e: Add RQ to PTP channel") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# a6ee6f5f	19-Apr-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Fix select queue to consider SKBTX_HW_TSTAMP Steering packets to PTP-SQ should be done only if the SKB has SKBTX_HW_TSTAMP set in the tx_flags. While here, take the function into a header and inline it. Set the whole condition to select the PTP-SQ to unlikely. Fixes: 24c22dd0918b ("net/mlx5e: Add states to PTP channel") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 69cc4185	28-Jan-2021	Maxim Mikityanskiy <maximmi@mellanox.com>	net/mlx5e: Use mlx5e_safe_switch_channels when channels are closed This commit uses new functionality of mlx5e_safe_switch_channels introduced by the previous commit to reduce the amount of repeating similar code all over the driver. It's very common in mlx5e to call mlx5e_safe_switch_channels when the channels are open, but assign parameters and run hardware commands manually when the channels are closed. After the previous commit it's no longer needed to do such manual things every time, so this commit removes unneeded code and relies on the new functionality of mlx5e_safe_switch_channels. Some of the places are refactored and simplified, where more complex flows are used to change configuration on the fly, without recreating the channels (the logic is rewritten in a more robust way, with a reset required by default and a list of exceptions). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 960fbfe2	20-Jan-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Allow coexistence of CQE compression and HW TS PTP Update setting HW time-stamp to allow coexistence with CQE compression. Turn on RX PTP indication and try to reopen the channels. On success, coexistence with CQE compression is enabled. Otherwise, fall-back to turning off CQE compression. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# e5fe4946	15-Feb-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Add PTP Flow Steering support When opening PTP channel with MLX5E_PTP_STATE_RX set, add the corresponding flow steering rules. Capture UDP packets with destination port 319 and L2 packets with ethertype 0x88F7 and steer them into the RQ of the PTP channel. Add API that manages the flow steering rules to be used in the following patches via safe_reopen_channels mechanism. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
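The match criteria in e5fe4946 (UDP destination port 319, or ethertype 0x88F7) can be expressed as a small software classifier. The sketch below only illustrates the matching logic; the driver installs hardware flow steering rules rather than inspecting packets in software, and the `pkt_key` structure, its host-byte-order fields and the function name are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETHERTYPE_IPV4     0x0800
#define ETHERTYPE_IPV6     0x86DD
#define ETHERTYPE_PTP      0x88F7 /* PTP carried directly over Ethernet */
#define IP_PROTO_UDP       17
#define PTP_EVENT_UDP_PORT 319

/* Hypothetical pre-parsed header fields, all in host byte order. */
struct pkt_key {
	uint16_t ethertype;
	uint8_t  ip_proto;  /* valid only for IPv4/IPv6 ethertypes */
	uint16_t udp_dport; /* valid only when ip_proto is UDP */
};

/* Return true when the packet should be steered to the PTP RQ. */
bool steer_to_ptp_rq(const struct pkt_key *k)
{
	if (k->ethertype == ETHERTYPE_PTP)
		return true;

	if (k->ethertype != ETHERTYPE_IPV4 && k->ethertype != ETHERTYPE_IPV6)
		return false;

	return k->ip_proto == IP_PROTO_UDP &&
	       k->udp_dport == PTP_EVENT_UDP_PORT;
}
```

A UDP packet to port 319 and a raw 0x88F7 frame both match, while ordinary traffic such as TCP to port 319 stays on the regular RQs.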
# 3adb60b6	25-Feb-2021	Aya Levin <ayal@nvidia.com>	net:mlx5e: Add PTP-TIR and PTP-RQT Add PTP-TIR and initiate its RQT to allow PTP-RQ to integrate into the safe-reopen flow on configuration change. Add rx_ptp_support flag on a profile and turn it on for ETH driver. With this flag set, create a redirect-RQT for PTP-RQ. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# a28359e9	07-Mar-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Add PTP-RX statistics Like PTP-TX, once the PTP-RX is opened, corresponding statistics appear. Add indication that PTP-RX was ever opened: rx_ptp_opened. If any of the PTP RX or TX were opened, display the PTP channel's statistics. Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# a099da8f	07-Mar-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Add RQ to PTP channel Enhance PTP channel to allow PTP without disabling CQE compression. Add RQ, TIR and PTP_RX_STATE to PTP channel. When this bit is set, PTP channel manages its RQ, and PTP traffic is directed to the PTP-RQ which is not affected by compression. Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 24c22dd0	11-Jan-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Add states to PTP channel Add PTP TX state to PTP channel, which indicates the corresponding SQ is available. Further patches in the set extend PTP channel to include RQ. The PTP channel state will be used for separation and coexistence of RX and TX PTP. Enhance conditions to verify the TX PTP state is set. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# e569cbd7	17-Jan-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Cleanup PTP Reduce scope of mlx5e_ptp_params, move to its c file. Remove unneeded variables from mlx5e_ptp_open and state bitmap from PTP channel. In addition, remove channel index from PTP channel since it is set to a hard coded value, use define instead. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# b0d35de4	07-Mar-2021	Aya Levin <ayal@nvidia.com>	net/mlx5e: Generalize PTP implementation Following patches in the set add support for RX PTP. Rename PTP prefix from %s/port_ptp/ptp/g to include RX PTP too. In addition rename indication (used in statistics context) that PTP-SQ was opened: %s/port_ptp_opened/tx_ptp_opened/g. This will simplify adding indication that PTP-RQ was opened. Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 183532b7	02-Mar-2021	Aya Levin <ayal@nvidia.com>	net/mlx5: Add helper to set time-stamp translator on a queue Translation method on the time-stamp is set by the capabilities. Avoid code duplication by using a helper to set ptp_cyc2time callback on a queue. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 89564920	10-Mar-2021	Tariq Toukan <tariqt@nvidia.com>	net/mlx5e: Restrict usage of mlx5e_priv in params logic functions Do not use generic struct mlx5e_priv as a parameter to param functions, as it is too generic. All calculations of the channel's param should be mainly based on struct mlx5_core_dev and struct mlx5e_params. Additional info can be explicitly passed. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# c276aae8	26-Jan-2021	Roi Dayan <roid@nvidia.com>	net/mlx5: Move mlx5e hw resources into a sub object This is to separate between resources attributes and other attributes we will want to use. Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 432119de	12-Feb-2021	Aya Levin <ayal@nvidia.com>	net/mlx5: Add cyc2time HW translation mode support Device timestamp can be in real time mode (cycles to time translation is offloaded into the Hardware). With real time mode, HW provides timestamp which is already translated into nanoseconds. With this mode, driver adjusts both the HW and timecounter (to keep clock_info_page updated) using callbacks: adjfreq, adjtime and settime. HW clock modifications are done via MTUTC access reg commands. Driver is allowed to modify HW real time clock only if MCAM ptpcyc2realtime_modify capability is set. Add MTUTC set function to be used for configuring the HW real time clock. Modify existing code to support both internal timer (with conversion via timecounter_cyc2time()) and real time (no conversions). Align the signatures of the helpers converting from timestamp to nanoseconds. With that, when allocating a queue assign the corresponding callback with respect to the capability. Adjust 1PPS timestamp calculation flows based on the timestamp mode. Cyc2time offload brings two major advantages: - Improve MTAE (Max Time Absolute Error) for HW TS by up to 160 ns over a 100% loaded CPU. - Faster data-path timestamp to nanoseconds, as translation is lock-less and done in HW. On real time mode, timestamp format is 32 high bits of seconds and 32 low bits of nanoseconds. On some flows, driver shall convert this format into nanoseconds wall-clock with REAL_TIME_TO_NS macro. HW supports a single clock, and it is shared by all functions on a device. In case real time clock is used, it is recommended to use a single GM to all device's functions. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
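The real-time timestamp format described in 432119de (32 high bits of seconds, 32 low bits of nanoseconds) converts to a flat nanosecond count as sketched below. This is a userspace illustration under stated assumptions, not the driver's REAL_TIME_TO_NS macro: the function name `real_time_ts_to_ns` and the constant `SKETCH_NSEC_PER_SEC` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical constant; nanoseconds per second. */
#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Convert a 64-bit real-time timestamp, laid out as described in the
 * commit message (seconds in bits 63..32, nanoseconds in bits 31..0),
 * into nanoseconds. Hypothetical helper for illustration only.
 */
static uint64_t real_time_ts_to_ns(uint64_t ts)
{
	uint64_t sec = ts >> 32;            /* high 32 bits: seconds */
	uint64_t nsec = ts & 0xFFFFFFFFULL; /* low 32 bits: nanoseconds */

	return sec * SKETCH_NSEC_PER_SEC + nsec;
}
```

Because the arithmetic is a multiply and an add on values already supplied by hardware, no lock or timecounter state is needed, which is consistent with the lock-less data path the commit message describes.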
# 7637e499	30-Dec-2019	Tariq Toukan <tariqt@mellanox.com>	net/mlx5e: Enable napi in channel's activation stage The channel's napi is first needed upon activation, not creation. Minimize its enabled scope by moving it from the channel's open/close stage into the activate/deactivate stage. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 214baf22	19-Jan-2021	Maxim Mikityanskiy <maximmi@mellanox.com>	net/mlx5e: Support HTB offload This commit adds support for HTB offload in the mlx5e driver. Performance: NIC: Mellanox ConnectX-6 Dx CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (24 cores with HT) 100 Gbit/s line rate, 500 UDP streams @ ~200 Mbit/s each 48 traffic classes, flower used for steering No shaping (rate limits set to 4 Gbit/s per TC) - checking for max throughput. Baseline: 98.7 Gbps, 8.25 Mpps HTB: 6.7 Gbps, 0.56 Mpps HTB offload: 95.6 Gbps, 8.00 Mpps Limitations: 1. 256 leaf nodes, 3 levels of depth. 2. Granularity for ceil is 1 Mbit/s. Rates are converted to weights, and the bandwidth is split among the siblings according to these weights. Other parameters for classes are not supported. Ethtool statistics support for QoS SQs are also added. The counters are called qos_txN_, where N is the QoS queue number (starting from 0, the numeration is separate from the normal SQs), and is the counter name (the counters are the same as for the normal SQs). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 1880bc4e	01-Dec-2020	Eran Ben Elisha <eranbe@nvidia.com>	net/mlx5e: Add TX port timestamp support Transmitted packet timestamping accuracy can be improved when using timestamp from the port, instead of packet CQE creation timestamp, as it better reflects the actual time of a packet's transmit. TX port timestamping is supported starting from ConnectX6-DX hardware. Although at the original completion, only CQE timestamp can be attached, we are able to get TX port timestamping via an additional completion over a special CQ associated with the SQ (in addition to the regular CQ). Driver to ignore the original packet completion timestamp, and report back the timestamp of the special CQ completion. If the absolute timestamp diff between the two completions is greater than 1 / 128 second, ignore the TX port timestamp as it has a jitter which is too big. No skb will be generated out of the extra completion. Allocate additional CQ per ptpsq, to receive the TX port timestamp. Driver to hold an skb FIFO in order to map between transmitted skb to the two expected completions. When using ptpsq, hold double refcount on the skb, to guarantee it will not get released before both completions arrive. Expose dedicated counters of the ptp additional CQ and connect it to the TX health reporter. This patch improves TX Hardware timestamping offset to be less than 40ns at a 100Gbps line rate, compared to 600ns before. With that, making our HW compliant with G.8273.2 class C, and allow Linux systems to be deployed in the 5G telco edge, where this standard is a must. Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
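The jitter check described in 1880bc4e (ignore the port timestamp when it differs from the CQE timestamp by more than 1/128 second) can be sketched as follows. The function name `pick_tx_timestamp` and the macro names are hypothetical, and falling back to the CQE timestamp when the port timestamp is ignored is an assumption of this sketch; the commit message only states that the port timestamp is ignored.

```c
#include <stdint.h>

/* Hypothetical constants: 1/128 second expressed in nanoseconds. */
#define SKETCH_NSEC_PER_SEC 1000000000ULL
#define SKETCH_MAX_TS_DIFF_NS (SKETCH_NSEC_PER_SEC / 128) /* 7812500 ns */

/*
 * Choose which timestamp to report for a transmitted packet.
 * Both inputs are in nanoseconds. The port timestamp is preferred,
 * unless its absolute distance from the CQE timestamp exceeds the
 * threshold, in which case this sketch falls back to the CQE
 * timestamp (assumed behavior).
 */
static uint64_t pick_tx_timestamp(uint64_t cqe_ns, uint64_t port_ns)
{
	/* Absolute difference without signed overflow. */
	uint64_t diff = cqe_ns > port_ns ? cqe_ns - port_ns : port_ns - cqe_ns;

	if (diff > SKETCH_MAX_TS_DIFF_NS)
		return cqe_ns;
	return port_ns;
}
```

A difference of exactly 1/128 second still selects the port timestamp, since the commit message says the port timestamp is ignored only when the difference is greater than the threshold.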
# 145e5637	01-Dec-2020	Eran Ben Elisha <eranbe@nvidia.com>	net/mlx5e: Add TX PTP port object support Add TX PTP port object support for better TX timestamping accuracy. Currently, driver supports CQE based TX port timestamp. Device also offers TX port timestamp, which has less jitter and better reflects the actual time of a packet's transmit. Define new driver layout called ptpsq, on which driver will create SQs that will support TX port timestamp for their transmitted packets. Driver to identify PTP TX skbs and steer them to these dedicated SQs as part of the select queue ndo. Driver to hold ptpsq per TC and report them at netif_set_real_num_tx_queues(). Add support for all needed functionality in order to xmit and poll completions received via ptpsq. Add ptpsq to the TX reporter recover, diagnose and dump methods. Creation of ptpsqs is disabled by default, and can be enabled via tx_port_ts private flag. This patch steer all timestamp related packets to a ptpsq, but it does not open the port timestamp support for it. The support will be added in the following patch. Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>