Cross Reference: /linux-master/drivers/net/ethernet/mellanox/mlx5/core/en/tc

History log of /linux-master/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
Revision	Date	Author	Comments
# 8e13cd73	21-Nov-2023	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: fix double free of encap_header Cited commit introduced potential double free since encap_header can be destroyed twice in some cases - once by error cleanup sequence in mlx5e_tc_tun_{create\|update}_header_ipv{4\|6}(), once by generic mlx5e_encap_put() that user calls as a result of getting an error from tunnel create\|update. At the same time the point where e->encap_header is assigned can't be delayed because the function can still return non-error code 0 as a result of checking for NUD_VALID flag, which will cause neighbor update to dereference NULL encap_header. Fix the issue by: - Nulling local encap_header variables in mlx5e_tc_tun_{create\|update}_header_ipv{4\|6}() to make kfree(encap_header) call in error cleanup sequence noop after that point. - Assigning reformat_params.data from e->encap_header instead of local variable encap_header that was set to NULL pointer by previous step. Also assign reformat_params.size from e->encap_size for uniformity and in order to make the code less error-prone in the future. Fixes: d589e785baf5 ("net/mlx5e: Allow concurrent creation of encap entries") Reported-by: Dust Li <dust.li@linux.alibaba.com> Reported-by: Cruz Zhao <cruzzhao@linux.alibaba.com> Reported-by: Tianchen Ding <dtcccc@linux.alibaba.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 5d089684	21-Nov-2023	Vlad Buslov <vladbu@nvidia.com>	Revert "net/mlx5e: fix double free of encap_header" This reverts commit 6f9b1a0731662648949a1c0587f6acb3b7f8acf1. This patch is causing a null ptr issue, the proper fix is in the next patch. Fixes: 6f9b1a073166 ("net/mlx5e: fix double free of encap_header") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 66ca8d4d	21-Nov-2023	Vlad Buslov <vladbu@nvidia.com>	Revert "net/mlx5e: fix double free of encap_header in update funcs" This reverts commit 3a4aa3cb83563df942be49d145ee3b7ddf17d6bb. This patch is causing a null ptr issue, the proper fix is in the next patch. Fixes: 3a4aa3cb8356 ("net/mlx5e: fix double free of encap_header in update funcs") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 3a4aa3cb	14-Nov-2023	Gavin Li <gavinl@nvidia.com>	net/mlx5e: fix double free of encap_header in update funcs Follow up to the previous patch to fix the same issue for mlx5e_tc_tun_update_header_ipv4{6} when mlx5_packet_reformat_alloc() fails. When mlx5_packet_reformat_alloc() fails, the encap_header allocated in mlx5e_tc_tun_update_header_ipv4{6} will be released within it. However, e->encap_header is already set to the previously freed encap_header before mlx5_packet_reformat_alloc(). As a result, the later mlx5e_encap_put() will free e->encap_header again, causing a double free issue. mlx5e_encap_put() --> mlx5e_encap_dealloc() --> kfree(e->encap_header) This patch fix it by not setting e->encap_header until mlx5_packet_reformat_alloc() success. Fixes: a54e20b4fcae ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads") Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20231114215846.5902-7-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 6f9b1a07	14-Nov-2023	Dust Li <dust.li@linux.alibaba.com>	net/mlx5e: fix double free of encap_header When mlx5_packet_reformat_alloc() fails, the encap_header allocated in mlx5e_tc_tun_create_header_ipv4{6} will be released within it. However, e->encap_header is already set to the previously freed encap_header before mlx5_packet_reformat_alloc(). As a result, the later mlx5e_encap_put() will free e->encap_header again, causing a double free issue. mlx5e_encap_put() --> mlx5e_encap_dealloc() --> kfree(e->encap_header) This happens when cmd: MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT fail. This patch fix it by not setting e->encap_header until mlx5_packet_reformat_alloc() success. Fixes: d589e785baf5e ("net/mlx5e: Allow concurrent creation of encap entries") Reported-by: Cruz Zhao <cruzzhao@linux.alibaba.com> Reported-by: Tianchen Ding <dtcccc@linux.alibaba.com> Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 197c0002	06-Feb-2023	Roi Dayan <roid@nvidia.com>	net/mlx5e: Use a simpler comparison for uplink rep get_route_and_out_devs() is uses the following condition mlx5e_eswitch_rep() && mlx5e_is_uplink_rep() to check if a given netdev is the uplink rep. Alternatively we can just use the straight forward version mlx5e_eswitch_uplink_rep() that only checks if a given netdev is uplink rep. Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 8ce81fc0	01-Dec-2022	Roi Dayan <roid@nvidia.com>	net/mlx5e: TC, Add peer flow in mpesw mode While at it rename mlx5_lag_mpesw_is_activated() to mlx5_lag_is_mpesw() to be consistent with checking if other lag modes are activated. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 521933cd	20-Dec-2022	Maor Dickman <maord@nvidia.com>	net/mlx5e: Support Geneve and GRE with VF tunnel offload Today VF tunnel offload (tunnel endpoint is on VF) is implemented by indirect table which use rules that match on VXLAN VNI to recirculated to root table, this limit the support for only VXLAN tunnels. This patch change indirect table to use one single match all rule to recirculated to root table which is added when any tunnel decap rule is added with tunnel endpoint is VF. This allow support of Geneve and GRE with this configuration. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# bcb0da7f	05-Aug-2022	Matthias May <matthias.may@westermo.com>	mlx5: do not use RT_TOS for IPv6 flowlabel According to Guillaume Nault RT_TOS should never be used for IPv6. Quote: RT_TOS() is an old macro used to interprete IPv4 TOS as described in the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4 code, although, given the current state of the code, most of the existing calls have no consequence. But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS" field to be interpreted the RFC 1349 way. There's no historical compatibility to worry about. Fixes: ce99f6b97fcd ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels") Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Matthias May <matthias.may@westermo.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# e3fdc71b	25-Apr-2022	Ariel Levkovich <lariel@nvidia.com>	net/mlx5e: TC, fix decap fallback to uplink when int port not supported When resolving the decap route device for a tunnel decap rule, the result may be an OVS internal port device. Prior to adding the support for internal port offload, such case would result in using the uplink as the default decap route device which allowed devices that can't support internal port offload to offload this decap rule. This behavior got broken by adding the internal port offload which will fail in case the device can't support internal port offload. To restore the old behavior, use the uplink device as the decap route as before when internal port offload is not supported. Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 48d67543	10-Jan-2022	Guillaume Nault <gnault@redhat.com>	mlx5: Don't accidentally set RTO_ONLINK before mlx5e_route_lookup_ipv4_get() Mask the ECN bits before calling mlx5e_route_lookup_ipv4_get(). The tunnel key might have the last ECN bit set. This interferes with the route lookup process as ip_route_output_key_hash() interpretes this bit specially (to restrict the route scope). Found by code inspection, compile tested only. Fixes: c7b9038d8af6 ("net/mlx5e: TC preparation refactoring for routing update event") Fixes: 9a941117fb76 ("net/mlx5e: Maximize ip tunnel key usage on the TC offloading path") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 819c319c	26-Oct-2021	Chris Mi <cmi@nvidia.com>	net/mlx5e: Specify out ifindex when looking up decap route There is a use case that the local and remote VTEPs are in the same host. Currently, the out ifindex is not specified when looking up the decap route for offloads. So in this case, a local route is returned and the route dev is lo. Actual tunnel interface can be created with a parameter "dev" [1], which specifies the physical device to use for tunnel endpoint communication. Pass this parameter to driver when looking up decap route for offloads. So that a unicast route will be returned. [1] ip link add name vxlan1 type vxlan id 100 dev enp4s0f0 remote 1.1.1.1 dstport 4789 Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# b16eb3c8	08-Jan-2021	Ariel Levkovich <lariel@nvidia.com>	net/mlx5: Support internal port as decap route device When performing route device lookup for decap action, support the case of ovs internal port as the lookup result. In such case, an internal port struct is mapped and attached to the flow attributes so that the source port matching of the rule will match on the internal port's metadata value. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 100ad4e2	08-Jan-2021	Ariel Levkovich <lariel@nvidia.com>	net/mlx5e: Offload internal port as encap route device When pefroming encap action, a route lookup is performed to find the routing device the packet should be forwarded to after the encapsulation. This is the device that has the local tunnel ip address. This change adds support to offload an encap rule where the route device ends up being an ovs internal port. In such case, the driver will add a HW rule that will encapsulate the packet with the tunnel header and will overwrite the vport metadata in reg_c0 to the internal port metadata value. Finally, the packet will be forwarded to the root table to be processed again with the indication that it came from an internal port. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 016c8946	22-Oct-2021	Jakub Kicinski <kuba@kernel.org>	mlx5: fix build after merge Silent merge conflict between these two: 3d677735d3b7 ("net/mlx5: Lag, move lag files into directory") 14fe2471c628 ("net/mlx5: Lag, change multipath and bonding to be mutually exclusive") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 2f8ec867	26-Sep-2021	Chris Mi <cmi@nvidia.com>	net/mlx5e: Specify out ifindex when looking up encap route There is a use case that the local and remote VTEPs are in the same host. Currently, the out ifindex is not specified when looking up the encap route for offloads. So in this case, a local route is returned and the route dev is lo. Actual tunnel interface can be created with a parameter "dev" [1], which specifies the physical device to use for tunnel endpoint communication. Pass this parameter to driver when looking up encap route for offloads. So that a unicast route will be returned. [1] ip link add name vxlan1 type vxlan id 100 dev enp4s0f0 remote 1.1.1.1 dstport 4789 Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 14fe2471	07-Oct-2021	Maor Dickman <maord@nvidia.com>	net/mlx5: Lag, change multipath and bonding to be mutually exclusive Both multipath and bonding events are changing the HW LAG state independently. Handling one of the features events while the other is already enabled can cause unwanted behavior, for example handling bonding event while multipath enabled will disable the lag and cause multipath to stop working. Fix it by ignoring bonding event while in multipath and ignoring FIB events while in bonding mode. Fixes: 544fe7c2e654 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 39c538d6	29-Jul-2021	Cai Huoqing <caihuoqing@baidu.com>	net/mlx5: Fix typo in comments Fix typo: vectores ==> vectors realeased ==> released erros ==> errors namepsace ==> namespace trafic ==> traffic proccessed ==> processed retore ==> restore Currenlty ==> Currently crated ==> created chane ==> change cannnot ==> cannot usuallly ==> usually failes ==> fails importent ==> important reenabled ==> re-enabled alocation ==> allocation recived ==> received tanslation ==> translation Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# c623c95a	01-Aug-2021	Roi Dayan <roid@nvidia.com>	net/mlx5e: Avoid creating tunnel headers for local route It could be local and remote are on the same machine and the route result will be a local route which will result in creating encap id with src/dst mac address of 0. Fixes: a54e20b4fcae ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 3f3f05ab	08-Mar-2021	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: Added new parameters to reformat context Adding new reformat context type (INSERT_HEADER) requires adding two new parameters to reformat context - reformat_param_0 and reformat_param_1. As defined by HW spec, these parameters have different meaning for different reformat context type. The first parameter (reformat_param_0) is not new to HW spec, but it wasn't used by any of the supported reformats. The second parameter (reformat_param_1) is new to the HW spec - it was added to allow supporting INSERT_HEADER. For NSERT_HEADER, reformat_param_0 indicates the header used to reference the location of the inserted header, and reformat_param_1 indicates the offset of the inserted header from the reference point defined by reformat_param_0. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 1e74152e	02-Mar-2021	Roi Dayan <roid@nvidia.com>	net/mlx5e: Check correct ip_version in decapsulation route resolution flow_attr->ip_version has the matching that should be done inner/outer. When working with chains, decapsulation is done on chain0 and next chain match on outer header which is the original inner which could be ipv4. So in tunnel route resolution we cannot use that to know which ip version we are at so save tun_ip_version when parsing the tunnel match and use that. Fixes: a508728a4c8b ("net/mlx5e: VF tunnel RX traffic offloading") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# c7b9038d	25-Jan-2021	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: TC preparation refactoring for routing update event Following patch in series implement routing update event which requires ability to modify rule match_to_reg modify header actions dynamically during rule lifetime. In order to accommodate such behavior, refactor and extend TC infrastructure in following ways: - Modify mod_hdr infrastructure to preserve its parse attribute for whole rule lifetime, instead of deallocating it after rule creation. - Extend match_to_reg infrastructure with new function mlx5e_tc_match_to_reg_set_and_get_id() that returns mod_hdr action id that can be used afterwards to update the action, and mlx5e_tc_match_to_reg_mod_hdr_change() that can modify existing actions by its id. - Extend tun API with new functions mlx5e_tc_tun_update_header_ipv{4\|6}() that are used to updated existing encap entry tunnel header. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 2221d954	19-Sep-2020	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Refactor neigh update infrastructure Following patches in series implements route update which can cause encap entries to migrate between routing devices. Consecutively, their parent nhe's need to be also transferable between devices instead of having neigh device as a part of their immutable key. Move neigh device from struct mlx5_neigh to struct mlx5e_neigh_hash_entry and check that nhe and neigh devices are the same in workqueue neigh update handler. Save neigh net_device that can change dynamically in dedicated nhe->dev field. With FIB event handler that is implemented in following patches changing nhe->dev, NETEVENT_DELAY_PROBE_TIME_UPDATE handler can concurrently access the nhe entry when traversing neigh list under rcu read lock. Processing stale values in that handler doesn't change the handler logic, so just wrap all accesses to the dev pointer in {WRITE\|READ}_ONCE() helpers. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 0d9f9647	24-Jan-2021	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Extract tc tunnel encap/decap code to dedicated file Following patches in series extend the extracted code with routing infrastructure. To improve code modularity created a dedicated tc_tun_encap.c source file and move encap/decap related code to the new file. Export code that is used by both regular TC code and encap/decap code into tc_priv.h (new header intended to be used only by TC module). Rename some exported functions by adding "mlx5e_" prefix to their names. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# a508728a	25-Jan-2021	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: VF tunnel RX traffic offloading When tunnel endpoint is on VF the encapsulated RX traffic is exposed on the representor of the VF without any further processing of rules installed on the VF. Detect such case by checking if the device returned by route lookup in decap rule handling code is a mlx5 VF and handle it with new redirection tables API. Example TC rules for VF tunnel traffic: 1. Rule that encapsulates the tunneled flow and redirects packets from source VF rep to tunnel device: $ tc -s filter show dev enp8s0f0_1 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac 0a:40:bd:30:89:99 src_mac ca:2e:a7:3f:f5:0f eth_type ipv4 ip_tos 0/0x3 ip_flags nofrag in_hw in_hw_count 1 action order 1: tunnel_key set src_ip 7.7.7.5 dst_ip 7.7.7.1 key_id 98 dst_port 4789 nocsum ttl 64 pipe index 1 ref 1 bind 1 installed 411 sec used 411 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 no_percpu used_hw_stats delayed action order 2: mirred (Egress Redirect to device vxlan_sys_4789) stolen index 1 ref 1 bind 1 installed 411 sec used 0 sec Action statistics: Sent 5615833 bytes 4028 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 5615833 bytes 4028 pkt backlog 0b 0p requeues 0 cookie bb406d45d343bf7ade9690ae80c7cba4 no_percpu used_hw_stats delayed 2. Rule that redirects from tunnel device to UL rep: $ tc -s filter show dev vxlan_sys_4789 ingress filter protocol ip pref 4 flower chain 0 filter protocol ip pref 4 flower chain 0 handle 0x1 dst_mac ca:2e:a7:3f:f5:0f src_mac 0a:40:bd:30:89:99 eth_type ipv4 enc_dst_ip 7.7.7.5 enc_src_ip 7.7.7.1 enc_key_id 98 enc_dst_port 4789 enc_tos 0 ip_flags nofrag in_hw in_hw_count 1 action order 1: tunnel_key unset pipe index 2 ref 1 bind 1 installed 434 sec used 434 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats delayed action order 2: mirred (Egress Redirect to device enp8s0f0_1) stolen index 4 ref 1 bind 1 installed 434 sec used 0 sec Action statistics: Sent 129936 bytes 1082 pkt (dropped 0, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 129936 bytes 1082 pkt backlog 0b 0p requeues 0 cookie ac17cf398c4c69e4a5b2f7aabd1b88ff no_percpu used_hw_stats delayed Co-developed-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 4ad9116c	25-Jan-2021	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Remove redundant match on tunnel destination mac Remove hardcoded match on tunnel destination MAC address. Such match is no longer required and would be wrong for stacked devices topology where encapsulation destination MAC address will be the address of tunnel VF that can change dynamically on route change (implemented in following patches in the series). Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 6717986e	22-Jan-2021	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Refactor tun routing helpers Refactor tun routing helpers to use dedicated struct mlx5e_tc_tun_route_attr instead of multiple output arguments. This simplifies the callers (no need to keep track of bunch of output param pointers) and allows to unify struct release code in new mlx5e_tc_tun_route_attr_cleanup() helper instead of requiring callers to manually release some of the output parameters that require it. Simplify code by unifying error handling at the end of the function and rearranging code. Remove redundant empty line. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 78c906e4	31-Aug-2020	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Protect encap route dev from concurrent release In functions mlx5e_route_lookup_ipv{4\|6}() route_dev can be arbitrary net device and not necessary mlx5 eswitch port representor. As such, in order to ensure that route_dev is not destroyed concurrent the code needs either explicitly take reference to the device before releasing reference to rtable instance or ensure that caller holds rtnl lock. First approach is chosen as a fix since rtnl lock dependency was intentionally removed from mlx5 TC layer. To prevent unprotected usage of route_dev in encap code take a reference to the device before releasing rt. Don't save direct pointer to the device in mlx5_encap_entry structure and use ifindex instead. Modify users of route_dev pointer to properly obtain the net device instance from its ifindex. Fixes: 61086f391044 ("net/mlx5e: Protect encap hash table with mutex") Fixes: 6707f74be862 ("net/mlx5e: Update hw flows when encap source mac changed") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# fca53304	18-May-2020	Eli Britstein <elibr@mellanox.com>	net/mlx5e: Optimize performance for IPv4/IPv6 ethertype The HW is optimized for IPv4/IPv6. For such cases, pending capability, avoid matching on ethertype, and use ip_version field instead. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 4a5d5d73	11-May-2020	Eli Britstein <elibr@mellanox.com>	net/mlx5e: Helper function to set ethertype Set ethertype match in a helper function as a pre-step towards optimizing it. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# f828ca6a	17-Nov-2019	Eli Cohen <eli@mellanox.com>	net/mlx5e: Add support for hw encapsulation of MPLS over UDP MPLS over UDP is supported by adding a rule on a representor net device which does tunnel_key set, push mpls and forward to a baredup device. At the hardware level we use a packet_reformat_context object to do the encapsulation of the packet. The resulting packet looks as follows (left side transmitted first): outer L2 \| outer IP \| UDP \| MPLS \| inner L3 and data \| Example usage: tc filter add dev $rep0 protocol ip prio 1 root flower skip_sw \ action tunnel_key set src_ip 8.8.8.21 dst_ip 8.8.8.24 id 555 \ dst_port 6635 tos 4 ttl 6 csum action mpls push protocol 0x8847 \ label 555 tc 3 action mirred egress redirect dev bareudp0 This is how the filter is shown with tc filter show: tc filter show dev enp59s0f0_0 ingress filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 skip_sw in_hw in_hw_count 1 action order 1: tunnel_key set src_ip 8.8.8.21 dst_ip 8.8.8.24 key_id 555 dst_port 6635 csum tos 0x4 ttl 6 pipe index 1 ref 1 bind 1 action order 2: mpls push protocol mpls_uc label 555 tc 3 ttl 255 pipe index 1 ref 1 bind 1 action order 3: mirred (Egress Redirect to device bareudp0) stolen index 1 ref 1 bind 1 Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 549c243e	12-May-2020	Vlad Buslov <vladbu@mellanox.com>	net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c As a preparation for introducing new kconfig option that controls compilation of all TC offloads code in mlx5, extract neigh-specific code from en_rep.c to standalone file. This allows easily compiling out the code by only including new source in make file when corresponding kconfig is enabled instead of adding multiple ifdef blocks to en_rep. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 768c3667	12-May-2020	Vlad Buslov <vladbu@mellanox.com>	net/mlx5e: Extract TC-specific code from en_rep.c to rep/tc.c As a preparation for introducing new kconfig option that controls compilation of all TC offloads code in mlx5, extract TC-specific code from en_rep.c to standalone file. This allows easily compiling out the code by only including new source in make file when corresponding kconfig is enabled instead of adding multiple ifdef blocks to en_rep. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 2639324a	15-May-2020	Tang Bin <tangbin@cmss.chinamobile.com>	net/mlx5e: Use IS_ERR() to check and simplify code Use IS_ERR() and PTR_ERR() instead of PTR_ERR_OR_ZERO() to simplify code, avoid redundant judgements. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 87b51810	12-Mar-2020	Eli Cohen <eli@mellanox.com>	net/mlx5: Avoid forwarding to other eswitch uplink Do not allow forwarding of encapsulated traffic received from one eswtich's uplink to another eswtich's uplink. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# ea4cd837	15-Feb-2020	Paul Blakey <paulb@mellanox.com>	net/mlx5e: Move tc tunnel parsing logic with the rest at tc_tun module Currently, tunnel parsing is split between en_tc and tc_tun. The next patch will replace the tunnel fields matching with a register match, and will not need this parsing. Move the tunnel parsing logic to tc_tun as a pre-step for skipping it in the next patch. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 6c8991f4	04-Dec-2019	Sabrina Dubroca <sd@queasysnail.net>	net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
# 5f9fc332	27-Nov-2019	YueHaibing <yuehaibing@huawei.com>	net/mlx5e: Fix build error without IPV6 If IPV6 is not set and CONFIG_MLX5_ESWITCH is y, building fails: drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:322:5: error: redefinition of mlx5e_tc_tun_create_header_ipv6 int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv priv, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:7:0: drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:67:1: note: previous definition of mlx5e_tc_tun_create_header_ipv6 was here mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv priv, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use #ifdef to guard this, also move mlx5e_route_lookup_ipv6 to cleanup unused warning. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: e689e998e102 ("net/mlx5e: TC, Stub out ipv6 tun create header function") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 90ac2458	31-Oct-2019	Eli Cohen <eli@mellanox.com>	net/mlx5e: Remove redundant pointer check When code reaches the "out" label, n is guaranteed to be valid so we can unconditionally call neigh_release. Also change the label to release_neigh to better reflect the fact that we unconditionally free the neighbour and also match other labels convention. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# e689e998	01-Nov-2019	Saeed Mahameed <saeedm@mellanox.com>	net/mlx5e: TC, Stub out ipv6 tun create header function Improve mlx5e_route_lookup_ipv6 function structure by avoiding #ifdef then return -EOPNOTSUPP in the middle of the function code. To do so, we stub out mlx5e_tc_tun_create_header_ipv6 which is the only caller of this helper function to avoid calling it altogether when ipv6 is compiled out, which should also cleanup some compiler warnings of unused variables. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>