History log of /linux-master/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
Revision	Date	Author	Comments
# ed7a8fe7	30-Mar-2022	Mark Bloch <mbloch@nvidia.com>	net/mlx5e: rep, store send to vport rules per peer Each representor, for each send queue, is holding a send_to_vport rule for the peer eswitch. In order to support more than one peer, and to map between the peer rules and peer eswitches, refactor representor to hold both the peer rules and pointer to the peer eswitches. This enables mlx5 to store send_to_vport rules per peer, where each peer has a dedicated index via mlx5_get_dev_index(). Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# cf14af14	21-Mar-2023	Maher Sanalla <msanalla@nvidia.com>	net/mlx5e: Add vnic devlink health reporter to representors Create a new devlink health reporter for representor interface, which reports the values of representor vnic diagnostic counters when diagnosed. This patch will allow admins to monitor VF diagnostic counters through the representor-interface vnic reporter. Example of usage: $ devlink health diagnose pci/0000:08:00.0/65537 reporter vnic vNIC env counters: total_error_queues: 0 send_queue_priority_update_flow: 0 comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0 invalid_command: 0 quota_exceeded_command: 0 nic_receive_steering_discard: 0 Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 73af3711	30-Nov-2022	Roi Dayan <roid@nvidia.com>	net/mlx5: Lag, set different uplink vport metadata in multiport eswitch mode In a follow-up commit multiport eswitch mode will use a shared fdb. In shared fdb there is a single eswitch fdb and traffic could come from any port. To distinguish between the ports, set a different metadata per uplink port. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# d13674b1	12-Feb-2023	Oz Shlomo <ozsh@nvidia.com>	net/mlx5e: TC, map tc action cookie to a hw counter Currently a hardware counter is associated with a flow cookie. This does not apply to flows using branching action which are required to return per action stats. A single counter may apply to multiple actions. Scan the flow actions in reverse (from the last to the first action) while caching the last counter. Associate all the flow attribute tc action cookies with the current cached counter. Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# 430e2d5e	18-Jul-2022	Roi Dayan <roid@nvidia.com>	net/mlx5: E-Switch, Move send to vport meta rule creation Move the creation of the rules from offloads fdb table init to per rep vport init. This way the driver will create the send to vport meta rule on any representor, e.g. SF representors. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 74e6b2a8	08-Jun-2021	Jianbo Liu <jianbol@nvidia.com>	net/mlx5e: Prepare for flow meter offload if hardware supports it If flow meter aso object is supported, set the allocated range, and initialize aso wqe. The allocated range is indicated by log_meter_aso_granularity in HW capabilities, and currently is 6. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# d1a3138f	24-Jan-2022	Paul Blakey <paulb@nvidia.com>	net/mlx5e: TC, Move flow hashtable to be per rep To allow shared tc block offload between two or more reps of the same eswitch, move the tc flow hashtable to be per rep, instead of per eswitch. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# c63741b4	06-Jan-2022	Maor Dickman <maord@nvidia.com>	net/mlx5e: Fix MPLSoUDP encap to use MPLS action information Currently the MPLSoUDP encap builds the MPLS header using encap action information (tunnel id, ttl and tos) instead of the MPLS action information (label, ttl, tc and bos) which is wrong. Fix by storing the MPLS action information during the flow action parse and later using it to create the encap MPLS header. Fixes: f828ca6a2fb6 ("net/mlx5e: Add support for hw encapsulation of MPLS over UDP") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 4f4edcc2	29-Apr-2021	Ariel Levkovich <lariel@nvidia.com>	net/mlx5: E-Switch, Add ovs internal port mapping to metadata support Adding infrastructure to map ovs internal port device to vport match metadata to support offload of rules with internal port as the filter device or as the destination device. The infrastructure allows adding and removing internal port device to an eswitch database and getting a unique vport metadata value to be placed and match on in reg_c0 when offloading rules that are coming from or going to an internal port. The new int port metadata can be written to the source port register in HW to indicate that current source port of the packet is the internal port and not one of the actual HW vports (uplink or VF). Using this method, it is possible to offload TC rules with an OVS internal port as their destination port (overwriting the src vport register) or as the filter port (matching on the value of the src vport register and making sure it matches to the internal port's value). There is also a need to handle a miss case where the packet's src port value was changed in HW to an internal port but a following rule which matches on this new src port value wasn't found in HW. In such case, the packet will be forwarded to the driver with metadata which allows driver to restore the info of the internal port's netdevice. Once this info is restored, the uplink driver can forward the packet to the relevant netdevice in SW. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# f0da4daa	16-Aug-2021	Chris Mi <cmi@nvidia.com>	net/mlx5e: Refactor ct to use post action infrastructure Move post action table management to common library providing add/del/get API. Refactor the ct action offload to use the common API. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 0027d70c	18-Aug-2021	Chris Mi <cmi@nvidia.com>	net/mlx5e: Move esw/sample to en/tc/sample Module sample belongs to en/tc instead of esw. Move it and rename accordingly. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 39c538d6	29-Jul-2021	Cai Huoqing <caihuoqing@baidu.com>	net/mlx5: Fix typo in comments Fix typo: vectores ==> vectors realeased ==> released erros ==> errors namepsace ==> namespace trafic ==> traffic proccessed ==> processed retore ==> restore Currenlty ==> Currently crated ==> created chane ==> change cannnot ==> cannot usuallly ==> usually failes ==> fails importent ==> important reenabled ==> re-enabled alocation ==> allocation recived ==> received tanslation ==> translation Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 898b0786	03-Aug-2021	Mark Bloch <mbloch@nvidia.com>	net/mlx5: Add send to vport rules on paired device When two mlx5 devices are paired in switchdev mode, always offload the send-to-vport rule to the peer E-Switch. This allows to abstract the logic when this is really necessary (single FDB) and combine the logic of both cases into one. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 07810152	21-Apr-2021	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers Change the helper functions to accept a constant pointer to struct net_device. This is necessary for following patches in series that pass mlx5e_eswitch_rep() as a callback to kernel bridge infrastructure code. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 2a9ab10a	18-Sep-2020	Chris Mi <cmi@nvidia.com>	net/mlx5e: TC, Add sampler termination table API Sampled packets are sent to software using termination tables. There is only one rule in that table that is to forward sampled packets to the e-switch management vport. Create a sampler termination table and rule for each eswitch. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# ee526030	16-Sep-2020	Roi Dayan <roid@nvidia.com>	net/mlx5e: Add offload stats ndos to nic netdev ops We will re-use the native NIC port net device instance for the Uplink representor, hence same ndos must be used. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 8914add2	25-Jan-2021	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Handle FIB events to update tunnel endpoint device Process FIB route update events to dynamically update the stack device rules when tunnel routing changes. Use rtnl lock to prevent FIB event handler from running concurrently with neigh update and neigh stats workqueue tasks. Use encap_tbl_lock mutex to synchronize with TC rule update path that doesn't use rtnl lock. FIB event workflow for encap flows: - Unoffload all flows attached to route encaps from slow or fast path depending on encap destination endpoint neigh state. - Update encap IP header according to new route dev. - Update flows mod_hdr action that is responsible for overwriting reg_c0 source port bits to source port of new underlying VF of new route dev. This step requires changing flow create/delete code to save flow parse attribute mod_hdr_acts structure for whole flow lifetime instead of deallocating it after flow creation. Refactor mod_hdr code to allow saving id of individual mod_hdr actions and updating them with dedicated helper. - Offload all flows to either slow or fast path depending on encap destination endpoint neigh state. FIB event workflow for decap flows: - Unoffload all route flows from hardware. When last route flow is deleted all indirect table rules for the route dev will also be deleted. - Update flow attr decap_vport and destination MAC according to underlying VF of new route dev. - Offload all route flows back to hardware creating new indirect table rules according to updated flow attribute data. Extract some neigh update code to helper functions to be used by both neigh update and route update infrastructure. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 2221d954	19-Sep-2020	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Refactor neigh update infrastructure Following patches in series implement route update which can cause encap entries to migrate between routing devices. Consequently, their parent nhe's need to be also transferable between devices instead of having neigh device as a part of their immutable key. Move neigh device from struct mlx5_neigh to struct mlx5e_neigh_hash_entry and check that nhe and neigh devices are the same in workqueue neigh update handler. Save neigh net_device that can change dynamically in dedicated nhe->dev field. With FIB event handler that is implemented in following patches changing nhe->dev, NETEVENT_DELAY_PROBE_TIME_UPDATE handler can concurrently access the nhe entry when traversing neigh list under rcu read lock. Processing stale values in that handler doesn't change the handler logic, so just wrap all accesses to the dev pointer in {WRITE|READ}_ONCE() helpers. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 777bb800	21-Sep-2020	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Create route entry infrastructure Implement dedicated route entry infrastructure to be used in following patch by route update event. Both encap (indirectly through their corresponding encap entries) and decap (directly) flows are attached to routing entry. Since route update also requires updating encap (route device MAC address is a source MAC address of tunnel encapsulation), same encap_tbl_lock mutex is used for synchronization. The new infrastructure looks similar to existing infrastructures for shared encap, mod_hdr and hairpin entries: - Per-eswitch hash table is used for quick entry lookup. - Flows are attached to per-entry linked list and hold reference to entry during their lifetime. - Atomic reference counting and rcu mechanisms are used as synchronization primitives for concurrent access. The infrastructure also enables connection tracking on stacked devices topology by attaching CT chain 0 flow on tunneling dev to decap route entry. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 912cebf4	04-Oct-2020	Leon Romanovsky <leon@kernel.org>	net/mlx5e: Connect ethernet part to auxiliary bus Reuse auxiliary bus to perform device management of the ethernet part of the mlx5 driver. Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
# 78c906e4	31-Aug-2020	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Protect encap route dev from concurrent release In functions mlx5e_route_lookup_ipv{4|6}() route_dev can be an arbitrary net device and not necessarily an mlx5 eswitch port representor. As such, in order to ensure that route_dev is not destroyed concurrently, the code needs to either explicitly take a reference to the device before releasing the reference to the rtable instance or ensure that the caller holds rtnl lock. First approach is chosen as a fix since rtnl lock dependency was intentionally removed from mlx5 TC layer. To prevent unprotected usage of route_dev in encap code take a reference to the device before releasing rt. Don't save direct pointer to the device in mlx5_encap_entry structure and use ifindex instead. Modify users of route_dev pointer to properly obtain the net device instance from its ifindex. Fixes: 61086f391044 ("net/mlx5e: Protect encap hash table with mutex") Fixes: 6707f74be862 ("net/mlx5e: Update hw flows when encap source mac changed") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 1253935a	20-Sep-2020	Vlad Buslov <vladbu@nvidia.com>	net/mlx5e: Fix race condition on nhe->n pointer in neigh update Current neigh update event handler implementation takes reference to neighbour structure, assigns it to nhe->n, tries to schedule workqueue task and releases the reference if task was already enqueued. This results potentially overwriting existing nhe->n pointer with another neighbour instance, which causes double release of the instance (once in neigh update handler that failed to enqueue to workqueue and another one in neigh update workqueue task that processes updated nhe->n pointer instead of original one): [ 3376.512806] ------------[ cut here ]------------ [ 3376.513534] refcount_t: underflow; use-after-free. [ 3376.521213] Modules linked in: act_skbedit act_mirred act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib mlx5_core mlxfw pci_hyperv_intf ptp pps_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp rpcrdma rdma_ucm ib_umad ib_ipoib ib_iser rdma_cm ib_cm iw_cm rfkill ib_uverbs ib_core sunrpc kvm_intel kvm iTCO_wdt iTCO_vendor_support virtio_net irqbypass net_failover crc32_pclmul lpc_ich i2c_i801 failover pcspkr i2c_smbus mfd_core ghash_clmulni_intel sch_fq_codel drm i2c_core ip_tables crc32c_intel serio_raw [last unloaded: mlxfw] [ 3376.529468] CPU: 8 PID: 22756 Comm: kworker/u20:5 Not tainted 5.9.0-rc5+ #6 [ 3376.530399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 3376.531975] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [ 3376.532820] RIP: 0010:refcount_warn_saturate+0xd8/0xe0 [ 3376.533589] Code: ff 48 c7 c7 e0 b8 27 82 c6 05 0b b6 09 01 01 e8 94 93 c1 ff 0f 0b c3 48 c7 c7 88 b8 27 82 c6 05 f7 b5 09 01 01 e8 7e 93 c1 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13 [ 3376.536017] RSP: 0018:ffffc90002a97e30 EFLAGS: 00010286 [ 3376.536793] RAX: 0000000000000000 RBX: ffff8882de30d648 RCX: 0000000000000000 [ 3376.537718] RDX: ffff8882f5c28f20 RSI: ffff8882f5c18e40 RDI: ffff8882f5c18e40 [ 3376.538654] RBP: ffff8882cdf56c00 R08: 000000000000c580 R09: 0000000000001a4d [ 3376.539582] R10: 0000000000000731 R11: ffffc90002a97ccd R12: 0000000000000000 [ 3376.540519] R13: ffff8882de30d600 R14: ffff8882de30d640 R15: ffff88821e000900 [ 3376.541444] FS: 0000000000000000(0000) GS:ffff8882f5c00000(0000) knlGS:0000000000000000 [ 3376.542732] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3376.543545] CR2: 0000556e5504b248 CR3: 00000002c6f10005 CR4: 0000000000770ee0 [ 3376.544483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3376.545419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3376.546344] PKRU: 55555554 [ 3376.546911] Call Trace: [ 3376.547479] mlx5e_rep_neigh_update.cold+0x33/0xe2 [mlx5_core] [ 3376.548299] process_one_work+0x1d8/0x390 [ 3376.548977] worker_thread+0x4d/0x3e0 [ 3376.549631] ? rescuer_thread+0x3e0/0x3e0 [ 3376.550295] kthread+0x118/0x130 [ 3376.550914] ? kthread_create_worker_on_cpu+0x70/0x70 [ 3376.551675] ret_from_fork+0x1f/0x30 [ 3376.552312] ---[ end trace d84e8f46d2a77eec ]--- Fix the bug by moving work_struct to dedicated dynamically-allocated structure. This enabled every event handler to work on its own private neighbour pointer and removes the need for handling the case when task is already enqueued. Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# c7eddc60	31-Aug-2020	Parav Pandit <parav@nvidia.com>	net/mlx5: E-switch, Move devlink eswitch ports closer to eswitch Currently devlink eswitch ports are registered and unregistered by the representor layer. However it is better to register them at eswitch layer so that in future user initiated command port add and delete commands can also register/unregister devlink ports without depending on representor layer. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 5adf4c475	30-Apr-2020	Tariq Toukan <tariqt@mellanox.com>	net/mlx5e: RX, Re-work initializaiton of RX function pointers Instead of exposing the RQ datapath handlers (from en_rx.c) so that they are set in the control path (in en_main.c), wrap this logic in a single function in en_rx.c and expose it alone. Every profile will now have a pointer to the new mlx5e_rx_handlers structure, instead of directly pointing to the previously-exposed RQ handlers. This significantly improves locality and modularity of the driver, and allows many functions in en_rx.c to become static. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 9eabd188	28-May-2020	Pablo Neira Ayuso <pablo@netfilter.org>	mlx5: update indirect block support Register ndo callback via flow_indr_dev_register() and flow_indr_dev_unregister(). No need for mlx5e_rep_indr_clean_block_privs() since flow_block_cb_free() already releases the internal mapping via ->release callback, which in this case is mlx5e_rep_indr_tc_block_unbind(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
# 88e96e53	02-Mar-2020	Vu Pham <vuhuong@mellanox.com>	net/mlx5e: Slave representors sharing unique metadata for match Bonded slave representors' vports must share a unique metadata for match. On enslaving event of slave representor to lag device, allocate new unique "bond_metadata" for match if this is the first slave. The subsequent enslaved representors will share the same unique "bond_metadata". On unslaving event of slave representor, reset the slave representor's vport to use its own default metadata. Replace ingress acl and rx rules of the slave representors' vports using new vport->bond_metadata. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# d97555e1	28-Feb-2020	Vu Pham <vuhuong@mellanox.com>	net/mlx5e: Add bond_metadata and its slave entries Adding bond_metadata and its slave entries to represent a lag device and its slaves VF representors. Bond_metadata structure includes a unique metadata shared by slaves VF representors, and a list of slaves representors slave entries. On enslaving event, create a bond_metadata structure representing the upper lag device of this slave representor if it has not been created yet. Create and add entry for the slave representor to the slaves list. On unslaving event, free the slave entry of the slave representor. On the last unslave event, free the bond_metadata structure and its resources. Introduce APIs to create and remove bond_metadata and its resources, enslave and unslave VF representor slave entries. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 7e51891a	21-Jun-2019	Or Gerlitz <ogerlitz@mellanox.com>	net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule Register a notifier block to handle netdev events for bond device of non-uplink representors to support eswitch vports bonding. When a non-uplink representor is a lower dev (slave) of bond and becomes active, adding egress acl forward-to-vport rule of all slave netdevs (active + standby) to forward to this representor's vport. Use change lower netdev event to do this. Use change upper event to detect slave representor unslaved from lag device to delete its vport egress acl forward rule if any. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Vu Pham <vuhuong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 32134847	23-Apr-2020	Maor Dickman <maord@mellanox.com>	net/mlx5e: Fix allowed tc redirect merged eswitch offload cases After changing the parent_id to be the same for both NICs of same The cited commit wrongly allows offload of tc redirect flows from VF to uplink and vice versa when devices are on different eswitch, these cases aren't supported by HW. Disallow the above offloads when devices are on different eswitch and VF LAG is not configured. Fixes: f6dc1264f1c0 ("net/mlx5e: Disallow tc redirect offload cases we don't support") Signed-off-by: Maor Dickman <maord@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 14e6b038	03-Feb-2020	Eli Cohen <eli@mellanox.com>	net/mlx5e: Add support for hw decapsulation of MPLS over UDP MPLS over UDP is supported in hardware by using a packet reformat object with reformat type equal L3_TUNNEL_TO_L2 which both decapsulates the outer L3, L4 and MPLS headers, and allows for setting the L2 headers of the resulting decapsulated packet. For the hardware to operate correctly, the configuration of the firmware must have FLEX_PARSER_PROFILE_ENABLE = 1. Example tc rule: tc filter add dev bareudp0 protocol all prio 1 root flower enc_dst_port \ 6635 enc_src_ip 8.8.8.23 action mpls pop protocol ip pipe \ action pedit ex munge eth dst set 00:11:22:33:44:21 pipe action \ mirred egress redirect dev enp59s0f0_0 We use pedit to set the correct destination MAC. For MPLS over UDP decapsulation to take place, the driver logic requires the following: 1. flower filter added on bareudp device. 2. action mpls pop 3. zero or more pedit munge actions 4. one redirect action Current implementation supports only IPv4 and no VLAN. tc filter show output looks like this: filter protocol all pref 1 flower chain 0 filter protocol all pref 1 flower chain 0 handle 0x1 enc_src_ip 8.8.8.24 enc_dst_port 6635 in_hw in_hw_count 1 action order 1: mpls pop protocol ip pipe index 2 ref 1 bind 1 action order 2: pedit action pipe keys 2 index 1 ref 1 bind 1 key #0 at eth+0: val 00112233 mask 00000000 key #1 at eth+4: val 44210000 mask 0000ffff action order 3: mirred (Egress Redirect to device enp59s0f0_0) stolen index 2 ref 1 bind 1 Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 549c243e	12-May-2020	Vlad Buslov <vladbu@mellanox.com>	net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c As a preparation for introducing new kconfig option that controls compilation of all TC offloads code in mlx5, extract neigh-specific code from en_rep.c to standalone file. This allows easily compiling out the code by only including new source in make file when corresponding kconfig is enabled instead of adding multiple ifdef blocks to en_rep. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 768c3667	12-May-2020	Vlad Buslov <vladbu@mellanox.com>	net/mlx5e: Extract TC-specific code from en_rep.c to rep/tc.c As a preparation for introducing new kconfig option that controls compilation of all TC offloads code in mlx5, extract TC-specific code from en_rep.c to standalone file. This allows easily compiling out the code by only including new source in make file when corresponding kconfig is enabled instead of adding multiple ifdef blocks to en_rep. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 4c3844d9	11-Mar-2020	Paul Blakey <paulb@mellanox.com>	net/mlx5e: CT: Introduce connection tracking Add support for offloading tc ct action and ct matches. We translate the tc filter with CT action to the following HW model: pre_ct (tc chain, original match) -> CT (nat or no nat, tuple + zone match) -> post_ct (fte_id match). pre_ct actions: set chain miss mapping, set fte_id, set zone, set tunnel_id, do decap. CT actions: set mark, set label, set established, do nat (if needed). post_ct actions: original filter actions. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 20f7b37f	15-May-2019	Saeed Mahameed <saeedm@mellanox.com>	net/mlx5e: Introduce root ft concept for representors netdevs Uplink representor traffic will be redirected to an empty root ft rather than directly to a direct tir or ttc table, this root ft will be empty and will be used as a link for auto-chaining with ttc table or ethtool tables in downstream patches. On load, fs core will connect uplink rep root_ft with ttc table. In case ethtool steering will be used, fs core will auto connect root_ft with the ethtool bypass tables, which will be connected with the ttc table. vport_rx_rule[uplink_rep]->root_ft->ethtool->ttc. For non-uplink representors, for simplicity root_ft will always point at ttc table, hence the replace vport_rx rule logic is removed. vport_rx_rule[non_uplink_rep]->root_ft(ttc). For now ethtool steering support can only be available on uplink rep. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
# ffec9702	17-Feb-2020	Tonghao Zhang <xiangxia.m.yue@gmail.com>	net/mlx5e: Don't allow forwarding between uplink We can install a packet forwarding rule between uplinks in switchdev mode, as shown below. But the hardware does not do that as expected (mlnx_perf -i $PF1, we can't get the counter of the PF1). By the way, if we add the uplink PF0, PF1 to Open vSwitch and enable hw-offload, the rules can be offloaded but do not work either. This patch adds a check and if so returns -EOPNOTSUPP. $ tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1 \ flower skip_sw action mirred egress redirect dev $PF1 $ tc -d -s filter show dev $PF0 ingress skip_sw in_hw in_hw_count 1 action order 1: mirred (Egress Redirect to device enp130s0f1) stolen ... Sent hardware 408954 bytes 4173 pkt ... Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 0a7fcb78	15-Feb-2020	Paul Blakey <paulb@mellanox.com>	net/mlx5e: Support inner header rewrite with goto action The hardware supports header rewrite of outer headers only. To perform header rewrite on inner headers, we must first decapsulate the packet. Currently, the hardware decap action is explicitly set by the tc tunnel_key unset action. However, with goto action the user won't use the tunnel_key unset action. In addition, header rewrite actions will not apply to the inner header as done by the software model. To support this, we will map each tunnel match seen on a tc rule to a unique tunnel id, implicitly add a decap action on tc chain 0 flows, and mark the packets with this unique tunnel id. Tunnel matches on the decapsulated tunnel on later chains will match on this unique id instead of the actual packet. We will also use this mapping to restore the tunnel info metadata on miss. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# dfd9e750	15-Feb-2020	Paul Blakey <paulb@mellanox.com>	net/mlx5e: Rx, Split rep rx mpwqe handler from nic Copy the current rep mpwqe rx handler which is also used by nic profile. In the next patch, we will add rep specific logic, just for the rep profile rx handler. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# d48834f9	24-Jan-2020	Jiri Pirko <jiri@mellanox.com>	mlx5: Use dev_net netdevice notifier registrations Register the dev_net notifier and allow the per-net notifier to follow the device into different namespace. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# a6d35fb4	02-Sep-2019	Roi Dayan <roid@mellanox.com>	net/mlx5e: Remove leftover declaration This function was removed in the cited commit below. Fixes: 13e509a4c194 ("net/mlx5e: Remove leftover code from the PF netdev being uplink rep") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
# 2b688ea5	15-Aug-2019	Maor Gottlieb <maorg@mellanox.com>	net/mlx5: Add flow steering actions to fs_cmd shim layer Add flow steering actions: modify header and packet reformat to the fs_cmd shim layer. This allows each namespace to define possibly different functionality for alloc/dealloc action commands. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>