Cross Reference: /linux-master/drivers/net/ethernet/mellanox/mlxsw/spectrum.h

History log of /linux-master/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
Revision	Date	Author	Comments
# 6fb88aaf	08-Mar-2024	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum: Allow fetch-and-clear of flow counters For the report_delta-like interface like a previous patch has added for collection of NH group statistics, it's easiest to read the counter and have the HW clear it right away. Thus, change mlxsw_sp_flow_counter_get() to take a bool indicating whether this should be done. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/6a096ede8ee92d5041e3832242c3bbc137198aba.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# c6ca2884	26-Jan-2024	Amit Cohen <amcohen@nvidia.com>	mlxsw: spectrum: Query max_lag once The maximum number of LAGs is queried from core several times. It is used to allocate LAG array, and then to iterate over it. In addition, it is used for PGT initialization. To simplify the code, instead of querying it several times, store the value as part of 'mlxsw_sp' and use it. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# 6dce962c	26-Jan-2024	Amit Cohen <amcohen@nvidia.com>	mlxsw: spectrum: Change mlxsw_sp_upper to LAG structure The structure mlxsw_sp_upper is used only as LAG. Rename it to mlxsw_sp_lag and move it to spectrum.c file, as it is used only there. Move the function mlxsw_sp_lag_get() with the structure. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# b2f5eb5a	14-Dec-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_fid: Add an "any" packet type Flood profiles have been used prior to CFF support for NVE underlay. Like is the case with FID flooding, an NVE profile describes at which offset a datum is located given traffic type. mlxsw currently only ever uses one KVD entry for NVE lookup, i.e. regardless of traffic type, the offset is always zero. To be able to describe this, add a traffic type enumerator describing "any traffic type". Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 315702e0	28-Nov-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_fid: Add hooks for RSP table maintenance In the CFF flood mode, the driver has to allocate a table within PGT, which holds flood vectors for router subport FIDs. For LAGs, these flood vectors have to obviously be maintained dynamically as port membership in a LAG changes. But even for physical ports, the flood vectors have to be kept valid, and may not contain enabled bits corresponding to non-existent ports. It is therefore not possible to precompute the port part of the RSP table, it has to be maintained as ports come and go due to splits. To support the RSP table maintenance, add to FID ops two new ops: fid_port_init and fid_port_fini, for when a port comes to existence, or joins a lag, and vice versa. Invoke these ops from mlxsw_sp_port_fids_init() and mlxsw_sp_port_fids_fini(), which are called when port is added and removed, respectively. Also add two new hooks for LAG maintenance, mlxsw_sp_fid_port_join_lag() / _leave_lag() which transitively call into the same ops. Later patches will actually add the op implementations themselves, this just adds the scaffolding. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/234398a23540317abb25f74f920a5c8121faecf0.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# a59316ff	28-Nov-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_fid: Add a not-UC packet type In CFF flood mode, the rFID family will allocate two tables. One for unknown UC traffic, one for everything else. Add a traffic type for the everything else traffic. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/8fb968b2d1cc37137cd0110c98cdeb625b03ca99.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 01de00f4	28-Nov-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_fid: Privatize FID families Currently, mlxsw always uses a "controlled" flood mode on all Nvidia Spectrum generations. The following patches will however introduce a possibility to run a "CFF" (for Compressed FID Flooding) mode on newer machines, if the FW supports it. Several operations will differ between how they need to be done in controlled mode vs. CFF mode. Thus the per-FID-family ops will differ between controlled and CFF, thus the FID family array as such will differ depending on whether the mode negotiated with FW is controlled or CFF. The simple approach of having several globally visible arrays for spectrum.c to statically choose from no longer works. Instead privatize all FID initialization and finalization logic, and expose it as ops instead. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/d3fa390d97cf3dbd2f7a28741be69b311e2059e4.1701183891.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 27851dfa	20-Nov-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_router: Add a helper to get subport number from a RIF In the CFF flood mode, responsibility for management of the PGT entries for rFIDs is moved from FW to the driver. All rFIDs are based off either a front panel port, or a LAG port. The flood vectors for port-based rFIDs enable just the port itself, the ones for LAG-based rFIDs enable all member ports of the LAG in question. Since all rFIDs based off the same port have the same flood vector, and similarly for LAG-based rFIDs, the flood entries are shared. The PGT address of the flood vector is therefore determined based on the port (or LAG) number of the RIF connected with the rFID. Add a helper to determine subport number given a RIF, to be used in these calculations. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/d7ab43cf5b021f785f363f236e4b6780d10eea93.1700503644.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# c6789725	18-Oct-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum: Allocate LAG table when in SW LAG mode In this patch, if the LAG mode is SW, allocate the LAG table and configure SGCR to indicate where it was allocated. We use the default "DDD" (for dynamic data duplication) layout of the LAG table. In the DDD mode, the membership information for each LAG is copied in 8 PGT entries. This is done for performance reasons. The LAG table then needs to be allocated on an address aligned to 8. Deal with this by moving the LAG init ahead so that the LAG table is allocated at address 0. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 8c893abd	18-Oct-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_pgt: Generalize PGT allocation PGT blocks are allocated through the function mlxsw_sp_pgt_mid_alloc_range(). The interface assumes that the caller knows which piece of PGT exactly they want to get. That was fine while the FID code was the only client allocating blocks of PGT. However for SW-allocated LAG table, there will be an additional client: mlxsw_sp_lag_init(). The interface should therefore be changed to not require particular coordinates, but to take just the requested size, allocate the block wherever, and give back the PGT address. In this patch, change the interface accordingly. Initialize FID family's pgt_base from the result of the PGT allocation (note that mlxsw makes a copy of the family structure, so what gets initialized is not actually the global structure). Drop the now-unnecessary pgt_base initializations and the corresponding defines. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 9793a5a9	11-Aug-2023	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum: Stop ignoring learning notifications from redirected traffic As explained in the previous patch, with the ignore action prepended to the redirect action, it is not longer possible for redirected traffic to generate learning notifications. Therefore, remove the workaround that was added in commit 577fa14d2100 ("mlxsw: spectrum: Do not process learned records with a dummy FID") as it is no longer needed. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 0433670e	11-Aug-2023	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum_flower: Disable learning and security lookup when redirecting It is possible to add a filter that redirects traffic from the ingress of a bridge port that is locked (i.e., performs security / SMAC lookup) and has learning enabled. For example: # ip link add name br0 type bridge # ip link set dev swp1 master br0 # bridge link set dev swp1 learning on locked on mab on # tc qdisc add dev swp1 clsact # tc filter add dev swp1 ingress pref 1 proto ip flower skip_sw src_ip 192.0.2.1 action mirred egress redirect dev swp2 In the kernel's Rx path, this filter is evaluated before the Rx handler of the bridge, which means that redirected traffic should not be affected by bridge port configuration such as learning. However, the hardware data path is a bit different and the redirect action (FORWARDING_ACTION in hardware) merely attaches a pointer to the packet, which is later used by the L2 lookup stage to understand how to forward the packet. Between both stages - ingress ACL and L2 lookup - learning and security lookup are performed, which means that redirected traffic is affected by bridge port configuration, unlike in the kernel's data path. The learning discrepancy was handled in commit 577fa14d2100 ("mlxsw: spectrum: Do not process learned records with a dummy FID") by simply ignoring learning notifications generated by the redirected traffic. A similar solution is not possible for the security / SMAC lookup since - unlike learning - the CPU is not involved and packets that failed the lookup are dropped by the device. Instead, solve this by prepending the ignore action to the redirect action and use it to instruct the device to disable both learning and the security / SMAC lookup for redirected traffic. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 852c18d5	03-Aug-2023	Yue Haibing <yuehaibing@huawei.com>	mlxsw: spectrum: Remove unused function declarations Commit c3d2ed93b14d ("mlxsw: Remove old parsing depth infrastructure") left behind mlxsw_sp_nve_inc_parsing_depth_get()/mlxsw_sp_nve_inc_parsing_depth_put(). And commit 532b49e41e64 ("mlxsw: spectrum_span: Derive SBIB from maximum port speed & MTU") remove mlxsw_sp_span_port_mtu_update()/mlxsw_sp_span_speed_update_work() but leave the declarations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20230803142047.42660-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 569f98b3	27-Jul-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put() As of commit 151b89f6025a ("mlxsw: spectrum_router: Reuse work neighbor initialization in work scheduler"), the functions mlxsw_sp_port_lower_dev_hold() and mlxsw_sp_port_dev_put() have no users. Drop them. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/d0adcd7cb4ea19416294a0f861100edba84c9f36.1690471774.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# ec4643ca	19-Jul-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_switchdev: Replay switchdev objects on port join Currently it never happens that a netdevice that is already a bridge slave would suddenly become mlxsw upper. The only case where this might be possible as far as mlxsw is concerned, is with LAG netdevices. But if a LAG has any upper (e.g. is enslaved), enlaving mlxsw port to that LAG is forbidden. Thus the only way to install a LAG between a bridge and a mlxsw port is by first enslaving the port to the LAG, and then enslaving that LAG to a bridge. At that point there are no bridge objects (such as port VLANs) to replay. Those are added afterwards, and notified as they are created. This holds even for the PVID. However in the following patches, the requirement that ports be only enslaved to masters without uppers, is going to be relaxed. It will therefore be necessary to replay the existing bridge objects. Without this replay, e.g. the mlxsw bridge_port_vlan objects are not instantiated, which causes issues later, as a lot of code relies on their presence. To that end, add a new notifier block whose sole role is to filter out events related to the one relevant upper, and forward those to the existing switchdev notifier block. Pass the new notifier block to switchdev_bridge_port_offload() when the bridge port is created. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# fe22f741	11-Jul-2023	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum_flower: Add ability to match on port ranges Add the ability to match on port ranges by utilizing the previously added port range registers and the port range key element. Up to two port range registers can be used for each filter, one for source port and another for destination port. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/df4385a9592917e9a22ebff339e0463e4a8dfa82.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 898979c7	11-Jul-2023	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum_acl: Pass main driver structure to mlxsw_sp_acl_rulei_destroy() The main driver structure will be needed in this function by a subsequent patch, so pass it. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/24d96a4e21310e5de2951ace58263db35e44a0df.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 74d6786c	11-Jul-2023	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum_port_range: Add devlink resource support Expose via devlink-resource the maximum number of port range registers and their current occupancy. Besides the observability benefits, this resource will be used by subsequent patches for scale and occupancy tests. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/7945e0c715dc5efb1617f45f7560c1f1bd0bcf8a.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# b3eb04be	11-Jul-2023	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum_port_range: Add port range core The Spectrum ASICs have a fixed number of port range registers, each of which maintains the following parameters: * Minimum and maximum port. * Apply port range for source port, destination port or both. * Apply port range for TCP, UDP or both. * Apply port range for IPv4, IPv6 or both. Implement a port range core which takes care of the allocation and configuration of these registers and exposes an API that allows in-driver consumers (e.g., the ACL code) to request matching on a range of either source or destination port. These registers are going to be used for port range matching in the flower classifier that already matches on EtherType being IPv4 / IPv6 and IP protocol being TCP / UDP. As such, there is no need to limit these registers to a specific EtherType or IP protocol, which will increase the likelihood of a register being shared by multiple flower filters. It is unlikely that a filter will match on the same range of both source and destination ports, which is why each register is only configured to match on either source or destination port. If a filter requires matching on a range of both source and destination ports, it will utilize two port range registers and match on the output of both. For efficient lookup and traversal, use XArray to store the allocated port range registers. The XArray uses RCU and an internal spinlock to synchronise access, so there is no need for a dedicate lock. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/674f00539a0072d455847663b5feb504db51a259.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 76962b80	12-Jun-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_router: Add a helper specifically for joining a LAG Currently, joining a LAG very simply means that the LAG RIF should be joined by the subport representing untagged traffic. If the RIF does not exist, it does not have to be created: if the user wants there to be RIF for the LAG device, they are supposed to add an IP address, and they are supposed to do it after tha LAG becomes mlxsw upper. We can also assume that the LAG has no uppers, otherwise the enslavement is not allowed. In the future, these ordering dependencies should be removed. That means that joining LAG will be more complex operation, possibly involving a lazy RIF creation, and possibly joining / lazily creating RIFs for VLAN uppers of the LAG. It will be handy to have a dedicated function that handles all this. The new function mlxsw_sp_router_port_join_lag() is that. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# 41b2bd20	09-Jun-2023	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum_router: Move here inetaddr validator notifiers The validation logic is already in the router code. Move there the notifier blocks themselves as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 74cbc3c0	06-Feb-2023	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum_acl_tcam: Move devlink param to TCAM code Cited commit added 'DEVLINK_CMD_PARAM_DEL' notifications whenever the network namespace of the devlink instance is changed. Specifically, the notifications are generated after calling reload_down(), but before calling reload_up(). At this stage, the data structures accessed while reading the value of the "acl_region_rehash_interval" devlink parameter are uninitialized, resulting in a use-after-free [1]. Fix by moving the registration and unregistration of the devlink parameter to the TCAM code where it is actually used. This means that the parameter is unregistered during reload_down() and then re-registered during reload_up(), avoiding the use-after-free between these two operations. Reproducer: # ip netns add test123 # devlink dev reload pci/0000:06:00.0 netns test123 [1] BUG: KASAN: use-after-free in mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0 Read of size 4 at addr ffff888162fd37d8 by task devlink/1323 [...] Call Trace: <TASK> dump_stack_lvl+0x95/0xbd print_report+0x181/0x4a1 kasan_report+0xdb/0x200 mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0 mlxsw_sp_params_acl_region_rehash_intrvl_get+0x32/0x80 devlink_nl_param_fill.constprop.0+0x29a/0x11e0 devlink_param_notify.constprop.0+0xb9/0x250 devlink_notify_unregister+0xbc/0x470 devlink_reload+0x1aa/0x440 devlink_nl_cmd_reload+0x559/0x11b0 genl_family_rcv_msg_doit.isra.0+0x1f8/0x2e0 genl_rcv_msg+0x558/0x7f0 netlink_rcv_skb+0x170/0x440 genl_rcv+0x2d/0x40 netlink_unicast+0x53f/0x810 netlink_sendmsg+0x961/0xe80 __sys_sendto+0x2a4/0x420 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 7d7e9169a3ec ("devlink: move devlink reload notifications back in between _down() and _up() calls") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# dc0d1a8b	08-Nov-2022	Ido Schimmel <idosch@nvidia.com>	mlxsw: spectrum: Add an API to configure security checks Add an API to enable or disable security checks on a local port. It will be used by subsequent patches when the 'BR_PORT_LOCKED' flag is toggled. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
# 24157bc6	27-Jul-2022	Danielle Ratson <danieller@nvidia.com>	mlxsw: Send PTP packets as data packets to overcome a limitation In Spectrum-2 and Spectrum-3, the correction field of PTP packets which are sent as control packets is not updated at egress port. To overcome this limitation, PTP packets which require time stamp, should be sent as data packets with the following details: 1. FID valid = 1 2. FID value above the maximum FID 3. rx_router_port = 1 >From Spectrum-4 and on, this limitation will be solved. Extend the function which handles TX header, in case that the packet is a PTP packet, add TX header with type=data and all the above mentioned requirements. Add operation as part of 'struct mlxsw_sp_ptp_ops', to be able to separate the handling of PTP packets between different ASICs. Use the data packet solution only for Spectrum-2 and Spectrum-3. Therefore, add a dedicated operation structure for Spectrum-4, as it will be same to Spectrum-2 in PTP implementation, just will not have the limitation of control packets. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 77b7f83d	04-Jul-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Enable unified bridge model After all the preparations for unified bridge model, finally flip mlxsw driver to use the new model. Change config profile, set 'ubridge' to true and remove the configurations that are relevant only for the legacy model. Set 'flood_mode' to 'controlled' as the current mode is not supported with unified bridge model. Remove all the code which is dedicated to the legacy model. Remove 'struct mlxsw_sp.ubridge' variable which was temporarily added to separate configurations between the models. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# d4324e31	04-Jul-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Add new FID families for unified bridge model In the unified bridge model, mlxsw will no longer emulate 802.1Q FIDs using 802.1D FIDs. The new FID table will look as follows: +---------------+ \| 802.1q FIDs \| 4K entries \| [1..4094] \| +---------------+ \| 802.1d FIDs \| 1K entries \| [4095..5118] \| +---------------+ \| Dummy FIDs \| 1 entry \| [5119..5119] \| +---------------+ \| rFIDs \| 11K entries \| [5120..16383] \| +---------------+ In order to make the change easier to review, four new temporary FID families will be added (e.g., MLXSW_SP_FID_TYPE_8021D_UB) and will not be registered with the FID core until mlxsw is flipped to use the unified bridge model. Add .1d, rfid and dummy FID families for unified bridge, the next patch will add .1q family separately as it requires more changes. The following changes are required: 1. Add 'smpe_index_valid' field to 'struct mlxsw_sp_fid_family' and set SFMR.smpe accordingly. SMPE index is reserved for rFIDs, as their flooding is handled by firmware, and always reserved in Spectrum-1, as it is configured as part of PGT table. 2. Add 'ubridge' field to 'struct mlxsw_sp_fid_family'. This field will be removed later, use it in mlxsw_sp_fid_family_{register,unregister}() to skip the registration / unregistration of the new families when the legacy model is used. 3. Indexes - the start and end indexes of each FID family will need to be changed according to the above diagram. 4. Add flood tables for unified bridge model, use 'fid_offset' as table type, as in the new model the access to flood tables will be using 'fid_offset' calculation. 5. FID family operation changes: a. rFID supposed to be created using SFMR, as it is not created by firmware using unified bridge model. b. port_vid_map() should perform SVFA for rFID, as the mapping is not created by firmware using unified bridge model. c. flood_index() is not aligned to the new model, as this function will be removed later. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 662761d8	04-Jul-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Add support for VLAN RIFs Router interfaces (RIFs) constructed on top of VLAN-aware bridges are of 'VLAN' type, whereas RIFs constructed on top of VLAN-unaware bridges are of 'FID' type. Currently 802.1Q FIDs are emulated using 802.1D FIDs, therefore VLAN RIFs are emulated using FID RIFs. As part of converting the driver to use unified bridge model, 802.1Q FIDs and VLAN RIFs will be used. The egress FID is required for VLAN RIFs in Spectrum-2 and above, but not in Spectrum-1, as in Spectrum-1 the mapping for VLAN RIFs is VID->FID, while in other ASICs it is FID->FID. The reason for the change is that it is more scalable to reuse the FID->FID entry than creating multiple {Port, VID}->FID entries for the router port. Use the existing operation structure to separate the configuration between different ASICs. Add support for VLAN RIFs, most of the configurations are same to FID RIFs. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# fea20547	04-Jul-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Configure ingress RIF classification Before layer 2 forwarding, the device classifies an incoming packet to a FID. The classification is done based on one of the following keys: 1. FID 2. VNI (after decapsulation) 3. VID / {Port, VID} After classification, the FID is known, but also all the attributes of the FID, such as the router interface (RIF) via which a packet that needs to be routed will ingress the router block. In the legacy model, when a RIF was created / destroyed, it was firmware's responsibility to update it in the previously mentioned FID classification records. In the unified bridge model, this responsibility moved to software. The third classification requires to iterate over the FID's {Port, VID} list and issue SVFA write with the correct mapping table according to the port's mode (virtual or not). We never map multiple VLANs to the same FID using VID->FID mapping, so such a mapping needs to be performed once. When a new FID classification entry is configured and the FID already has a RIF, set the RIF as part of SVFA configuration. The reverse needs to be done when clearing a RIF from a FID. Currently, clearing is done by issuing mlxsw_sp_fid_rif_set() with a NULL RIF pointer. Instead, introduce mlxsw_sp_fid_rif_unset(). Note that mlxsw_sp_fid_rif_set() is called after the RIF is fully operational, so it conforms to the internal requirement regarding SVFA.irif_v: "Must not be set for a non-enabled RIF". Do not set the ingress RIF for rFIDs, as the {Port, VID}->rFID entry is configured by firmware when legacy model is used, a next patch will handle this configuration for rFIDs and unified bridge model. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# eede53a4	28-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: spectrum_switchdev: Rename MID structure Currently the structure which represents MDB entry is called 'struct mlxsw_sp_mid'. This name is not accurate as a MID entry stores a bitmap of ports to which a packet needs to be replicated and a MDB entry stores the mapping from {MAC, FID} to PGT index (MID). Rename the structure to 'struct mlxsw_sp_mdb_entry'. The structure 'mlxsw_sp_mid' is defined as part of spectrum.h. The only file which uses it is spectrum_switchdev.c, so there is no reason to expose it to other files. Move the definition to spectrum_switchdev.c. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 4abaa5cc	28-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Align PGT index to legacy bridge model FID code reserves about 15K entries in PGT table for flooding. These entries are just allocated and are not used yet because the code that uses them is skipped now. The next patches will convert MDB code to use PGT APIs. The allocation of indexes for multicast is done after FID code reserves 15K entries. Currently, legacy bridge model is used and firmware manages PGT table. That means that the indexes which are allocated using PGT API are too high when legacy bridge model is used. To not exceed firmware limitation for MDB entries, add an API that returns the correct 'mid_index', based on bridge model. For legacy model, subtract the number of flood entries from PGT index. Use it to write the correct MID to SMID register. This API will be used also from MDB code in the next patches. PGT should not be aware of MDB and FID different usage, this API is temporary and will be removed once unified bridge model will be used. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# a3a7992b	27-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Extend PGT APIs to support maintaining list of ports per entry Add an API to associate a PGT entry with SMPE index and add or remove a port. This API will be used by FID code and MDB code, to add/remove port from specific PGT entry. When the first port is added to PGT entry, allocate the entry in the given MID index, when the last port is removed from PGT entry, free it. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# d7a7b697	27-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Add a dedicated structure for bitmap of ports Currently when bitmap of ports is needed, 'unsigned long *' type is used. The functions which use the bitmap assume its length according to its name, i.e., each function which gets a bitmap of ports queries the maximum number of ports and uses it as the size. As preparation for the next patch which will use bitmap of ports, add a dedicated structure for it. Refactor the existing code to use the new structure. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# a1697d11	27-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Add an indication of SMPE index validity for PGT table In Spectrum-1, the index into the MPE table - called switch multicast to port egress VID (SMPE) - is derived from the PGT entry, whereas in Spectrum-2 and later ASICs it is derived from the FID. Therefore, in Spectrum-1, the SMPE index needs to be programmed as part of the PGT entry via SMID register, while it is reserved for Spectrum-2 and later ASICs. Add 'pgt_smpe_index_valid' boolean as part of 'struct mlxsw_sp' and set it to true for Spectrum-1 and to false for the later ASICs. Add 'smpe_index_valid' as part of 'struct mlxsw_sp_pgt' and set it according to the value in 'struct mlxsw_sp' as part of PGT initialization. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# d8782ec5	27-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Add an initial PGT table support The PGT (Port Group Table) table maps an index to a bitmap of local ports to which a packet needs to be replicated. This table is used for layer 2 multicast and flooding. In the legacy model, software did not interact with this table directly. Instead, it was accessed by firmware in response to registers such as SFTR and SMID. In the new model, the SFTR register is deprecated and software has full control over the PGT table using the SMID register. The entire state of the PGT table needs to be maintained in software because member ports in a PGT entry needs to be reference counted to avoid releasing entries which are still in use. Add the following APIs: 1. mlxsw_sp_pgt_{init, fini}() - allocate/free the PGT table. 2. mlxsw_sp_pgt_mid_alloc_range() - allocate a range of MID indexes in PGT. To be used by FID code during initialization to reserve specific PGT indexes for flooding entries. 3. mlxsw_sp_pgt_mid_free_range() - free indexes in a given range. 4. mlxsw_sp_pgt_mid_alloc() - allocate one MID index in the PGT at a non-specific range, just search for free index. To be used by MDB code. 5. mlxsw_sp_pgt_mid_free() - free the given index. Note that alloc() functions do not allocate the entries in software, just allocate IDs using 'idr'. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# d6d90266	27-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: spectrum: Add a temporary variable to indicate bridge model As part of transition to unified bridge model, many different firmware configurations are done. Some of the configuration that needs to be done for the unified bridge model is not valid under the legacy model, and would be rejected by the firmware. At the same time, the driver cannot switch to the unified bridge model until all of the code has been converted. To allow breaking the change into patches, and to not break driver behavior during the transition, add a boolean variable to indicate bridge model. Then, forbidden configurations will be skipped using the check - "if (!mlxsw_sp->ubridge)". The new variable is temporary for several sets, it will be removed when firmware will be configured to work with unified bridge model. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
# 027c92e0	23-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: spectrum: Rename MLXSW_SP_RIF_TYPE_VLAN Currently, the driver emulates 802.1Q FIDs using 802.1D FIDs. As such, the RIFs configured on top of these FIDs are FID RIFs and not VLAN RIFs. As part of converting the driver to the unified bridge model, 802.1Q FIDs and VLAN RIFs will be used. As a preparation for this change, rename the emulated VLAN RIFs from 'MLXSW_SP_RIF_TYPE_VLAN' to 'MLXSW_SP_RIF_TYPE_VLAN_EMU'. After the conversion the emulated VLAN RIFs will be removed. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 04e85970	23-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: spectrum: Use different arrays of FID families per-ASIC type Egress VID for layer 2 multicast is determined from two tables, the MPE and PGT tables. The MPE table is a two dimensional table indexed by local port and SMPE index, which should be thought of as a FID index. In Spectrum-1 the SMPE index is derived from the PGT entry, whereas in Spectrum-2 and newer ASICs the SMPE index is a FID attribute configured via the SFMR register. The validity of the SMPE index in SFMR is influenced from two factors: 1. FID family. SMPE index is reserved for rFIDs, as their flooding is handled by firmware. 2. ASIC generation. SMPE index is always reserved for Spectrum-1. As such, the validity of the SMPE index should be an attribute of the FID family and have different arrays of FID families per-ASIC type. As a preparation for SMPE index configuration, create separate arrays of FID families for different ASICs. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 22aae520	21-Jun-2022	Amit Cohen <amcohen@nvidia.com>	mlxsw: Remove lag_vid_valid indication Currently 'struct mlxsw_sp_fid_family' has a field which indicates if 'lag_vid' is valid for use in SFD register. This is a leftover from using .1Q FIDs instead of emulating them using .1D FIDs. Currently when .1Q FIDs are emulated using .1D FIDs, this field is true for both families, so there is no reason to maintain it. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 4ec2feb2	16-Jun-2022	Petr Machata <petrm@nvidia.com>	mlxsw: Add a resource describing number of RIFs The Spectrum ASIC has a limit on how many L3 devices (called RIFs) can be created. The limit depends on the ASIC and FW revision, and mlxsw reads it from the FW. In order to communicate both the number of RIFs that there can be, and how many are taken now (i.e. occupancy), introduce a corresponding devlink resource. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 75ef4342	08-May-2022	Petr Machata <petrm@nvidia.com>	mlxsw: spectrum: Move handling of tunnel events to router code The events related to IPIP tunnels are handled by the router code. Move the handling from the central dispatcher in spectrum.c to the new notifier handler in the router module. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>