#
2b3082c6 |
|
28-Jul-2023 |
Ratheesh Kannoth <rkannoth@marvell.com> |
net: flow_dissector: Use 64bits for used_keys As 32bits of dissector->used_keys are exhausted, increase the size to 64bits. This is base change for ESP/AH flow dissector patch. Please find patch and discussions at https://lore.kernel.org/netdev/ZMDNjD46BvZ5zp5I@corigine.com/T/#t Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw Tested-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f44a9010 |
|
24-Jul-2023 |
Rob Herring <robh@kernel.org> |
net: dsa: Explicitly include correct DT includes The DT of_device.h and of_platform.h date back to the separate of_platform_bus_type before it as merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. As a result, there's a pretty much random mix of those include files used throughout the tree. In order to detangle these headers and replace the implicit includes with struct declarations, users need to explicitly include the correct includes. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20230724211859.805481-1-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
d44036ca |
|
17-Aug-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: fix oversize frame dropping for always closed tc-taprio gates The blamed commit resolved a bug where frames would still get stuck at egress, even though they're smaller than the maxSDU[tc], because the driver did not take into account the extra 33 ns that the queue system needs for scheduling the frame. It now takes that into account, but the arithmetic that we perform in vsc9959_tas_remaining_gate_len_ps() is buggy, because we operate on 64-bit unsigned integers, so gate_len_ns - VSC9959_TAS_MIN_GATE_LEN_NS may become a very large integer if gate_len_ns < 33 ns. In practice, this means that we've introduced a regression where all traffic class gates which are permanently closed will not get detected by the driver, and we won't enable oversize frame dropping for them. Before: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 2 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 3 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 4 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 5 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 6 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS After: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS Fixes: 11afdc6526de ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230817120111.3522827-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
c6efb4ae |
|
05-Jul-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: fix oversize frame dropping for preemptible TCs This switch implements Hold/Release in a strange way, with no control from the user as required by IEEE 802.1Q-2018 through Set-And-Hold-MAC and Set-And-Release-MAC, but rather, it emits HOLD requests implicitly based on the schedule. Namely, when the gate of a preemptible TC is about to close (actually QSYS::PREEMPTION_CFG.HOLD_ADVANCE octet times in advance of this event), the QSYS seems to emit a HOLD request pulse towards the MAC which preempts the currently transmitted packet, and further packets are held back in the queue system. This allows large frames to be squeezed through small time slots, because HOLD requests initiated by the gate events result in the frame being segmented in multiple fragments, the bit time of which is equal to the size of the time slot. It has been reported that the vsc9959_tas_guard_bands_update() logic breaks this, because it doesn't take preemptible TCs into account, and enables oversized frame dropping when the time slot doesn't allow a full MTU to be sent, but it does allow 2*minFragSize to be sent (128B). Packets larger than 128B are dropped instead of being sent in multiple fragments. Confusingly, the manual says: | For guard band, SDU calculation of a traffic class of a port, if | preemption is enabled (through 'QSYS::PREEMPTION_CFG.P_QUEUES') then | QSYS::PREEMPTION_CFG.HOLD_ADVANCE is used, otherwise | QSYS::QMAXSDU_CFG_*.QMAXSDU_* is used. but this only refers to the static guard band durations, and the QMAXSDU_CFG_* registers have dual purpose - the other being oversized frame dropping, which takes place irrespective of whether frames are preemptible or express. So, to fix the problem, we need to call vsc9959_tas_guard_bands_update() from ocelot_port_update_active_preemptible_tcs(), and modify the guard band logic to consider a different (lower) oversize limit for preemptible traffic classes. Fixes: 403ffc2c34de ("net: mscc: ocelot: add support for preemptible traffic classes") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Message-ID: <20230705104422.49025-4-vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
c6081914 |
|
05-Jul-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: make vsc9959_tas_guard_bands_update() visible to ocelot->ops In a future change we will need to make ocelot_port_update_active_preemptible_tcs() call vsc9959_tas_guard_bands_update(), but that is currently not possible, since the ocelot switch lib does not have access to functions private to the DSA wrapper. Move the pointer to vsc9959_tas_guard_bands_update() from felix->info (which is private to the DSA driver) to ocelot->ops (which is also visible to the ocelot switch lib). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Message-ID: <20230705104422.49025-3-vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
009d30f1 |
|
05-Jul-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: extend ocelot->fwd_domain_lock to cover ocelot->tas_lock In a future commit we will have to call vsc9959_tas_guard_bands_update() from ocelot_port_update_active_preemptible_tcs(), and that will be impossible due to the AB/BA locking dependencies between ocelot->tas_lock and ocelot->fwd_domain_lock. Just like we did in commit 3ff468ef987e ("net: mscc: ocelot: remove struct ocelot_mm_state :: lock"), the only solution is to expand the scope of ocelot->fwd_domain_lock for it to also serialize changes made to the Time-Aware Shaper, because those will have to result in a recalculation of cut-through TCs, which is something that depends on the forwarding domain. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Message-ID: <20230705104422.49025-2-vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
6ac7a27a |
|
13-Jun-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: fix taprio guard band overflow at 10Mbps with jumbo frames The DEV_MAC_MAXLEN_CFG register contains a 16-bit value - up to 65535. Plus 2 * VLAN_HLEN (4), that is up to 65543. The picos_per_byte variable is the largest when "speed" is lowest - SPEED_10 = 10. In that case it is (1000000L * 8) / 10 = 800000. Their product - 52434400000 - exceeds 32 bits, which is a problem, because apparently, a multiplication between two 32-bit factors is evaluated as 32-bit before being assigned to a 64-bit variable. In fact it's a problem for any MTU value larger than 5368. Cast one of the factors of the multiplication to u64 to force the multiplication to take place on 64 bits. Issue found by Coverity. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230613170907.2413559-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
cad7526f |
|
06-Jun-2023 |
Dan Carpenter <dan.carpenter@linaro.org> |
net: dsa: ocelot: unlock on error in vsc9959_qos_port_tas_set() This error path needs call mutex_unlock(&ocelot->tas_lock) before returning. Fixes: 2d800bc500fb ("net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2d800bc5 |
|
29-May-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum Inspired from struct flow_cls_offload :: cmd, in order for taprio to be able to report statistics (which is future work), it seems that we need to drill one step further with the ndo_setup_tc(TC_SETUP_QDISC_TAPRIO) multiplexing, and pass the command as part of the common portion of the muxed structure. Since we already have an "enable" variable in tc_taprio_qopt_offload, refactor all drivers to check for "cmd" instead of "enable", and reject every other command except "replace" and "destroy" - to be future proof. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> # for lan966x Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5767c6a8 |
|
26-May-2023 |
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> |
net: dsa: ocelot: use lynx_pcs_create_mdiodev() Use the newly introduced lynx_pcs_create_mdiodev() which simplifies the creation and destruction of the lynx PCS. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
403ffc2c |
|
15-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: add support for preemptible traffic classes In order to not transmit (preemptible) frames which will be received by the link partner as corrupted (because it doesn't support FP), the hardware requires the driver to program the QSYS_PREEMPTION_CFG_P_QUEUES register only after the MAC Merge layer becomes active (verification succeeds, or was disabled). There are some cases when FP is known (through experimentation) to be broken. Give priority to FP over cut-through switching, and disable FP for known broken link modes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
a1ca9f8b |
|
15-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: act upon the mqprio qopt in taprio offload The mqprio queue configuration can appear either through TC_SETUP_QDISC_MQPRIO or through TC_SETUP_QDISC_TAPRIO. Make sure both are treated in the same way. Code does nothing new for now (except for rejecting multiple TXQs per TC, which is a useless concept with DSA switches). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
aac80140 |
|
15-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: add support for mqprio offload This doesn't apply anything to hardware and in general doesn't do anything that the software variant doesn't do, except for checking that there isn't more than 1 TXQ per TC (TXQs for a DSA switch are a dubious concept anyway). The reason we add this is to be able to parse one more field added to struct tc_mqprio_qopt_offload, namely preemptible_tcs. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
15f93f46 |
|
15-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: export a single ocelot_mm_irq() When the switch emits an IRQ, we don't know what caused it, and we iterate through all ports to check the MAC Merge status. Move that iteration inside the ocelot lib; we will change the locking in a future change and it would be good to encapsulate that lock completely within the ocelot lib. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
940af261 |
|
24-Feb-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: fix internal MDIO controller resource length The blamed commit did not properly convert the resource start/end format into the DEFINE_RES_MEM_NAMED() start/length format, resulting in a resource for vsc9959_imdio_res which is much longer than expected: $ cat /proc/iomem 1f8000000-1f815ffff : pcie@1f0000000 1f8140000-1f815ffff : 0000:00:00.5 1f8148030-1f815006f : imdio vs (correct) $ cat /proc/iomem 1f8000000-1f815ffff : pcie@1f0000000 1f8140000-1f815ffff : 0000:00:00.5 1f8148030-1f814803f : imdio Luckily it's not big enough to exceed the size of the parent resource (pci_resource_end(pdev, VSC9959_IMDIO_PCI_BAR)), and it doesn't overlap with anything else that the Linux driver uses currently, so the larger than expected size isn't a practical problem that I can see. Although it is clearly wrong in the /proc/iomem output. Fixes: 044d447a801f ("net: dsa: felix: use DEFINE_RES_MEM_NAMED for resources") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1dc6a2a0 |
|
27-Jan-2023 |
Colin Foster <colin.foster@in-advantage.com> |
net: dsa: felix: add configurable device quirks The define FELIX_MAC_QUIRKS was used directly in the felix.c shared driver. Other devices (VSC7512 for example) don't require the same quirks, so they need to be configured on a per-device basis. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
6505b680 |
|
19-Jan-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: add MAC Merge layer support for VSC9959 Felix (VSC9959) has a DEV_GMII:MM_CONFIG block composed of 2 registers (ENABLE_CONFIG and VERIF_CONFIG). Because the MAC Merge statistics and pMAC statistics are already in the Ocelot switch lib even if just Felix supports them, I'm adding support for the whole MAC Merge layer in the common Ocelot library too. There is an interrupt (shared with the PTP interrupt) which signals changes to the MM verification state. This is done because the preemptible traffic classes should be committed to hardware only once the verification procedure has declared the link partner of being capable of receiving preemptible frames. We implement ethtool getters and setters for the MAC Merge layer state. The "TX enabled" and "verify status" are taken from the IRQ handler, using a mutex to ensure serialized access. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ab3f97a9 |
|
19-Jan-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959 The Felix VSC9959 switch supports frame preemption and has a MAC Merge layer. In addition to the structured stats that exist for the eMAC, export the counters associated with its pMAC (pause, RMON, MAC, PHY, control) plus the high-level MAC Merge layer stats. The unstructured ethtool counters, as well as the rtnl_link_stats64 were left to report only the eMAC counters. Because statistics processing is quite self-contained in ocelot_stats.c now, I've opted for introducing an ocelot->mm_supported bool, based on which the common switch lib does everything, rather than pushing the TSN-specific code in felix_vsc9959.c, as happens for other TSN stuff. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
80e87442 |
|
12-Jan-2023 |
Andrew Lunn <andrew@lunn.ch> |
enetc: Separate C22 and C45 transactions The enetc MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. This driver is shared with the Felix DSA switch, so update that at the same time. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
33d5eeb9 |
|
19-Nov-2022 |
Colin Foster <colin.foster@in-advantage.com> |
net: mscc: ocelot: remove redundant stats_layout pointers Ever since commit 4d1d157fb6a4 ("net: mscc: ocelot: share the common stat definitions between all drivers") the stats_layout entry in ocelot and felix drivers have become redundant. Remove the unnecessary code. Suggested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
3e7e7832 |
|
14-Nov-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: use phylink_generic_validate() Drop the custom implementation of phylink_validate() in favor of the generic one, which requires config->mac_capabilities to be set. This was used up until now because of the possibility of being paired with Aquantia PHYs with support for rate matching. The phylink framework gained generic support for these, and knows to advertise all 10/100/1000 lower speed link modes when our SERDES protocol is 2500base-x (fixed speed). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
1712be05 |
|
27-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: offload per-tc max SDU from tc-taprio Our current vsc9959_tas_guard_bands_update() algorithm has a limitation imposed by the hardware design. To avoid packet overruns between one gate interval and the next (which would add jitter for scheduled traffic in the next gate), we configure the switch to use guard bands. These are as large as the largest packet which is possible to be transmitted. The problem is that at tc-taprio intervals of sizes comparable to a guard band, there isn't an obvious place in which to split the interval between the useful portion (for scheduling) and the guard band portion (where scheduling is blocked). For example, a 10 us interval at 1Gbps allows 1225 octets to be transmitted. We currently split the interval between the bare minimum of 33 ns useful time (required to schedule a single packet) and the rest as guard band. But 33 ns of useful scheduling time will only allow a single packet to be sent, be that packet 1200 octets in size, or 60 octets in size. It is impossible to send 2 60 octets frames in the 10 us window. Except that if we reduced the guard band (and therefore the maximum allowable SDU size) to 5 us, the useful time for scheduling is now also 5 us, so more packets could be scheduled. The hardware inflexibility of not scheduling according to individual packet lengths must unfortunately propagate to the user, who needs to tune the queueMaxSDU values if he wants to fit more small packets into a 10 us interval, rather than one large packet. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
1109b97b |
|
27-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: update regmap requests to be string-based Existing felix DSA drivers (vsc9959, vsc9953) are all switches that were integrated in NXP SoCs, which makes them a bit unusual compared to the usual Microchip branded Ocelot switches. To be precise, looking at Documentation/devicetree/bindings/net/mscc,vsc7514-switch.yaml, one can see 21 memory regions for the "switch" node, and these correspond to the "targets" of the switch IP, which are spread throughout the guts of that SoC's memory space. In NXP integrations, those targets still exist, but they were condensed within a single memory region, with no other peripheral in between them, so it made more sense for the driver to ioremap the entire memory space of the switch, and then find the targets within that memory space via some offsets hardcoded in the driver. The effect of this design decision is that now, the felix driver expects hardware instantiations to provide their own resource definitions, which is kind of odd when considering a typical device (those are retrieved from 'reg' properties in the device tree, using platform_get_resource() or similar). Allow other hardware instantiations that share the felix driver to not provide a hardcoded array of resources in the future. Instead, make the common denominator based on which regmaps are created be just the resource "names". Each instantiation comes with its own array of names that are mandatory for it, and with an optional array of resources. So we split the resources in 2 arrays, one is what's requested and the other is what's provided. There is one pool of provided resources, in felix->info->resources (of length felix->info->num_resources). There are 2 different ways of requesting a resource. One is by enum ocelot_target (this handles the global regmaps), and one is by int port (this handles the per-port ones). For the existing vsc9959 and vsc9953, it would be a bit stupid to request something that's not provided, given that the 2 arrays are both defined in the same place. The advantage is that we can now modify felix_request_regmap_by_name() to make felix->info->resources[] optional, and if absent, the implementation can call dev_get_regmap() and this is something that is compatible with MFD. Co-developed-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
044d447a |
|
27-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: use DEFINE_RES_MEM_NAMED for resources Use less verbose resource definitions in vsc9959 and vsc9953. This also sets IORESOURCE_MEM in the constant array of resources, so we don't have to do this from felix_init_structs() - in fact, in the future, we may even support IORESOURCE_REG resources. Note that this macro takes start and length as argument, and we had start and end before. So transform end into length. While at it, sort the resources according to their offset. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
8f66c64b |
|
27-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: remove felix_info :: init_regmap It turns out that the idea of having a customizable implementation of a regmap creation from a resource is not exactly useful. The idea was for the new MFD-based VSC7512 driver to use something that creates a SPI regmap from a resource. But there are problems in actually getting those resources (it involves getting them from MFD). To avoid all that, we'll be getting resources by name, so this custom init_regmap() method won't be needed. Remove it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
1382ba68 |
|
27-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: remove felix_info :: imdio_base This address is only relevant for the vsc9959, which is a PCIe device that holds its switch registers in a different PCIe BAR compared to the registers for the internal MDIO controller. Hide this aspect from the common felix driver and move the pci_resource_start() call to the only place that needs it, which is in vsc9959_mdio_bus_alloc(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
5fc080de |
|
27-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: remove felix_info :: imdio_res The imdio_res is used only by vsc9959, which references its own vsc9959_imdio_res through the common felix_info->imdio_res pointer. Since the common code doesn't care about this resource (and it can't be part of the common array of resources, either, because it belongs in a different PCI BAR), just reference it directly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
f66d1ecc |
|
21-Sep-2022 |
Yang Yingliang <yangyingliang@huawei.com> |
net: dsa: ocelot: remove unnecessary set_drvdata() Remove unnecessary set_drvdata(NULL) function in ->remove(), the driver_data will be set to NULL in device_unbind_cleanup() after calling ->remove(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
4d1d157f |
|
08-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: share the common stat definitions between all drivers All switch families supported by the ocelot lib (ocelot, felix, seville) export the same registers so far. But for example felix also has TSN counters, while the others don't. To reduce the bloat even further, create an OCELOT_COMMON_STATS() macro which just lists all stats that are common between switches. The array elements are still replicated among all of vsc9959_stats_layout, vsc9953_stats_layout and ocelot_stats_layout. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b69cf1c6 |
|
08-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: minimize definitions for stats The current definition of struct ocelot_stat_layout is long-winded (4 lines per entry, and we have hundreds of entries), so we could make an effort to use the C preprocessor and reduce the line count. Create an implicit correspondence between enum ocelot_reg, which tells us the register address (SYS_COUNT_RX_OCTETS etc) and enum ocelot_stat which allows us to index the ocelot->stats array (OCELOT_STAT_RX_OCTETS etc), and don't require us to specify both when we define what stats each switch family has. Create an OCELOT_STAT() macro that pairs only an enum ocelot_stat to an enum ocelot_reg, and an OCELOT_STAT_ETHTOOL() macro which also contains a name exported to the unstructured ethtool -S stringset API. For now, we define all counters as having the OCELOT_STAT_ETHTOOL() kind, but we will add more counters in the future which are not exported to the unstructured ethtool -S. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
be5c13f2 |
|
08-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: harmonize names of SYS_COUNT_TX_AGING and OCELOT_STAT_TX_AGED The hardware counter is called C_TX_AGED, so rename SYS_COUNT_TX_AGING to SYS_COUNT_TX_AGED. This will become important since we want to minimize the way in which we declare struct ocelot_stat_layout elements, using the C preprocessor. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
25027c84 |
|
08-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: check the 32-bit PSFP stats against overflow The Felix PSFP counters suffer from the same problem as the ocelot ndo_get_stats64 ones - they are 32-bit, so they can easily overflow and this can easily go undetected. Add a custom hook in ocelot_check_stats_work() through which driver specific actions can be taken, and update the stats for the existing PSFP filters from that hook. Previously, vsc9959_psfp_filter_add() and vsc9959_psfp_filter_del() were serialized with respect to each other via rtnl_lock(). However, with the new entry point into &psfp->sfi_list coming from the periodic worker, we now need an explicit mutex to serialize access to these lists. We used to keep a struct felix_stream_filter_counters on stack, through which vsc9959_psfp_stats_get() - a FLOW_CLS_STATS callback - would retrieve data from vsc9959_psfp_counters_get(). We need to become smarter about that in 3 ways: - we need to keep a persistent set of counters for each stream instead of keeping them on stack - we need to promote those counters from u32 to u64, and create a procedure that properly keeps 64-bit counters. Since we clear the hardware counters anyway, and we poll every 2 seconds, a simple increment of a u64 counter with a u32 value will perfectly do the job. - FLOW_CLS_STATS also expect incremental counters, so we also need to zeroize our u64 counters every time sch_flower calls us Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
96980ff7 |
|
08-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: make access to STAT_VIEW sleepable again To support SPI-controlled switches in the future, access to SYS_STAT_CFG_STAT_VIEW needs to be done outside of any spinlock protected region, but it still needs to be serialized (by a mutex). Split the ocelot->stats_lock spinlock into a mutex that serializes indirect access to hardware registers (ocelot->stat_view_lock) and a spinlock that serializes access to the u64 ocelot->stats array. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0a2360c5 |
|
08-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: add definitions for the stream filter counters TSN stream (802.1Qci, 802.1CB) filters are also accessed through STAT_VIEW, just like the port registers, but these counters are per stream, rather than per port. So we don't keep them in ocelot_port_update_stats(). What we can do, however, is we can create register definitions for them just like we have for the port counters, and delete the last remaining user of the SYS_CNT register + a group index (read_gix). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a4bb481a |
|
05-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set The read-modify-write of QSYS_TAG_CONFIG from vsc9959_sched_speed_set() runs unlocked with respect to the other functions that access it, which are vsc9959_tas_guard_bands_update(), vsc9959_qos_port_tas_set() and vsc9959_tas_clock_adjust(). All the others are under ocelot->tas_lock, so move the vsc9959_sched_speed_set() access under that lock as well, to resolve the concurrency. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
843794bb |
|
05-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio Experimentally, it looks like when QSYS_QMAXSDU_CFG_7 is set to 605, frames even way larger than 601 octets are transmitted even though these should be considered as oversized, according to the documentation, and dropped. Since oversized frame dropping depends on frame size, which is only known at the EOF stage, and therefore not at SOF when cut-through forwarding begins, it means that the switch cannot take QSYS_QMAXSDU_CFG_* into consideration for traffic classes that are cut-through. Since cut-through forwarding has no UAPI to control it, and the driver enables it based on the mantra "if we can, then why not", the strategy is to alter vsc9959_cut_through_fwd() to take into consideration which tc's have oversize frame dropping enabled, and disable cut-through for them. Then, from vsc9959_tas_guard_bands_update(), we re-trigger the cut-through determination process. There are 2 strategies for vsc9959_cut_through_fwd() to determine whether a tc has oversized dropping enabled or not. One is to keep a bit mask of traffic classes per port, and the other is to read back from the hardware registers (a non-zero value of QSYS_QMAXSDU_CFG_* means the feature is enabled). We choose reading back from registers, because struct ocelot_port is shared with drivers (ocelot, seville) that don't support either cut-through nor tc-taprio, and we don't have a felix specific extension of struct ocelot_port. Furthermore, reading registers from the Felix hardware is quite cheap, since they are memory-mapped. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
11afdc65 |
|
05-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet The blamed commit broke tc-taprio schedules such as this one: tc qdisc replace dev $swp1 root taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 \ sched-entry S 0x7f 990000 \ sched-entry S 0x80 10000 \ flags 0x2 because the gate entry for TC 7 (S 0x80 10000 ns) now has a static guard band added earlier than its 'gate close' event, such that packet overruns won't occur in the worst case of the largest packet possible. Since guard bands are statically determined based on the per-tc QSYS_QMAXSDU_CFG_* with a fallback on the port-based QSYS_PORT_MAX_SDU, we need to discuss what happens with TC 7 depending on kernel version, since the driver, prior to commit 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port"), did not touch QSYS_QMAXSDU_CFG_*, and therefore relied on QSYS_PORT_MAX_SDU. 1 (before vsc9959_tas_guard_bands_update): QSYS_PORT_MAX_SDU defaults to 1518, and at gigabit this introduces a static guard band (independent of packet sizes) of 12144 ns, plus QSYS::HSCH_MISC_CFG.FRM_ADJ (bit time of 20 octets => 160 ns). But this is larger than the time window itself, of 10000 ns. So, the queue system never considers a frame with TC 7 as eligible for transmission, since the gate practically never opens, and these frames are forever stuck in the TX queues and hang the port. 2 (after vsc9959_tas_guard_bands_update): Under the sole goal of enabling oversized frame dropping, we make an effort to set QSYS_QMAXSDU_CFG_7 to 1230 bytes. But QSYS_QMAXSDU_CFG_7 plays one more role, which we did not take into account: per-tc static guard band, expressed in L2 byte time (auto-adjusted for FCS and L1 overhead). There is a discrepancy between what the driver thinks (that there is no guard band, and 100% of min_gate_len[tc] is available for egress scheduling) and what the hardware actually does (crops the equivalent of QSYS_QMAXSDU_CFG_7 ns out of min_gate_len[tc]). In practice, this means that the hardware thinks it has exactly 0 ns for scheduling tc 7. In both cases, even minimum sized Ethernet frames are stuck on egress rather than being considered for scheduling on TC 7, even if they would fit given a proper configuration. Considering the current situation, with vsc9959_tas_guard_bands_update(), frames between 60 octets and 1230 octets in size are not eligible for oversized dropping (because they are smaller than QSYS_QMAXSDU_CFG_7), but won't be considered as eligible for scheduling either, because the min_gate_len[7] (10000 ns) minus the guard band determined by QSYS_QMAXSDU_CFG_7 (1230 octets * 8 ns per octet == 9840 ns) minus the guard band auto-added for L1 overhead by QSYS::HSCH_MISC_CFG.FRM_ADJ (20 octets * 8 ns per octet == 160 octets) leaves 0 ns for scheduling in the queue system proper. Investigating the hardware behavior, it becomes apparent that the queue system needs precisely 33 ns of 'gate open' time in order to consider a frame as eligible for scheduling to a tc. So the solution to this problem is to amend vsc9959_tas_guard_bands_update(), by giving the per-tc guard bands less space by exactly 33 ns, just enough for one frame to be scheduled in that interval. This allows the queue system to make forward progress for that port-tc, and prevents it from hanging. Fixes: 297c4de6f780 ("net: dsa: felix: re-enable TAS guard band mode") Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d4c36765 |
|
16-Aug-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset With so many counter addresses recently discovered as being wrong, it is desirable to at least have a central database of information, rather than two: one through the SYS_COUNT_* registers (used for ndo_get_stats64), and the other through the offset field of struct ocelot_stat_layout elements (used for ethtool -S). The strategy will be to keep the SYS_COUNT_* definitions as the single source of truth, but for that we need to expand our current definitions to cover all registers. Then we need to convert the ocelot region creation logic, and stats worker, to the read semantics imposed by going through SYS_COUNT_* absolute register addresses, rather than offsets of 32-bit words relative to SYS_COUNT_RX_OCTETS (which should have been SYS_CNT, by the way). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
91904600 |
|
16-Aug-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: make struct ocelot_stat_layout array indexable The ocelot counters are 32-bit and require periodic reading, every 2 seconds, by ocelot_port_update_stats(), so that wraparounds are detected. Currently, the counters reported by ocelot_get_stats64() come from the 32-bit hardware counters directly, rather than from the 64-bit accumulated ocelot->stats, and this is a problem for their integrity. The strategy is to make ocelot_get_stats64() able to cherry-pick individual stats from ocelot->stats the way in which it currently reads them out from SYS_COUNT_* registers. But currently it can't, because ocelot->stats is an opaque u64 array that's used only to feed data into ethtool -S. To solve that problem, we need to make ocelot->stats indexable, and associate each element with an element of struct ocelot_stat_layout used by ethtool -S. This makes ocelot_stat_layout a fat (and possibly sparse) array, so we need to change the way in which we access it. We no longer need OCELOT_STAT_END as a sentinel, because we know the array's size (OCELOT_NUM_STATS). We just need to skip the array elements that were left unpopulated for the switch revision (ocelot, felix, seville). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
22d842e3 |
|
16-Aug-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: turn stats_lock into a spinlock ocelot_get_stats64() currently runs unlocked and therefore may collide with ocelot_port_update_stats() which indirectly accesses the same counters. However, ocelot_get_stats64() runs in atomic context, and we cannot simply take the sleepable ocelot->stats_lock mutex. We need to convert it to an atomic spinlock first. Do that as a preparatory change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
5152de7b |
|
16-Aug-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters Reading stats using the SYS_COUNT_* register definitions is only used by ocelot_get_stats64() from the ocelot switchdev driver, however, currently the bucket definitions are incorrect. Separately, on both RX and TX, we have the following problems: - a 256-1023 bucket which actually tracks the 256-511 packets - the 1024-1526 bucket actually tracks the 512-1023 packets - the 1527-max bucket actually tracks the 1024-1526 packets => nobody tracks the packets from the real 1527-max bucket Additionally, the RX_PAUSE, RX_CONTROL, RX_LONGS and RX_CLASSIFIED_DROPS all track the wrong thing. However this doesn't seem to have any consequence, since ocelot_get_stats64() doesn't use these. Even though this problem only manifests itself for the switchdev driver, we cannot split the fix for ocelot and for DSA, since it requires fixing the bucket definitions from enum ocelot_reg, which makes us necessarily adapt the structures from felix and seville as well. Fixes: 84705fc16552 ("net: dsa: felix: introduce support for Seville VSC9953 switch") Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
40d21c45 |
|
16-Aug-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters What the driver actually reports as 256-511 is in fact 512-1023, and the TX packets in the 256-511 bucket are not reported. Fix that. Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
7e4babff |
|
04-Aug-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: fix min gate len calculation for tc when its first gate is closed min_gate_len[tc] is supposed to track the shortest interval of continuously open gates for a traffic class. For example, in the following case: TC 76543210 t0 00000001b 200000 ns t1 00000010b 200000 ns min_gate_len[0] and min_gate_len[1] should be 200000, while min_gate_len[2-7] should be 0. However what happens is that min_gate_len[0] is 200000, but min_gate_len[1] ends up being 0 (despite gate_len[1] being 200000 at the point where the logic detects the gate close event for TC 1). The problem is that the code considers a "gate close" event whenever it sees that there is a 0 for that TC (essentially it's level rather than edge triggered). By doing that, any time a gate is seen as closed without having been open prior, gate_len, which is 0, will be written into min_gate_len. Once min_gate_len becomes 0, it's impossible for it to track anything higher than that (the length of actually open intervals). To fix this, we make the writing to min_gate_len[tc] be edge-triggered, which avoids writes for gates that are closed in consecutive intervals. However what this does is it makes us need to special-case the permanently closed gates at the end. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220804202817.1677572-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
837ced3a |
|
28-Jun-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
time64.h: consolidate uses of PSEC_PER_NSEC Time-sensitive networking code needs to work with PTP times expressed in nanoseconds, and with packet transmission times expressed in picoseconds, since those would be fractional at higher than gigabit speed when expressed in nanoseconds. Convert the existing uses in tc-taprio and the ocelot/felix DSA driver to a PSEC_PER_NSEC macro. This macro is placed in include/linux/time64.h as opposed to its relatives (PSEC_PER_SEC etc) from include/vdso/time64.h because the vDSO library does not (yet) need/use it. Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for the vDSO parts Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
55a515b1 |
|
28-Jun-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port Currently, sending a packet into a time gate too small for it (or always closed) causes the queue system to hold the frame forever. Even worse, this frame isn't subject to aging either, because for that to happen, it needs to be scheduled for transmission in the first place. But the frame will consume buffer memory and frame references while it is forever held in the queue system. Before commit a4ae997adcbd ("net: mscc: ocelot: initialize watermarks to sane defaults"), this behavior was somewhat subtle, as the switch had a more intricately tuned default watermark configuration out of reset, which did not allow any single port and tc to consume the entire switch buffer space. Nonetheless, the held frames are still there, and they reduce the total backplane capacity of the switch. However, after the aforementioned commit, the behavior can be very clearly seen, since we deliberately allow each {port, tc} to consume the entire shared buffer of the switch minus the reservations (and we disable all reservations by default). That is to say, we allow a permanently closed tc-taprio gate to hang the entire switch. A careful inspection of the documentation shows that the QSYS:Q_MAX_SDU per-port-tc registers serve 2 purposes: one is for guard band calculation (when zero, this falls back to QSYS:PORT_MAX_SDU), and the other is to enable oversized frame dropping (when non-zero). Currently the QSYS:Q_MAX_SDU registers are all zero, so oversized frame dropping is disabled. The goal of the change is to enable it seamlessly. For that, we need to hook into the MTU change, tc-taprio change, and port link speed change procedures, since we depend on these variables. Frames are not dropped on egress due to a queue system oversize condition, instead that egress port is simply excluded from the mask of valid destination ports for the packet. If there are no destination ports at all, the ingress counter that increments is the generic "drop_tail" in ethtool -S. The issue exists in various forms since the tc-taprio offload was introduced. Fixes: de143c0e274b ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload") Reported-by: Richie Pearn <richard.pearn@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
d68a373b |
|
28-Jun-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: keep QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) out of rmw In vsc9959_tas_clock_adjust(), the INIT_GATE_STATE field is not changed, only the ENABLE field. Similarly for the disabling of the time-aware shaper in vsc9959_qos_port_tas_set(). To reflect this, keep the QSYS_TAG_CONFIG_INIT_GATE_STATE_M mask out of the read-modify-write procedure to make it clearer what is the intention of the code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
1c9017e4 |
|
28-Jun-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: keep reference on entire tc-taprio config In a future change we will need to remember the entire tc-taprio config on all ports rather than just the base time, so use the taprio_offload_get() helper function to replace ocelot_port->base_time with ocelot_port->taprio. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
8670dc33 |
|
16-Jun-2022 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: update base time of time-aware shaper when adjusting PTP time When adjusting the PTP clock, the base time of the TAS configuration will become unreliable. We need reset the TAS configuration by using a new base time. For example, if the driver gets a base time 0 of Qbv configuration from user, and current time is 20000. The driver will set the TAS base time to be 20000. After the PTP clock adjustment, the current time becomes 10000. If the TAS base time is still 20000, it will be a future time, and TAS entry list will stop running. Another example, if the current time becomes to be 10000000 after PTP clock adjust, a large time offset can cause the hardware to hang. This patch introduces a tas_clock_adjust() function to reset the TAS module by using a new base time after the PTP clock adjustment. This can avoid issues above. Due to PTP clock adjustment can occur at any time, it may conflict with the TAS configuration. We introduce a new TAS lock to serialize the access to the TAS registers. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
58bf4db6 |
|
29-Jun-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: fix race between reading PSFP stats and port stats Both PSFP stats and the port stats read by ocelot_check_stats_work() are indirectly read through the same mechanism - write to STAT_CFG:STAT_VIEW, read from SYS:STAT:CNT[n]. It's just that for port stats, we write STAT_VIEW with the index of the port, and for PSFP stats, we write STAT_VIEW with the filter index. So if we allow them to run concurrently, ocelot_check_stats_work() may change the view from vsc9959_psfp_counters_get(), and vice versa. Fixes: 7d4b564d6add ("net: dsa: felix: support psfp filter on vsc9959") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220629183007.3808130-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
c295f983 |
|
21-May-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: switch from {,un}set to {,un}assign for tag_8021q CPU ports There is a desire for the felix driver to gain support for multiple tag_8021q CPU ports, but the current model prevents it. This is because ocelot_apply_bridge_fwd_mask() only takes into consideration whether a port is a tag_8021q CPU port, but not whose CPU port it is. We need a model where we can have a direct affinity between an ocelot port and a tag_8021q CPU port. This serves as the basis for multiple CPU ports. Declare a "dsa_8021q_cpu" backpointer in struct ocelot_port which encodes that affinity. Repurpose the "ocelot_set_dsa_8021q_cpu" API to "ocelot_assign_dsa_8021q_cpu" to express the change of paradigm. Note that this change makes the first practical use of the new ocelot_port->index field in ocelot_port_unassign_dsa_8021q_cpu(), where we need to remove the old tag_8021q CPU port from the reserved VLAN range. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
11ecf341 |
|
10-May-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: ocelot: accept 1000base-X for VSC9959 and VSC9953 Switches using the Lynx PCS driver support 1000base-X optical SFP modules. Accept this interface type on a port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220510164320.10313-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
2f187bfa |
|
29-Apr-2022 |
Colin Foster <colin.foster@in-advantage.com> |
net: ethernet: ocelot: remove the need for num_stats initializer There is a desire to share the oclot_stats_layout struct outside of the current vsc7514 driver. In order to do so, the length of the array needs to be known at compile time, and defined in the struct ocelot and struct felix_info. Since the array is defined in a .c file and would be declared in the header file via: extern struct ocelot_stat_layout[]; the size of the array will not be known at compile time to outside modules. To fix this, remove the need for defining the number of stats at compile time and allow this number to be determined at initialization. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e6934e40 |
|
07-Apr-2022 |
Michael Walle <michael@walle.cc> |
net: dsa: felix: suppress -EPROBE_DEFER errors The DSA master might not have been probed yet in which case the probe of the felix switch fails with -EPROBE_DEFER: [ 4.435305] mscc_felix 0000:00:00.5: Failed to register DSA switch: -517 It is not an error. Use dev_err_probe() to demote this particular error to a debug message. Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220408101521.281886-1-michael@walle.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
866b7a27 |
|
29-Mar-2022 |
Zheng Yongjun <zhengyongjun3@huawei.com> |
net: dsa: felix: fix possible NULL pointer dereference As the possible failure of the allocation, kzalloc() may return NULL pointer. Therefore, it should be better to check the 'sgi' in order to prevent the dereference of NULL pointer. Fixes: 23ae3a7877718 ("net: dsa: felix: add stream gate settings for psfp"). Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220329090800.130106-1-zhengyongjun3@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
a53cbe5d |
|
18-Mar-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: allow PHY_INTERFACE_MODE_INTERNAL on port 5 The Felix switch has 6 ports, 2 of which are internal. Due to some misunderstanding, my initial suggestion for vsc9959_port_modes[]: https://patchwork.kernel.org/project/netdevbpf/patch/20220129220221.2823127-10-colin.foster@in-advantage.com/#24718277 got translated by Colin into a 5-port array, leading to an all-zero port mode mask for port 5. Fixes: acf242fc739e ("net: dsa: felix: remove prevalidate_phy_mode interface") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220318195812.276276-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
acf242fc |
|
26-Feb-2022 |
Colin Foster <colin.foster@in-advantage.com> |
net: dsa: felix: remove prevalidate_phy_mode interface All users of the felix driver were creating their own prevalidate_phy_mode function. The same logic can be performed in a more general way by using a simple array of bit fields. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e57a1540 |
|
25-Feb-2022 |
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> |
net: dsa: ocelot: remove interface checks When the supported interfaces bitmap is populated, phylink will itself check that the interface mode is present in this bitmap. Drivers no longer need to perform this check themselves. Remove these checks. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
209bdb7e |
|
07-Feb-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: don't use devres for mdiobus As explained in commits: 74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres") 5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres") mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered. The Felix VSC9959 switch is a PCI device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here. If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the felix switch driver on shutdown. So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all. The felix driver has the code structure in place for orderly mdiobus removal, so just replace devm_mdiobus_alloc_size() with the non-devres variant, and add manual free where necessary, to ensure that we don't let devres free a still-registered bus. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
61f0d0c3 |
|
28-Dec-2021 |
Colin Foster <colin.foster@in-advantage.com> |
net: dsa: felix: name change for clarity from pcs to mdio_device Simple rename of a variable to make things more logical. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e7026f15 |
|
28-Dec-2021 |
Colin Foster <colin.foster@in-advantage.com> |
net: phy: lynx: refactor Lynx PCS module to use generic phylink_pcs Remove references to lynx_pcs structures so drivers like the Felix DSA can reference alternate PCS drivers. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5a995900 |
|
17-Dec-2021 |
Baowen Zheng <baowen.zheng@corigine.com> |
flow_offload: add index to flow_action_entry structure Add index to flow_action_entry structure and delete index from police and gate child structure. We make this change to offload tc action for driver to identify a tc action. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e44aecc7 |
|
06-Dec-2021 |
Yihao Han <hanyihao@vivo.com> |
net: dsa: felix: use kmemdup() to replace kmalloc + memcpy Fix following coccicheck warning: /drivers/net/dsa/ocelot/felix_vsc9959.c:1627:13-20: WARNING opportunity for kmemdup /drivers/net/dsa/ocelot/felix_vsc9959.c:1506:16-23: WARNING opportunity for kmemdup Signed-off-by: Yihao Han <hanyihao@vivo.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211207064419.38632-1-hanyihao@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
242bd0c1 |
|
07-Dec-2021 |
Colin Foster <colin.foster@in-advantage.com> |
net: dsa: ocelot: felix: add interface for custom regmaps Add an interface so that non-mmio regmaps can be used Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
c9910484 |
|
07-Dec-2021 |
Colin Foster <colin.foster@in-advantage.com> |
net: dsa: ocelot: remove unnecessary pci_bar variables The pci_bar variables for the switch and imdio don't make sense for the generic felix driver. Moving them to felix_vsc9959 to limit scope and simplify the felix_info struct. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
dcad856fe5 |
|
27-Nov-2021 |
kernel test robot <lkp@intel.com> |
net: dsa: felix: fix flexible_array.cocci warnings Zero-length and one-element arrays are deprecated, see Documentation/process/deprecated.rst Flexible-array members should be used instead. Generated by: scripts/coccinelle/misc/flexible_array.cocci Fixes: 23ae3a787771 ("net: dsa: felix: add stream gate settings for psfp") CC: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8abe1970 |
|
25-Nov-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: enable cut-through forwarding between ports by default The VSC9959 switch embedded within NXP LS1028A (and that version of Ocelot switches only) supports cut-through forwarding - meaning it can start the process of looking up the destination ports for a packet, and forward towards those ports, before the entire packet has been received (as opposed to the store-and-forward mode). The up side is having lower forwarding latency for large packets. The down side is that frames with FCS errors are forwarded instead of being dropped. However, erroneous frames do not result in incorrect updates of the FDB or incorrect policer updates, since these processes are deferred inside the switch to the end of frame. Since the switch starts the cut-through forwarding process after all packet headers (including IP, if any) have been processed, packets with large headers and small payload do not see the benefit of lower forwarding latency. There are two cases that need special attention. The first is when a packet is multicast (or flooded) to multiple destinations, one of which doesn't have cut-through forwarding enabled. The switch deals with this automatically by disabling cut-through forwarding for the frame towards all destination ports. The second is when a packet is forwarded from a port of lower link speed towards a port of higher link speed. This is not handled by the hardware and needs software intervention. Since we practically need to update the cut-through forwarding domain from paths that aren't serialized by the rtnl_mutex (phylink mac_link_down/mac_link_up ops), this means we need to serialize physical link events with user space updates of bonding/bridging domains. Enabling cut-through forwarding is done per {egress port, traffic class}. I don't see any reason why this would be a configurable option as long as it works without issues, and there doesn't appear to be any user space configuration tool to toggle this on/off, so this patch enables cut-through forwarding on all eligible ports and traffic classes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211125125808.2383984-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
a7e13edf |
|
18-Nov-2021 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: restrict psfp rules on ingress port PSFP rules take effect on the streams from any port of VSC9959 switch. This patch use ingress port to limit the rule only active on this port. Each stream can only match two ingress source ports in VSC9959. Streams from lowest port gets the configuration of SFID pointed by MAC Table lookup and streams from highest port gets the configuration of (SFID+1) pointed by MAC Table lookup. This patch defines the PSFP rule on highest port as dummy rule, which means that it does not modify the MAC table. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
76c13ede |
|
18-Nov-2021 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: use vcap policer to set flow meter for psfp This patch add police action to set flow meter table which is defined in IEEE802.1Qci. Flow metering is two rates two buckets and three color marker to policing the frames, we only enable one rate one bucket in this patch. Flow metering shares a same policer pool with VCAP policers, so the PSFP policer calls ocelot_vcap_policer_add() and ocelot_vcap_policer_del() to set flow meter police. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
77043c37 |
|
18-Nov-2021 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: mscc: ocelot: use index to set vcap policer Policer was previously automatically assigned from the highest index to the lowest index from policer pool. But police action of tc flower now uses index to set an police entry. This patch uses the police index to set vcap policers, so that one policer can be shared by multiple rules. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
23ae3a78 |
|
18-Nov-2021 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: add stream gate settings for psfp This patch adds stream gate settings for PSFP. Use SGI table to store stream gate entries. Disable the gate entry when it is not used by any stream. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7d4b564d |
|
18-Nov-2021 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: support psfp filter on vsc9959 VSC9959 supports Per-Stream Filtering and Policing(PSFP) that complies with the IEEE 802.1Qci standard. The stream is identified by Null stream identification(DMAC and VLAN ID) defined in IEEE802.1CB. For PSFP, four tables need to be set up: stream table, stream filter table, stream gate table, and flow meter table. Identify the stream by parsing the tc flower keys and add it to the stream table. The stream filter table is automatically maintained, and its index is determined by SGID(flow gate index) and FMID(flow meter index). Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4973056c |
|
22-Oct-2021 |
Sean Anderson <sean.anderson@seco.com> |
net: convert users of bitmap_foo() to linkmode_foo() This converts instances of bitmap_foo(args..., __ETHTOOL_LINK_MODE_MASK_NBITS) to linkmode_foo(args...) I manually fixed up some lines to prevent them from being excessively long. Otherwise, this change was generated with the following semantic patch: // Generated with // echo linux/linkmode.h > includes // git grep -Flf includes include/ | cut -f 2- -d / | cat includes - \ // | sort | uniq | tee new_includes | wc -l && mv new_includes includes // and repeating until the number stopped going up @i@ @@ ( #include <linux/acpi_mdio.h> | #include <linux/brcmphy.h> | #include <linux/dsa/loop.h> | #include <linux/dsa/sja1105.h> | #include <linux/ethtool.h> | #include <linux/ethtool_netlink.h> | #include <linux/fec.h> | #include <linux/fs_enet_pd.h> | #include <linux/fsl/enetc_mdio.h> | #include <linux/fwnode_mdio.h> | #include <linux/linkmode.h> | #include <linux/lsm_audit.h> | #include <linux/mdio-bitbang.h> | #include <linux/mdio.h> | #include <linux/mdio-mux.h> | #include <linux/mii.h> | #include <linux/mii_timestamper.h> | #include <linux/mlx5/accel.h> | #include <linux/mlx5/cq.h> | #include <linux/mlx5/device.h> | #include <linux/mlx5/driver.h> | #include <linux/mlx5/eswitch.h> | #include <linux/mlx5/fs.h> | #include <linux/mlx5/port.h> | #include <linux/mlx5/qp.h> | #include <linux/mlx5/rsc_dump.h> | #include <linux/mlx5/transobj.h> | #include <linux/mlx5/vport.h> | #include <linux/of_mdio.h> | #include <linux/of_net.h> | #include <linux/pcs-lynx.h> | #include <linux/pcs/pcs-xpcs.h> | #include <linux/phy.h> | #include <linux/phy_led_triggers.h> | #include <linux/phylink.h> | #include <linux/platform_data/bcmgenet.h> | #include <linux/platform_data/xilinx-ll-temac.h> | #include <linux/pxa168_eth.h> | #include <linux/qed/qed_eth_if.h> | #include <linux/qed/qed_fcoe_if.h> | #include <linux/qed/qed_if.h> | #include <linux/qed/qed_iov_if.h> | #include <linux/qed/qed_iscsi_if.h> | #include <linux/qed/qed_ll2_if.h> | #include <linux/qed/qed_nvmetcp_if.h> | #include <linux/qed/qed_rdma_if.h> | #include <linux/sfp.h> | #include <linux/sh_eth.h> | #include <linux/smsc911x.h> | #include <linux/soc/nxp/lpc32xx-misc.h> | #include <linux/stmmac.h> | #include <linux/sunrpc/svc_rdma.h> | #include <linux/sxgbe_platform.h> | #include <net/cfg80211.h> | #include <net/dsa.h> | #include <net/mac80211.h> | #include <net/selftests.h> | #include <rdma/ib_addr.h> | #include <rdma/ib_cache.h> | #include <rdma/ib_cm.h> | #include <rdma/ib_hdrs.h> | #include <rdma/ib_mad.h> | #include <rdma/ib_marshall.h> | #include <rdma/ib_pack.h> | #include <rdma/ib_pma.h> | #include <rdma/ib_sa.h> | #include <rdma/ib_smi.h> | #include <rdma/ib_umem.h> | #include <rdma/ib_umem_odp.h> | #include <rdma/ib_verbs.h> | #include <rdma/iw_cm.h> | #include <rdma/mr_pool.h> | #include <rdma/opa_addr.h> | #include <rdma/opa_port_info.h> | #include <rdma/opa_smi.h> | #include <rdma/opa_vnic.h> | #include <rdma/rdma_cm.h> | #include <rdma/rdma_cm_ib.h> | #include <rdma/rdmavt_cq.h> | #include <rdma/rdma_vt.h> | #include <rdma/rdmavt_qp.h> | #include <rdma/rw.h> | #include <rdma/tid_rdma_defs.h> | #include <rdma/uverbs_ioctl.h> | #include <rdma/uverbs_named_ioctl.h> | #include <rdma/uverbs_std_types.h> | #include <rdma/uverbs_types.h> | #include <soc/mscc/ocelot.h> | #include <soc/mscc/ocelot_ptp.h> | #include <soc/mscc/ocelot_vcap.h> | #include <trace/events/ib_mad.h> | #include <trace/events/rdma_core.h> | #include <trace/events/rdma.h> | #include <trace/events/rpcrdma.h> | #include <uapi/linux/ethtool.h> | #include <uapi/linux/ethtool_netlink.h> | #include <uapi/linux/mdio.h> | #include <uapi/linux/mii.h> ) @depends on i@ expression list args; @@ ( - bitmap_zero(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_zero(args) | - bitmap_copy(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_copy(args) | - bitmap_and(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_and(args) | - bitmap_or(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_or(args) | - bitmap_empty(args, ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_empty(args) | - bitmap_andnot(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_andnot(args) | - bitmap_equal(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_equal(args) | - bitmap_intersects(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_intersects(args) | - bitmap_subset(args, __ETHTOOL_LINK_MODE_MASK_NBITS) + linkmode_subset(args) ) Add missing linux/mii.h include to mellanox. -DaveM Signed-off-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0650bf52 |
|
17-Sep-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: be compatible with masters which unregister on shutdown Lino reports that on his system with bcmgenet as DSA master and KSZ9897 as a switch, rebooting or shutting down never works properly. What does the bcmgenet driver have special to trigger this, that other DSA masters do not? It has an implementation of ->shutdown which simply calls its ->remove implementation. Otherwise said, it unregisters its network interface on shutdown. This message can be seen in a loop, and it hangs the reboot process there: unregister_netdevice: waiting for eth0 to become free. Usage count = 3 So why 3? A usage count of 1 is normal for a registered network interface, and any virtual interface which links itself as an upper of that will increment it via dev_hold. In the case of DSA, this is the call path: dsa_slave_create -> netdev_upper_dev_link -> __netdev_upper_dev_link -> __netdev_adjacent_dev_insert -> dev_hold So a DSA switch with 3 interfaces will result in a usage count elevated by two, and netdev_wait_allrefs will wait until they have gone away. Other stacked interfaces, like VLAN, watch NETDEV_UNREGISTER events and delete themselves, but DSA cannot just vanish and go poof, at most it can unbind itself from the switch devices, but that must happen strictly earlier compared to when the DSA master unregisters its net_device, so reacting on the NETDEV_UNREGISTER event is way too late. It seems that it is a pretty established pattern to have a driver's ->shutdown hook redirect to its ->remove hook, so the same code is executed regardless of whether the driver is unbound from the device, or the system is just shutting down. As Florian puts it, it is quite a big hammer for bcmgenet to unregister its net_device during shutdown, but having a common code path with the driver unbind helps ensure it is well tested. So DSA, for better or for worse, has to live with that and engage in an arms race of implementing the ->shutdown hook too, from all individual drivers, and do something sane when paired with masters that unregister their net_device there. The only sane thing to do, of course, is to unlink from the master. However, complications arise really quickly. The pattern of redirecting ->shutdown to ->remove is not unique to bcmgenet or even to net_device drivers. In fact, SPI controllers do it too (see dspi_shutdown -> dspi_remove), and presumably, I2C controllers and MDIO controllers do it too (this is something I have not researched too deeply, but even if this is not the case today, it is certainly plausible to happen in the future, and must be taken into consideration). Since DSA switches might be SPI devices, I2C devices, MDIO devices, the insane implication is that for the exact same DSA switch device, we might have both ->shutdown and ->remove getting called. So we need to do something with that insane environment. The pattern I've come up with is "if this, then not that", so if either ->shutdown or ->remove gets called, we set the device's drvdata to NULL, and in the other hook, we check whether the drvdata is NULL and just do nothing. This is probably not necessary for platform devices, just for devices on buses, but I would really insist for consistency among drivers, because when code is copy-pasted, it is not always copy-pasted from the best sources. So depending on whether the DSA switch's ->remove or ->shutdown will get called first, we cannot really guarantee even for the same driver if rebooting will result in the same code path on all platforms. But nonetheless, we need to do something minimally reasonable on ->shutdown too to fix the bug. Of course, the ->remove will do more (a full teardown of the tree, with all data structures freed, and this is why the bug was not caught for so long). The new ->shutdown method is kept separate from dsa_unregister_switch not because we couldn't have unregistered the switch, but simply in the interest of doing something quick and to the point. The big question is: does the DSA switch's ->shutdown get called earlier than the DSA master's ->shutdown? If not, there is still a risk that we might still trigger the WARN_ON in unregister_netdevice that says we are attempting to unregister a net_device which has uppers. That's no good. Although the reference to the master net_device won't physically go away even if DSA's ->shutdown comes afterwards, remember we have a dev_hold on it. The answer to that question lies in this comment above device_link_add: * A side effect of the link creation is re-ordering of dpm_list and the * devices_kset list by moving the consumer device and all devices depending * on it to the ends of these lists (that does not happen to devices that have * not been registered when this function is called). so the fact that DSA uses device_link_add towards its master is not exactly for nothing. device_shutdown() walks devices_kset from the back, so this is our guarantee that DSA's shutdown happens before the master's shutdown. Fixes: 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings") Link: https://lore.kernel.org/netdev/20210909095324.12978-1-LinoSanfilippo@gmx.de/ Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3c9cfb52 |
|
17-Sep-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: update NXP copyright text NXP Legal insists that the following are not fine: - Saying "NXP Semiconductors" instead of "NXP", since the company's registered name is "NXP" - Putting a "(c)" sign in the copyright string - Putting a comma in the copyright string The only accepted copyright string format is "Copyright <year-range> NXP". This patch changes the copyright headers in the networking files that were sent by me, or derived from code sent by me. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
297c4de6 |
|
10-May-2021 |
Michael Walle <michael@walle.cc> |
net: dsa: felix: re-enable TAS guard band mode Commit 316bcffe4479 ("net: dsa: felix: disable always guard band bit for TAS config") disabled the guard band and broke 802.3Qbv compliance. There are two issues here: (1) Without the guard band the end of the scheduling window could be overrun by a frame in transit. (2) Frames that don't fit into a configured window will still be sent. The reason for both issues is that the switch will schedule the _start_ of a frame transmission inside the predefined window without taking the length of the frame into account. Thus, we'll need the guard band which will close the gate early, so that a complete frame can still be sent. Revert the commit and add a note. For a lengthy discussion see [1]. [1] https://lore.kernel.org/netdev/c7618025da6723418c56a54fe4683bd7@walle.cc/ Fixes: 316bcffe4479 ("net: dsa: felix: disable always guard band bit for TAS config") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
316bcffe |
|
19-Apr-2021 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: disable always guard band bit for TAS config ALWAYS_GUARD_BAND_SCH_Q bit in TAS config register is descripted as this: 0: Guard band is implemented for nonschedule queues to schedule queues transition. 1: Guard band is implemented for any queue to schedule queue transition. The driver set guard band be implemented for any queue to schedule queue transition before, which will make each GCL time slot reserve a guard band time that can pass the max SDU frame. Because guard band time could not be set in tc-taprio now, it will use about 12000ns to pass 1500B max SDU. This limits each GCL time interval to be more than 12000ns. This patch change the guard band to be only implemented for nonschedule queues to schedule queues transition, so that there is no need to reserve guard band on each GCL. Users can manually add guard band time for each schedule queues in their configuration if they want. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a180be79 |
|
28-Mar-2021 |
Guobin Huang <huangguobin4@huawei.com> |
net: mscc: ocelot: remove redundant dev_err call in vsc9959_mdio_bus_alloc() There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Guobin Huang <huangguobin4@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c8c0ba4f |
|
13-Feb-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: setup MMIO filtering rules for PTP when using tag_8021q Since the tag_8021q tagger is software-defined, it has no means by itself for retrieving hardware timestamps of PTP event messages. Because we do want to support PTP on ocelot even with tag_8021q, we need to use the CPU port module for that. The RX timestamp is present in the Extraction Frame Header. And because we can't use NPI mode which redirects the CPU queues to an "external CPU" (meaning the ARM CPU running Linux), then we need to poll the CPU port module through the MMIO registers to retrieve TX and RX timestamps. Sadly, on NXP LS1028A, the Felix switch was integrated into the SoC without wiring the extraction IRQ line to the ARM GIC. So, if we want to be notified of any PTP packets received on the CPU port module, we have a problem. There is a possible workaround, which is to use the Ethernet CPU port as a notification channel that packets are available on the CPU port module as well. When a PTP packet is received by the DSA tagger (without timestamp, of course), we go to the CPU extraction queues, poll for it there, then we drop the original Ethernet packet and masquerade the packet retrieved over MMIO (plus the timestamp) as the original when we inject it up the stack. Create a quirk in struct felix is selected by the Felix driver (but not by Seville, since that doesn't support PTP at all). We want to do this such that the workaround is minimally invasive for future switches that don't require this workaround. The only traffic for which we need timestamps is PTP traffic, so add a redirection rule to the CPU port module for this. Currently we only have the need for PTP over L2, so redirection rules for UDP ports 319 and 320 are TBD for now. Note that for the workaround of matching of PTP-over-Ethernet-port with PTP-over-MMIO queues to work properly, both channels need to be absolutely lossless. There are two parts to achieving that: - We keep flow control enabled on the tag_8021q CPU port - We put the DSA master interface in promiscuous mode, so it will never drop a PTP frame (for the profiles we are interested in, these are sent to the multicast MAC addresses of 01-80-c2-00-00-0e and 01-1b-19-00-00-00). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7c4bb540 |
|
13-Feb-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: tag_ocelot: create separate tagger for Seville The ocelot tagger is a hot mess currently, it relies on memory initialized by the attached driver for basic frame transmission. This is against all that DSA tagging protocols stand for, which is that the transmission and reception of a DSA-tagged frame, the data path, should be independent from the switch control path, because the tag protocol is in principle hot-pluggable and reusable across switches (even if in practice it wasn't until very recently). But if another driver like dsa_loop wants to make use of tag_ocelot, it couldn't. This was done to have common code between Felix and Ocelot, which have one bit difference in the frame header format. Quoting from commit 67c2404922c2 ("net: dsa: felix: create a template for the DSA tags on xmit"): Other alternatives have been analyzed, such as: - Create a separate tag_seville.c: too much code duplication for just 1 bit field difference. - Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like tag_brcm.c, which would have a separate .xmit function. Again, too much code duplication for just 1 bit field difference. - Allocate the template from the init function of the tag_ocelot.c module, instead of from the driver: couldn't figure out a method of accessing the correct port template corresponding to the correct tagger in the .xmit function. The really interesting part is that Seville should have had its own tagging protocol defined - it is not compatible on the wire with Ocelot, even for that single bit. In principle, a packet generated by DSA_TAG_PROTO_OCELOT when booted on NXP LS1028A would look in a certain way, but when booted on NXP T1040 it would look differently. The reverse is also true: a packet generated by a Seville switch would be interpreted incorrectly by Wireshark if it was told it was generated by an Ocelot switch. Actually things are a bit more nuanced. If we concentrate only on the DSA tag, what I said above is true, but Ocelot/Seville also support an optional DSA tag prefix, which can be short or long, and it is possible to distinguish the two taggers based on an integer constant put in that prefix. Nonetheless, creating a separate tagger is still justified, since the tag prefix is optional, and without it, there is again no way to distinguish. Claiming backwards binary compatibility is a bit more tough, since I've already changed the format of tag_ocelot once, in commit 5124197ce58b ("net: dsa: tag_ocelot: use a short prefix on both ingress and egress"). Therefore I am not very concerned with treating this as a bugfix and backporting it to stable kernels (which would be another mess due to the fact that there would be lots of conflicts with the other DSA_TAG_PROTO* definitions). It's just simpler to say that the string values of the taggers have ABI value starting with kernel 5.12, which will be when the changing of tag protocol via /sys/class/net/<dsa-master>/dsa/tagging goes live. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
40d3f295 |
|
13-Feb-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: use common tag parsing code with DSA The Injection Frame Header and Extraction Frame Header that the switch prepends to frames over the NPI port is also prepended to frames delivered over the CPU port module's queues. Let's unify the handling of the frame headers by making the ocelot driver call some helpers exported by the DSA tagger. Among other things, this allows us to get rid of the strange cpu_to_be32 when transmitting the Injection Frame Header on ocelot, since the packing API uses network byte order natively (when "quirks" is 0). The comments above ocelot_gen_ifh talk about setting pop_cnt to 3, and the cpu extraction queue mask to something, but the code doesn't do it, so we don't do it either. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
adb3dccf |
|
28-Jan-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: convert to the new .change_tag_protocol DSA API In expectation of the new tag_ocelot_8021q tagger implementation, we need to be able to do runtime switchover between one tagger and another. So we must structure the existing code for the current NPI-based tagger in a certain way. We move the felix_npi_port_init function in expectation of the future driver configuration necessary for tag_ocelot_8021q: we would like to not have the NPI-related bits interspersed with the tag_8021q bits. The conversion from this: ocelot_write_rix(ocelot, ANA_PGID_PGID_PGID(GENMASK(ocelot->num_phys_ports, 0)), ANA_PGID_PGID, PGID_UC); to this: cpu_flood = ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports)); ocelot_rmw_rix(ocelot, cpu_flood, cpu_flood, ANA_PGID_PGID, PGID_UC); is perhaps non-trivial, but is nonetheless non-functional. The PGID_UC (replicator for unknown unicast) is already configured out of hardware reset to flood to all ports except ocelot->num_phys_ports (the CPU port module). All we change is that we use a read-modify-write to only add the CPU port module to the unknown unicast replicator, as opposed to doing a full write to the register. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
70d39a6e |
|
14-Jan-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: export NUM_TC constant from felix to common switch lib We should be moving anything that isn't DSA-specific or SoC-specific out of the felix DSA driver, and into the common mscc_ocelot switch library. The number of traffic classes is one of the aspects that is common between all ocelot switches, so it belongs in the library. This patch also makes seville use 8 TX queues, and therefore enables prioritization via the QOS_CLASS field in the NPI injection header. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
703b7621 |
|
14-Jan-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: add ops for decoding watermark threshold and occupancy We'll need to read back the watermark thresholds and occupancy from hardware (for devlink-sb integration), not only to write them as we did so far in ocelot_port_set_maxlen. So introduce 2 new functions in struct ocelot_ops, similar to wm_enc, and implement them for the 3 supported mscc_ocelot switches. Remove the INUSE and MAXUSE unpacking helpers for the QSYS_RES_STAT register, because that doesn't scale with the number of switches that mscc_ocelot supports now. They have different bit widths for the watermarks, and we need function pointers to abstract that difference away. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
f6fe01d6 |
|
14-Jan-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: auto-detect packet buffer size and number of frame references Instead of reading these values from the reference manual and writing them down into the driver, it appears that the hardware gives us the option of detecting them dynamically. The number of frame references corresponds to what the reference manual notes, however it seems that the frame buffers are reported as slightly less than the books would indicate. On VSC9959 (Felix), the books say it should have 128KB of packet buffer, but the registers indicate only 129840 bytes (126.79 KB). Also, the unit of measurement for FREECNT from the documentation of all these devices is incorrect (taken from an older generation). This was confirmed by Younes Leroul from Microchip support. Not having anything better to do with these values at the moment* (this will change soon), let's just print them. *The frame buffer size is, in fact, used to calculate the tail dropping watermarks. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
537e2b88 |
|
09-Jan-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: the switch does not support DMA The code that sets the DMA mask to 64 bits is bogus, it is taken from the enetc driver together with the rest of the PCI probing boilerplate. Since this patch is touching the error path to delete err_dma, let's also change the err_alloc_felix label which was incorrect. The kzalloc failure does not need a kfree, but it doesn't hurt either, since kfree works with NULL pointers. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210109203415.2120142-1-olteanv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
edd2410b |
|
04-Dec-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: fix dropping of unknown IPv4 multicast on Seville The current assumption is that the felix DSA driver has flooding knobs per traffic class, while ocelot switchdev has a single flooding knob. This was correct for felix VSC9959 and ocelot VSC7514, but with the introduction of seville VSC9953, we see a switch driven by felix.c which has a single flooding knob. So it is clear that we must do what should have been done from the beginning, which is not to overwrite the configuration done by ocelot.c in felix, but instead to teach the common ocelot library about the differences in our switches, and set up the flooding PGIDs centrally. The effect that the bogus iteration through FELIX_NUM_TC has upon seville is quite dramatic. ANA_FLOODING is located at 0x00b548, and ANA_FLOODING_IPMC is located at 0x00b54c. So the bogus iteration will actually overwrite ANA_FLOODING_IPMC when attempting to write ANA_FLOODING[1]. There is no ANA_FLOODING[1] in sevile, just ANA_FLOODING. And when ANA_FLOODING_IPMC is overwritten with a bogus value, the effect is that ANA_FLOODING_IPMC gets the value of 0x0003CF7D: MC6_DATA = 61, MC6_CTRL = 61, MC4_DATA = 60, MC4_CTRL = 0. Because MC4_CTRL is zero, this means that IPv4 multicast control packets are not flooded, but dropped. An invalid configuration, and this is how the issue was actually spotted. Reported-by: Eldar Gasanov <eldargasanov2@gmail.com> Reported-by: Maxim Kochetkov <fido_max@inbox.ru> Tested-by: Eldar Gasanov <eldargasanov2@gmail.com> Fixes: 84705fc16552 ("net: dsa: felix: introduce support for Seville VSC9953 switch") Fixes: 3c7b51bd39b2 ("net: dsa: felix: allow flooding for all traffic classes") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201204175416.1445937-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
01326493 |
|
04-Oct-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: warn when encoding an out-of-bounds watermark value There is an upper bound to the value that a watermark may hold. That upper bound is not immediately obvious during configuration, and it might be possible to have accidental truncation. Actually this has happened already, add a warning to prevent it from happening again. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
75944fda |
|
02-Oct-2020 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1 VCAP IS1 is a VCAP module which can filter on the most common L2/L3/L4 Ethernet keys, and modify the results of the basic QoS classification and VLAN classification based on those flow keys. There are 3 VCAP IS1 lookups, mapped over chains 10000, 11000 and 12000. Currently the driver is hardcoded to use IS1_ACTION_TYPE_NORMAL half keys. Note that the VLAN_MANGLE has been omitted for now. In hardware, the VCAP_IS1_ACT_VID_REPLACE_ENA field replaces the classified VLAN (metadata associated with the frame) and not the VLAN from the header itself. There are currently some issues which need to be addressed when operating in standalone, or in bridge with vlan_filtering=0 modes, because in those cases the switch ports have VLAN awareness disabled, and changing the classified VLAN to anything other than the pvid causes the packets to be dropped. Another issue is that on egress, we expect port tagging to push the classified VLAN, but port tagging is disabled in the modes mentioned above, so although the classified VLAN is replaced, it is not visible in the packet transmitted by the switch. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
319e4dd1 |
|
02-Oct-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: introduce conversion helpers between port and netdev Since the mscc_ocelot_switch_lib is common between a pure switchdev and a DSA driver, the procedure of retrieving a net_device for a certain port index differs, as those are registered by their individual front-ends. Up to now that has been dealt with by always passing the port index to the switch library, but now, we're going to need to work with net_device pointers from the tc-flower offload, for things like indev, or mirred. It is not desirable to refactor that, so let's make sure that the flower offload core has the ability to translate between a net_device and a port index properly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d732e9ce |
|
29-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: remove unneeded VCAP parameters for IS2 Now that we are deriving these from the constants exposed by the hardware, we can delete the static info we're keeping in the driver. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
20968054 |
|
29-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: automatically detect VCAP constants The numbers in struct vcap_props are not intuitive to derive, because they are not a straightforward copy-and-paste from the reference manual but instead rely on a fairly detailed level of understanding of the layout of an entry in the TCAM and in the action RAM. For this reason, bugs are very easy to introduce here. Ease the work of hardware porters and read from hardware the constants that were exported for this particular purpose. Note that this implies that struct vcap_props can no longer be const. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e3aea296 |
|
29-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: add definitions for VCAP ES0 keys, actions and target As a preparation step for the offloading to ES0, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a61e365d |
|
29-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: add definitions for VCAP IS1 keys, actions and target As a preparation step for the offloading to IS1, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c1c3993e |
|
29-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: generalize existing code for VCAP In the Ocelot switches there are 3 TCAMs: VCAP ES0, IS1 and IS2, which have the same configuration interface, but different sets of keys and actions. The driver currently only supports VCAP IS2. In preparation of VCAP IS1 and ES0 support, the existing code must be generalized to work with any VCAP. In that direction, we should move the structures that depend upon VCAP instantiation, like vcap_is2_keys and vcap_is2_actions, out of struct ocelot and into struct vcap_props .keys and .actions, a structure that is replicated 3 times, once per VCAP. We'll pass that structure as an argument to each function that does the key and action packing - only the control logic needs to distinguish between ocelot->vcap[VCAP_IS2] or IS1 or ES0. Another change is to make use of the newly introduced ocelot_target_read and ocelot_target_write API, since the 3 VCAPs have the same registers but put at different addresses. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
460e985e |
|
29-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: fix incorrect action offsets for VCAP IS2 The port mask width was larger than the actual number of ports, and therefore, all fields following this one were also shifted by the number of excess bits. But the driver doesn't use the REW_OP, SMAC_REPLACE_ENA or ACL_ID bits from the action vector, so the bug was inconsequential. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5124197c |
|
26-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: tag_ocelot: use a short prefix on both ingress and egress There are 2 goals that we follow: - Reduce the header size - Make the header size equal between RX and TX The issue that required long prefix on RX was the fact that the ocelot DSA tag, being put before Ethernet as it is, would overlap with the area that a DSA master uses for RX filtering (destination MAC address mainly). Now that we can ask DSA to put the master in promiscuous mode, in theory we could remove the prefix altogether and call it a day, but it looks like we can't. Using no prefix on ingress, some packets (such as ICMP) would be received, while others (such as PTP) would not be received. This is because the DSA master we use (enetc) triggers parse errors ("MAC rx frame errors") presumably because it sees Ethernet frames with a bad length. And indeed, when using no prefix, the EtherType (bytes 12-13 of the frame, bits 96-111) falls over the REW_VAL field from the extraction header, aka the PTP timestamp. When turning the short (32-bit) prefix on, the EtherType overlaps with bits 64-79 of the extraction header, which are a reserved area transmitted as zero by the switch. The packets are not dropped by the DSA master with a short prefix. Actually, the frames look like this in tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 - added by a downstream sja1105). 89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \ dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \ ctrl 0x0004: Information, send seq 2, rcv seq 0, \ Flags [Response], length 78 0x0000: 8880 000a 001d 890c a9f2 0100 0000 100f ................ 0x0010: 0400 0000 0180 c200 000e 001f 7b63 0248 ............{c.H 0x0020: dadb 0482 88f7 1202 0036 0000 0000 0000 .........6...... 0x0030: 0000 0000 0000 0000 0000 001f 7bff fe63 ............{..c 0x0040: 0248 0001 1f81 0500 0000 0000 0000 0000 .H.............. 0x0050: 0000 0000 0000 0000 0000 0000 ............ So the short prefix is our new default: we've shortened our RX frames by 12 octets, increased TX by 4, and headers are now equal between RX and TX. Note that we still need promiscuous mode for the DSA master to not drop it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
dba1e466 |
|
23-Sep-2020 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: convert TAS link speed based on phylink speed state->speed holds a value of 10, 100, 1000 or 2500, but QSYS_TAG_CONFIG_LINK_SPEED expects a value of 0, 1, 2, 3. So convert the speed to a proper value. Fixes: de143c0e274b ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload") Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8b9e03cd |
|
21-Sep-2020 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT, L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from matching on src_port and dst_port for TCP and UDP packets. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d60bc62d |
|
18-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: seville: build as separate module Seville does not need to depend on PCI or on the ENETC MDIO controller. There will also be other compile-time differences in the future. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2ac7c6c5 |
|
18-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: move the PTP clock structure to felix_vsc9959.c Not only does Sevile not have a PTP clock, but with separate modules, this structure cannot even live in felix.c, due to the .owner = THIS_MODULE assignment causing this link time error: drivers/net/dsa/ocelot/felix.o:(.data+0x0): undefined reference to `__this_module' Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ccfdbab5 |
|
18-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: seville: duplicate vsc9959_mdio_bus_free While we don't plan on making any changes to this function, currently this is the only remaining dependency between felix and seville, after the PCS has been refactored out into pcs-lynx.c. Duplicate this function in seville to break the dependency completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f8320ec1 |
|
18-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: replace tabs with spaces Over the time, some patches have introduced structures aligned with spaces, near structures aligned with tabs. Fix the inconsistencies. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c129fc55 |
|
18-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: ocelot: document why reset procedure is different for felix/seville The overall idea (issue soft reset, enable memories, initialize memories, enable core) is the same, so it would make sense that an attempt is made to unify the procedures. It is not immediately obvious that the fields are not part of the same register targets, though. So add a comment. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
75cea9cb |
|
18-Sep-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: use ocelot_field_{read,write} helpers consistently Since these helpers for regmap fields are available, use them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
588d0550 |
|
30-Aug-2020 |
Ioana Ciornei <ioana.ciornei@nxp.com> |
net: dsa: ocelot: use the Lynx PCS helpers in Felix and Seville Use the helper functions introduced by the newly added Lynx PCS MDIO module in the Felix VSC9959 and Seville VSC9953. Instead of representing the PCS as a phy_device, a mdio_device structure will be passed to the Lynx module which is now actually implementing all the PCS configuration and status reporting. All code previously used for PCS monitoring and runtime configuration is removed and replaced will calls to the Lynx PCS operations. Tested on the following SERDES protocols of LS1028A: 0x7777 (2500Base-X), 0x85bb (QSGMII), 0x9999 (SGMII) and 0x13bb (USXGMII). Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
16659b81 |
|
19-Jul-2020 |
Michael Walle <michael@walle.cc> |
net: dsa: felix: (re)use already existing constants Now that there are USXGMII constants available, drop the old definitions and reuse the generic ones. Signed-off-by: Michael Walle <michael@walle.cc> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
84705fc1 |
|
13-Jul-2020 |
Maxim Kochetkov <fido_max@inbox.ru> |
net: dsa: felix: introduce support for Seville VSC9953 switch This is another switch from Vitesse / Microsemi / Microchip, that has 10 ports (8 external, 2 internal) and is integrated into the Freescale / NXP T1040 PowerPC SoC. It is very similar to Felix from NXP LS1028A, except that this is a platform device and Felix is a PCI device, and it doesn't support IEEE 1588 and TSN. Like Felix, this driver configures its own PCS on the internal MDIO bus using a phy_device abstraction for it (yes, it will be refactored to use a raw mdio_device, like other phylink drivers do, but let's keep it like that for now). But unlike Felix, the MDIO bus and the PCS are not from the same vendor. The PCS is the same QorIQ/Layerscape PCS as found in Felix/ENETC/DPAA*, but the internal MDIO bus that is used to access it is actually an instantiation of drivers/net/phy/mdio-mscc-miim.c. But it would be difficult to reuse that driver (it doesn't even use regmap, and it's less than 200 lines of code), so we hand-roll here some internal MDIO bus accessors within seville_vsc9953.c, which serves the purpose of driving the PCS absolutely fine. Also, same as Felix, the PCS doesn't support dynamic reconfiguration of SerDes protocol, so we need to do pre-validation of PHY mode from device tree and not let phylink change it. Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
375e1314 |
|
13-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: move probing to felix_vsc9959.c Felix is not actually meant to be a DSA driver only for the switch inside NXP LS1028A, but an umbrella for all Vitesse / Microsemi / Microchip switches that are register-compatible with Ocelot and that are using in DSA mode (with an NPI Ethernet port). For the dsa_switch_ops exported by the felix driver to be generic enough to be used by other non-PCI switches, we need to move the PCI-specific probing to the low-level translation module felix_vsc9959.c. This way, other switches can have their own probing functions, as platform devices or otherwise. This patch also removes the "Felix instance table", which did not stand the test of time and is unnecessary at this point. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
aa92d836 |
|
13-Jul-2020 |
Maxim Kochetkov <fido_max@inbox.ru> |
net: mscc: ocelot: extend watermark encoding function The ocelot_wm_encode function deals with setting thresholds for pause frame start and stop. In Ocelot and Felix the register layout is the same, but for Seville, it isn't. The easiest way to accommodate Seville hardware configuration is to introduce a function pointer for setting this up. Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
541132f0 |
|
13-Jul-2020 |
Maxim Kochetkov <fido_max@inbox.ru> |
net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield Seville has a different bitwise layout than Ocelot and Felix. Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
67c24049 |
|
13-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: create a template for the DSA tags on xmit With this patch we try to kill 2 birds with 1 stone. First of all, some switches that use tag_ocelot.c don't have the exact same bitfield layout for the DSA tags. The destination ports field is different for Seville VSC9953 for example. So the choices are to either duplicate tag_ocelot.c into a new tag_seville.c (sub-optimal) or somehow take into account a supposed ocelot->dest_ports_offset when packing this field into the DSA injection header (again not ideal). Secondly, tag_ocelot.c already needs to memset a 128-bit area to zero and call some packing() functions of dubious performance in the fastpath. And most of the values it needs to pack are pretty much constant (BYPASS=1, SRC_PORT=CPU, DEST=port index). So it would be good if we could improve that. The proposed solution is to allocate a memory area per port at probe time, initialize that with the statically defined bits as per chip hardware revision, and just perform a simpler memcpy in the fastpath. Other alternatives have been analyzed, such as: - Create a separate tag_seville.c: too much code duplication for just 1 bit field difference. - Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like tag_brcm.c, which would have a separate .xmit function. Again, too much code duplication for just 1 bit field difference. - Allocate the template from the init function of the tag_ocelot.c module, instead of from the driver: couldn't figure out a method of accessing the correct port template corresponding to the correct tagger in the .xmit function. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
886e1387 |
|
13-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields Currently Felix and Ocelot share the same bit layout in these per-port registers, but Seville does not. So we need reg_fields for that. Actually since these are per-port registers, we need to also specify the number of ports, and register size per port, and use the regmap API for multiple ports. There's a more subtle point to be made about the other 2 register fields: - QSYS_SWITCH_PORT_MODE_SCH_NEXT_CFG - QSYS_SWITCH_PORT_MODE_INGRESS_DROP_MODE which we are not writing any longer, for 2 reasons: - Using the previous API (ocelot_write_rix), we were only writing 1 for Felix and Ocelot, which was their hardware-default value, and which there wasn't any intention in changing. - In the case of SCH_NEXT_CFG, in fact Seville does not have this register field at all, and therefore, if we want to have common code we would be required to not write to it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2789658f |
|
13-Jul-2020 |
Maxim Kochetkov <fido_max@inbox.ru> |
soc: mscc: ocelot: add MII registers description Add the register definitions for the MSCC MIIM MDIO controller in preparation for seville_vsc9959.c to create its accessors for the internal MDIO bus. Since we've introduced elements to ocelot_regfields that are not instantiated by felix and ocelot, we need to define the size of the regfields arrays explicitly, otherwise ocelot_regfields_init, which iterates up to REGFIELD_MAX, will fault on the undefined regfield entries (if we're lucky). Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
91c724cf |
|
13-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: mscc: ocelot: convert port registers to regmap At the moment, there are some minimal register differences between VSC7514 Ocelot and VSC9959 Felix. To be precise, the PCS1G registers are missing from Felix because it was integrated with an NXP PCS. But with VSC9953 Seville (not yet introduced), the register differences are more pronounced. The MAC registers are located at different offsets within the DEV_GMII target. So we need to refactor the driver to keep a regmap even for per-port registers. The callers of the ocelot_port_readl and ocelot_port_writel were kept unchanged, only the implementation is now more generic. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7e14a2dc |
|
05-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: use resolved link config in mac_link_up() Phylink now requires that parameters established through auto-negotiation be written into the MAC at the time of the mac_link_up() callback. In the case of felix, that means taking the port out of reset, setting the correct timers for PAUSE frames, and enabling/disabling TX flow control. This patch also splits the inband and noinband configuration of the vsc9959 PCS (currently found in a function called "init") into 2 different functions, which have a nomenclature closer to phylink: "config", for inband setup, and "link_up", for noinband (forced) setup. This is necessary as a preparation step for giving up control of the PCS to phylink, which will be done in further patch series. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b4c23545 |
|
05-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: delete .phylink_mac_an_restart code Phylink uses the .mac_an_restart method to offer the user an implementation of the "ethtool -r" behavior, when the media-side auto negotiation can be restarted by the local MAC PCS. This is the case for fiber modes 1000Base-X and 2500Base-X (IEEE clause 37) that don't have an Ethernet PHY connected locally, and the media is connected to the MAC PCS directly. On the other hand, the Cisco SGMII and USXGMII standards also have an auto negotiation mechanism based on IEEE 802.3 clause 37 (their respective specs require a MAC PCS and a PHY PCS to implement the same state machine, which is described in IEEE 802.3 "Auto-Negotiation Figure 37-6"), so the ability to restart auto-negotiation is intrinsically symmetrical (the MAC PCS can do it too). However, it appears that not all SGMII/USXGMII PHYs have logic to restart the MDI-side auto-negotiation process when they detect a transition of the SGMII link from data mode to configuration mode. Some do (VSC8234) and some don't (AR8033, MV88E1111). IEEE and/or Cisco specification wordings to not help to prove whether propagating the "AN restart" event from MII side ("mr_restart_an") to MDI side ("mr_restart_negotiation") is required behavior - neither of them specifies any mandatory interaction between the clause 37 AN state machine from Figure 37-6 and the clause 28 AN state machine from Figure 28-18. Therefore, even if a certain behavior could be proven as being required, real-life SGMII/USXGMII PHYs are inconsistent enough that a clause 37 AN restart cannot be used by phylink to reliably trigger a media-side renegotiation, when the user requests it via ethtool. The only remaining use that the .mac_an_restart callback might possibly have, given what we know now, is to implement some silicon quirks, but so far that has proven to not be necessary. So remove this code for now, since it never gets called and we don't foresee any circumstance in which it might be, either. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b1c7b874 |
|
05-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: support half-duplex link modes Ping tested: [ 11.808455] mscc_felix 0000:00:00.5 swp0: Link is Up - 1Gbps/Full - flow control rx/tx [ 11.816497] IPv6: ADDRCONF(NETDEV_CHANGE): swp0: link becomes ready [root@LS1028ARDB ~] # ethtool -s swp0 advertise 0x4 [ 18.844591] mscc_felix 0000:00:00.5 swp0: Link is Down [ 22.048337] mscc_felix 0000:00:00.5 swp0: Link is Up - 100Mbps/Half - flow control off [root@LS1028ARDB ~] # ip addr add 192.168.1.1/24 dev swp0 [root@LS1028ARDB ~] # ping 192.168.1.2 PING 192.168.1.2 (192.168.1.2): 56 data bytes (...) ^C--- 192.168.1.2 ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.383/0.611/1.051 ms [root@LS1028ARDB ~] # ethtool -s swp0 advertise 0x10 [ 355.637747] mscc_felix 0000:00:00.5 swp0: Link is Down [ 358.788034] mscc_felix 0000:00:00.5 swp0: Link is Up - 1Gbps/Half - flow control off [root@LS1028ARDB ~] # ping 192.168.1.2 PING 192.168.1.2 (192.168.1.2): 56 data bytes (...) ^C --- 192.168.1.2 ping statistics --- 16 packets transmitted, 16 packets received, 0% packet loss round-trip min/avg/max = 0.301/0.384/1.138 ms Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3f2628d6 |
|
05-Jul-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: clarify the intention of writes to MII_BMCR The driver appears to write to BMCR_SPEED and BMCR_DUPLEX, fields which are read-only, since they are actually configured through the vendor-specific IF_MODE (0x14) register. But the reason we're writing back the read-only values of MII_BMCR is to alter these writable fields: BMCR_RESET BMCR_LOOPBACK BMCR_ANENABLE BMCR_PDOWN BMCR_ISOLATE BMCR_ANRESTART In particular, the only field which is really relevant to this driver is BMCR_ANENABLE. Clarify that intention by spelling it out, using phy_set_bits and phy_clear_bits. The driver also made a few writes to BMCR_RESET and BMCR_ANRESTART which are unnecessary and may temporarily disrupt the link to the PHY. Remove them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3ab4ceb6 |
|
20-Jun-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: make vcap is2 keys and actions static Get rid of some sparse warnings. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b4024c9e |
|
22-May-2020 |
Claudiu Manoil <claudiu.manoil@nxp.com> |
felix: Fix initialization of ioremap resources The caller of devm_ioremap_resource(), either accidentally or by wrong assumption, is writing back derived resource data to global static resource initialization tables that should have been constant. Meaning that after it computes the final physical start address it saves the address for no reason in the static tables. This doesn't affect the first driver probing after reboot, but it breaks consecutive driver reloads (i.e. driver unbind & bind) because the initialization tables no longer have the correct initial values. So the next probe() will map the device registers to wrong physical addresses, causing ARM SError async exceptions. This patch fixes all of the above. Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b014d043 |
|
14-May-2020 |
Colin Ian King <colin.king@canonical.com> |
net: dsa: felix: fix incorrect clamp calculation for burst Currently burst is clamping on rate and not burst, the assignment of burst from the clamping discards the previous assignment of burst. This looks like a cut-n-paste error from the previous clamping calculation on ramp. Fix this by replacing ramp with burst. Addresses-Coverity: ("Unused value") Fixes: 0fbabf875d18 ("net: dsa: felix: add support Credit Based Shaper(CBS) for hardware offload") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0fbabf87 |
|
12-May-2020 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: add support Credit Based Shaper(CBS) for hardware offload VSC9959 hardware support the Credit Based Shaper(CBS) which part of the IEEE-802.1Qav. This patch support sch_cbs set for VSC9959. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
de143c0e |
|
12-May-2020 |
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> |
net: dsa: felix: Configure Time-Aware Scheduler via taprio offload Ocelot VSC9959 switch supports time-based egress shaping in hardware according to IEEE 802.1Qbv. This patch add support for TAS configuration on egress port of VSC9959 switch. Felix driver is an instance of Ocelot family, with a DSA front-end. The patch uses tc taprio hardware offload to setup TAS set function on felix driver. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
21ce7f3e |
|
03-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: ocelot: the MAC table on Felix is twice as large When running 'bridge fdb dump' on Felix, sometimes learnt and static MAC addresses would appear, sometimes they wouldn't. Turns out, the MAC table has 4096 entries on VSC7514 (Ocelot) and 8192 entries on VSC9959 (Felix), so the existing code from the Ocelot common library only dumped half of Felix's MAC table. They are both organized as a 4-way set-associative TCAM, so we just need a single variable indicating the correct number of rows. Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
94aca082 |
|
19-Apr-2020 |
Yangbo Lu <yangbo.lu@nxp.com> |
net: mscc: ocelot: add wave programming registers definitions Add wave programming registers definitions for Ocelot platforms. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
07d985ee |
|
29-Feb-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: Wire up the ocelot cls_flower methods Export the cls_flower methods from the ocelot driver and hook them up to the DSA passthrough layer. Tables for the VCAP IS2 parameters, as well as half key packing (field offsets and lengths) need to be defined for the VSC9959 core, as they are different from Ocelot, mainly due to the different port count. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
28a134f5 |
|
24-Feb-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: Use PHY_INTERFACE_MODE_INTERNAL instead of GMII phy-mode = "gmii" is confusing because it may mean that the port supports the 8-bit-wide parallel data interface pinout, which it doesn't. It may also be confusing because one of the "gmii" internal ports is actually overclocked to run at 2.5Gbps (even though, yes, as far as the switch MAC is concerned, it still thinks it's gigabit). So use the phy-mode = "internal" property to describe the internal ports inside the NXP LS1028A chip (the ones facing the ENETC). The change should be fine, because the device tree bindings document is yet to be introduced, and there are no stable DT blobs in use. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Michael Walle <michael@walle.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8c6123e1 |
|
16-Jan-2020 |
Alex Marginean <alexandru.marginean@nxp.com> |
net: dsa: felix: Don't restart PCS SGMII AN if not needed Some PHYs like VSC8234 don't like it when AN restarts on their system side and they restart line side AN too, going into an endless link up/down loop. Don't restart PCS AN if link is up already. Although in theory this feedback loop should be possible with the other in-band AN modes too, for some reason it was not seen with the VSC8514 QSGMII and AQR412 USXGMII PHYs. So keep this logic only for SGMII where the problem was found. Fixes: bdeced75b13f ("net: dsa: felix: Add PCS operations for PHYLINK") Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
062a33b1 |
|
16-Jan-2020 |
Alex Marginean <alexandru.marginean@nxp.com> |
net: dsa: felix: Set USXGMII link based on BMSR, not LPA At least some PHYs (AQR412) don't advertise copper-side link status during system side AN. So remove this duplicate assignment to pcs->link and rely on the previous one for link state: the local indication from the MAC PCS. Fixes: bdeced75b13f ("net: dsa: felix: Add PCS operations for PHYLINK") Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bdeced75 |
|
05-Jan-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: felix: Add PCS operations for PHYLINK Layerscape SoCs traditionally expose the SerDes configuration/status for Ethernet protocols (PCS for SGMII/USXGMII/10GBase-R etc etc) in a register format that is compatible with clause 22 or clause 45 (depending on SerDes protocol). Each MAC has its own internal MDIO bus on which there is one or more of these PCS's, responding to commands at a configurable PHY address. The per-port internal MDIO bus (which is just for PCSs) is totally separate and has nothing to do with the dedicated external MDIO controller (which is just for PHYs), but the register map for the MDIO controller is the same. The VSC9959 (Felix) switch instantiated in the LS1028A is integrated in hardware with the ENETC PCS of its DSA master, and reuses its MDIO controller driver, so Felix has been made to depend on it in Kconfig. +------------------------------------------------------------------------+ | +--------+ GMII (typically disabled via RCW) | | ENETC PCI | ENETC |--------------------------+ | | Root Complex | port 3 |-----------------------+ | | | Integrated +--------+ | | | | Endpoint | | | | +--------+ 2.5G GMII | | | | | ENETC |--------------+ | | | | | port 2 |-----------+ | | | | | +--------+ | | | | | | +--------+ +--------+ | | | Felix | | Felix | | | | port 4 | | port 5 | | | +--------+ +--------+ | | | | +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ | | | ENETC | | ENETC | | Felix | | Felix | | Felix | | Felix | | | | port 0 | | port 1 | | port 0 | | port 1 | | port 2 | | port 3 | | +------------------------------------------------------------------------+ | |||| SerDes | |||| |||| |||| |||| | | +--------+block | +--------------------------------------------+ | | | ENETC | | | ENETC port 2 internal MDIO bus | | | | port 0 | | | PCS PCS PCS PCS | | | | PCS | | | 0 1 2 3 | | +-----------------|------------------------------------------------------+ v v v v v v SGMII/ RGMII QSGMII/QSXGMII/4xSGMII/4x1000Base-X/4x2500Base-X USXGMII/ (bypasses 1000Base-X/ SerDes) 2500Base-X In the LS1028A SoC described above, the VSC9959 Felix switch is PF5 of the ENETC root complex, and has 2 BARs: - BAR 4: the switch's effective registers - BAR 0: the MDIO controller register map lended from ENETC port 2 (PF2), for accessing its associated PCS's. This explanation is necessary because the patch does some renaming "pci_bar" -> "switch_pci_bar" for clarity, which would otherwise appear a bit obtuse. The fact that the internal MDIO bus is "borrowed" is relevant because the register map is found in PF5 (the switch) but it triggers an access fault if PF2 (the ENETC DSA master) is not enabled. This is not treated in any way (and I don't think it can be treated). All of this is so SoC-specific, that it was contained as much as possible in the platform-integration file felix_vsc9959.c. We need to parse and pre-validate the device tree because of 2 reasons: - The PHY mode (SerDes protocol) cannot change at runtime due to SoC design. - There is a circular dependency in that we need to know what clause the PCS speaks in order to find it on the internal MDIO bus. But the clause of the PCS depends on what phy-mode it is configured for. The goal of this patch is to make steps towards removing the bootloader dependency for SGMII PCS pre-configuration, as well as to add support for monitoring the in-band SGMII AN between the PCS and the system-side link partner (PHY or other MAC). In practice the bootloader dependency is not completely removed. U-Boot pre-programs the PHY address at which each PCS can be found on the internal MDIO bus (MDEV_PORT). This is needed because the PCS of each port has the same out-of-reset PHY address of zero. The SerDes register for changing MDEV_PORT is pretty deep in the SoC (outside the addresses of the ENETC PCI BARs) and therefore inaccessible to us from here. Felix VSC9959 and Ocelot VSC7514 are integrated very differently in their respective SoCs, and for that reason Felix does not use the Ocelot core library for PHYLINK. On one hand we don't want to impose the fixed phy-mode limitation to Ocelot, and on the other hand Felix doesn't need to force the MAC link speed the way Ocelot does, since the MAC is connected to the PCS through a fixed GMII, and the PCS is the one who does the rate adaptation at lower link speeds, which the MAC does not even need to know about. In fact changing the GMII speed for Felix irrecoverably breaks transmission through that port until a reset. The pair with ENETC port 3 and Felix port 5 is optional and doesn't support tagging. When we enable it, swp5 is a regular slave port, albeit an internal one. The trouble is that it doesn't work, and that is because the DSA PHYLIB adaptation layer doesn't treat fixed-link slave ports. So that is yet another reason for wanting to convert Felix to the native PHYLINK API. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5df66c48 |
|
20-Nov-2019 |
Yangbo Lu <yangbo.lu@nxp.com> |
net: dsa: ocelot: define PTP registers for felix_vsc9959 This patch is to define PTP registers for felix_vsc9959. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
56051948 |
|
14-Nov-2019 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: ocelot: add driver for Felix switch family This supports an Ethernet switching core from Vitesse / Microsemi / Microchip (VSC9959) which is part of the Ocelot family (a brand name), and whose code name is Felix. The switch can be (and is) integrated on different SoCs as a PCIe endpoint device. The functionality is provided by the core of the Ocelot switch driver (drivers/net/ethernet/mscc). In this regard, the current driver is an instance of Microsemi's Ocelot core driver, with a DSA front-end. It inherits its name from VSC9959's code name, to distinguish itself from the switchdev ocelot driver. The patch adds the logic for probing a PCI device and defines the register map for the VSC9959 switch core, since it has some differences in register addresses and bitfield mappings compared to the other Ocelot switches (VSC7511, VSC7512, VSC7513, VSC7514). The Felix driver declares the register map as part of the "instance table". Currently the VSC9959 inside NXP LS1028A is the only instance, but presumably it can support other switches in the Ocelot family, when used in DSA mode (Linux running on the external CPU, and not on the embedded MIPS). In a few cases, some h/w operations have to be done differently on VSC9959 due to missing bitfields. This is the case for the switch core reset and init. Because for this operation Ocelot uses some bits that are not present on Felix, the latter has to use a register from the global registers block (GCB) instead. Although it is a PCI driver, it relies on DT bindings for compatibility with DSA (CPU port link, PHY library). It does not have any custom device tree bindings, since we would like to minimize its dependency on device tree though. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|