#
751de201 |
|
09-Apr-2024 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: br_netfilter: skip conntrack input hook for promisc packets For historical reasons, when bridge device is in promisc mode, packets that are directed to the taps follow bridge input hook path. This patch adds a workaround to reset conntrack for these packets. Jianbo Liu reports warning splats in their test infrastructure where cloned packets reach the br_netfilter input hook to confirm the conntrack object. Scratch one bit from BR_INPUT_SKB_CB to annotate that this packet has reached the input hook because it is passed up to the bridge device to reach the taps. [ 57.571874] WARNING: CPU: 1 PID: 0 at net/bridge/br_netfilter_hooks.c:616 br_nf_local_in+0x157/0x180 [br_netfilter] [ 57.572749] Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_isc si ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5ctl mlx5_core [ 57.575158] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.8.0+ #19 [ 57.575700] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 57.576662] RIP: 0010:br_nf_local_in+0x157/0x180 [br_netfilter] [ 57.577195] Code: fe ff ff 41 bd 04 00 00 00 be 04 00 00 00 e9 4a ff ff ff be 04 00 00 00 48 89 ef e8 f3 a9 3c e1 66 83 ad b4 00 00 00 04 eb 91 <0f> 0b e9 f1 fe ff ff 0f 0b e9 df fe ff ff 48 89 df e8 b3 53 47 e1 [ 57.578722] RSP: 0018:ffff88885f845a08 EFLAGS: 00010202 [ 57.579207] RAX: 0000000000000002 RBX: ffff88812dfe8000 RCX: 0000000000000000 [ 57.579830] RDX: ffff88885f845a60 RSI: ffff8881022dc300 RDI: 0000000000000000 [ 57.580454] RBP: ffff88885f845a60 R08: 0000000000000001 R09: 0000000000000003 [ 57.581076] R10: 00000000ffff1300 R11: 0000000000000002 R12: 0000000000000000 [ 57.581695] R13: ffff8881047ffe00 R14: ffff888108dbee00 R15: ffff88814519b800 [ 57.582313] FS: 0000000000000000(0000) GS:ffff88885f840000(0000) knlGS:0000000000000000 [ 57.583040] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 57.583564] CR2: 000000c4206aa000 CR3: 0000000103847001 CR4: 0000000000370eb0 [ 57.584194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 57.584820] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 57.585440] Call Trace: [ 57.585721] <IRQ> [ 57.585976] ? __warn+0x7d/0x130 [ 57.586323] ? br_nf_local_in+0x157/0x180 [br_netfilter] [ 57.586811] ? report_bug+0xf1/0x1c0 [ 57.587177] ? handle_bug+0x3f/0x70 [ 57.587539] ? exc_invalid_op+0x13/0x60 [ 57.587929] ? asm_exc_invalid_op+0x16/0x20 [ 57.588336] ? br_nf_local_in+0x157/0x180 [br_netfilter] [ 57.588825] nf_hook_slow+0x3d/0xd0 [ 57.589188] ? br_handle_vlan+0x4b/0x110 [ 57.589579] br_pass_frame_up+0xfc/0x150 [ 57.589970] ? br_port_flags_change+0x40/0x40 [ 57.590396] br_handle_frame_finish+0x346/0x5e0 [ 57.590837] ? ipt_do_table+0x32e/0x430 [ 57.591221] ? br_handle_local_finish+0x20/0x20 [ 57.591656] br_nf_hook_thresh+0x4b/0xf0 [br_netfilter] [ 57.592286] ? br_handle_local_finish+0x20/0x20 [ 57.592802] br_nf_pre_routing_finish+0x178/0x480 [br_netfilter] [ 57.593348] ? br_handle_local_finish+0x20/0x20 [ 57.593782] ? nf_nat_ipv4_pre_routing+0x25/0x60 [nf_nat] [ 57.594279] br_nf_pre_routing+0x24c/0x550 [br_netfilter] [ 57.594780] ? br_nf_hook_thresh+0xf0/0xf0 [br_netfilter] [ 57.595280] br_handle_frame+0x1f3/0x3d0 [ 57.595676] ? br_handle_local_finish+0x20/0x20 [ 57.596118] ? br_handle_frame_finish+0x5e0/0x5e0 [ 57.596566] __netif_receive_skb_core+0x25b/0xfc0 [ 57.597017] ? __napi_build_skb+0x37/0x40 [ 57.597418] __netif_receive_skb_list_core+0xfb/0x220 Fixes: 62e7151ae3eb ("netfilter: bridge: confirm multicast packets before passing them up the stack") Reported-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
6d0259dd |
|
25-Oct-2023 |
Ido Schimmel <idosch@nvidia.com> |
bridge: mcast: Rename MDB entry get function The current name is going to conflict with the upcoming net device operation for the MDB get operation. Rename the function to br_mdb_entry_skb_get(). No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
44bdb313 |
|
18-Sep-2023 |
Eric Dumazet <edumazet@google.com> |
net: bridge: use DEV_STATS_INC() syzbot/KCSAN reported data-races in br_handle_frame_finish() [1] This function can run from multiple cpus without mutual exclusion. Adopt SMP safe DEV_STATS_INC() to update dev->stats fields. Handles updates to dev->stats.tx_dropped while we are at it. [1] BUG: KCSAN: data-race in br_handle_frame_finish / br_handle_frame_finish read-write to 0xffff8881374b2178 of 8 bytes by interrupt on cpu 1: br_handle_frame_finish+0xd4f/0xef0 net/bridge/br_input.c:189 br_nf_hook_thresh+0x1ed/0x220 br_nf_pre_routing_finish_ipv6+0x50f/0x540 NF_HOOK include/linux/netfilter.h:304 [inline] br_nf_pre_routing_ipv6+0x1e3/0x2a0 net/bridge/br_netfilter_ipv6.c:178 br_nf_pre_routing+0x526/0xba0 net/bridge/br_netfilter_hooks.c:508 nf_hook_entry_hookfn include/linux/netfilter.h:144 [inline] nf_hook_bridge_pre net/bridge/br_input.c:272 [inline] br_handle_frame+0x4c9/0x940 net/bridge/br_input.c:417 __netif_receive_skb_core+0xa8a/0x21e0 net/core/dev.c:5417 __netif_receive_skb_one_core net/core/dev.c:5521 [inline] __netif_receive_skb+0x57/0x1b0 net/core/dev.c:5637 process_backlog+0x21f/0x380 net/core/dev.c:5965 __napi_poll+0x60/0x3b0 net/core/dev.c:6527 napi_poll net/core/dev.c:6594 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6727 __do_softirq+0xc1/0x265 kernel/softirq.c:553 run_ksoftirqd+0x17/0x20 kernel/softirq.c:921 smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164 kthread+0x1d7/0x210 kernel/kthread.c:388 ret_from_fork+0x48/0x60 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304 read-write to 0xffff8881374b2178 of 8 bytes by interrupt on cpu 0: br_handle_frame_finish+0xd4f/0xef0 net/bridge/br_input.c:189 br_nf_hook_thresh+0x1ed/0x220 br_nf_pre_routing_finish_ipv6+0x50f/0x540 NF_HOOK include/linux/netfilter.h:304 [inline] br_nf_pre_routing_ipv6+0x1e3/0x2a0 net/bridge/br_netfilter_ipv6.c:178 br_nf_pre_routing+0x526/0xba0 net/bridge/br_netfilter_hooks.c:508 nf_hook_entry_hookfn include/linux/netfilter.h:144 [inline] nf_hook_bridge_pre net/bridge/br_input.c:272 [inline] br_handle_frame+0x4c9/0x940 net/bridge/br_input.c:417 __netif_receive_skb_core+0xa8a/0x21e0 net/core/dev.c:5417 __netif_receive_skb_one_core net/core/dev.c:5521 [inline] __netif_receive_skb+0x57/0x1b0 net/core/dev.c:5637 process_backlog+0x21f/0x380 net/core/dev.c:5965 __napi_poll+0x60/0x3b0 net/core/dev.c:6527 napi_poll net/core/dev.c:6594 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6727 __do_softirq+0xc1/0x265 kernel/softirq.c:553 do_softirq+0x5e/0x90 kernel/softirq.c:454 __local_bh_enable_ip+0x64/0x70 kernel/softirq.c:381 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline] _raw_spin_unlock_bh+0x36/0x40 kernel/locking/spinlock.c:210 spin_unlock_bh include/linux/spinlock.h:396 [inline] batadv_tt_local_purge+0x1a8/0x1f0 net/batman-adv/translation-table.c:1356 batadv_tt_purge+0x2b/0x630 net/batman-adv/translation-table.c:3560 process_one_work kernel/workqueue.c:2630 [inline] process_scheduled_works+0x5b8/0xa30 kernel/workqueue.c:2703 worker_thread+0x525/0x730 kernel/workqueue.c:2784 kthread+0x1d7/0x210 kernel/kthread.c:388 ret_from_fork+0x48/0x60 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304 value changed: 0x00000000000d7190 -> 0x00000000000d7191 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 14848 Comm: kworker/u4:11 Not tainted 6.6.0-rc1-syzkaller-00236-gad8a69f361b9 #0 Fixes: 1c29fc4989bc ("[BRIDGE]: keep track of received multicast packets") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: bridge@lists.linux-foundation.org Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230918091351.1356153-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
7b4858df |
|
29-May-2023 |
Ido Schimmel <idosch@nvidia.com> |
skbuff: bridge: Add layer 2 miss indication For EVPN non-DF (Designated Forwarder) filtering we need to be able to prevent decapsulated traffic from being flooded to a multi-homed host. Filtering of multicast and broadcast traffic can be achieved using the following flower filter: # tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop Unlike broadcast and multicast traffic, it is not currently possible to filter unknown unicast traffic. The classification into unknown unicast is performed by the bridge driver, but is not visible to other layers such as tc. Solve this by adding a new 'l2_miss' bit to the tc skb extension. Clear the bit whenever a packet enters the bridge (received from a bridge port or transmitted via the bridge) and set it if the packet did not match an FDB or MDB entry. If there is no skb extension and the bit needs to be cleared, then do not allocate one as no extension is equivalent to the bit being cleared. The bit is not set for broadcast packets as they never perform a lookup and therefore never incur a miss. A bit that is set for every flooded packet would also work for the current use case, but it does not allow us to differentiate between registered and unregistered multicast traffic, which might be useful in the future. To keep the performance impact to a minimum, the marking of packets is guarded by the 'tc_skb_ext_tc' static key. When 'false', the skb is not touched and an skb extension is not allocated. Instead, only a 5 bytes nop is executed, as demonstrated below for the call site in br_handle_frame(). Before the patch: ``` memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); c37b09: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12) c37b10: 00 00 p = br_port_get_rcu(skb->dev); c37b12: 49 8b 44 24 10 mov 0x10(%r12),%rax memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); c37b17: 49 c7 44 24 30 00 00 movq $0x0,0x30(%r12) c37b1e: 00 00 c37b20: 49 c7 44 24 38 00 00 movq $0x0,0x38(%r12) c37b27: 00 00 ``` After the patch (when static key is disabled): ``` memset(skb->cb, 0, sizeof(struct br_input_skb_cb)); c37c29: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12) c37c30: 00 00 c37c32: 49 8d 44 24 28 lea 0x28(%r12),%rax c37c37: 48 c7 40 08 00 00 00 movq $0x0,0x8(%rax) c37c3e: 00 c37c3f: 48 c7 40 10 00 00 00 movq $0x0,0x10(%rax) c37c46: 00 #ifdef CONFIG_HAVE_JUMP_LABEL_HACK static __always_inline bool arch_static_branch(struct static_key *key, bool branch) { asm_volatile_goto("1:" c37c47: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) br_tc_skb_miss_set(skb, false); p = br_port_get_rcu(skb->dev); c37c4c: 49 8b 44 24 10 mov 0x10(%r12),%rax ``` Subsequent patches will extend the flower classifier to be able to match on the new 'l2_miss' bit and enable / disable the static key when filters that match on it are added / deleted. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
e408336a |
|
19-Apr-2023 |
Ido Schimmel <idosch@nvidia.com> |
bridge: Pass VLAN ID to br_flood() Subsequent patches are going to add per-{Port, VLAN} neighbor suppression, which will require br_flood() to potentially suppress ARP / NS packets on a per-{Port, VLAN} basis. As a preparation, pass the VLAN ID of the packet as another argument to br_flood(). Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3e35f26d |
|
10-Nov-2022 |
Ido Schimmel <idosch@nvidia.com> |
bridge: Add missing parentheses No changes in generated code. Reported-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20221110085422.521059-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
a35ec8e3 |
|
01-Nov-2022 |
Hans J. Schultz <netdev@kapio-technology.com> |
bridge: Add MAC Authentication Bypass (MAB) support Hosts that support 802.1X authentication are able to authenticate themselves by exchanging EAPOL frames with an authenticator (Ethernet bridge, in this case) and an authentication server. Access to the network is only granted by the authenticator to successfully authenticated hosts. The above is implemented in the bridge using the "locked" bridge port option. When enabled, link-local frames (e.g., EAPOL) can be locally received by the bridge, but all other frames are dropped unless the host is authenticated. That is, unless the user space control plane installed an FDB entry according to which the source address of the frame is located behind the locked ingress port. The entry can be dynamic, in which case learning needs to be enabled so that the entry will be refreshed by incoming traffic. There are deployments in which not all the devices connected to the authenticator (the bridge) support 802.1X. Such devices can include printers and cameras. One option to support such deployments is to unlock the bridge ports connecting these devices, but a slightly more secure option is to use MAB. When MAB is enabled, the MAC address of the connected device is used as the user name and password for the authentication. For MAB to work, the user space control plane needs to be notified about MAC addresses that are trying to gain access so that they will be compared against an allow list. This can be implemented via the regular learning process with the sole difference that learned FDB entries are installed with a new "locked" flag indicating that the entry cannot be used to authenticate the device. The flag cannot be set by user space, but user space can clear the flag by replacing the entry, thereby authenticating the device. Locked FDB entries implement the following semantics with regards to roaming, aging and forwarding: 1. Roaming: Locked FDB entries can roam to unlocked (authorized) ports, in which case the "locked" flag is cleared. FDB entries cannot roam to locked ports regardless of MAB being enabled or not. Therefore, locked FDB entries are only created if an FDB entry with the given {MAC, VID} does not already exist. This behavior prevents unauthenticated devices from disrupting traffic destined to already authenticated devices. 2. Aging: Locked FDB entries age and refresh by incoming traffic like regular entries. 3. Forwarding: Locked FDB entries forward traffic like regular entries. If user space detects an unauthorized MAC behind a locked port and wishes to prevent traffic with this MAC DA from reaching the host, it can do so using tc or a different mechanism. Enable the above behavior using a new bridge port option called "mab". It can only be enabled on a bridge port that is both locked and has learning enabled. Locked FDB entries are flushed from the port once MAB is disabled. A new option is added because there are pure 802.1X deployments that are not interested in notifications about locked FDB entries. Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
fbb3abdf |
|
17-May-2022 |
Andrew Lunn <andrew@lunn.ch> |
net: bridge: Clear offload_fwd_mark when passing frame up bridge interface. It is possible to stack bridges on top of each other. Consider the following which makes use of an Ethernet switch: br1 / \ / \ / \ br0.11 wlan0 | br0 / | \ p1 p2 p3 br0 is offloaded to the switch. Above br0 is a vlan interface, for vlan 11. This vlan interface is then a slave of br1. br1 also has a wireless interface as a slave. This setup trunks wireless lan traffic over the copper network inside a VLAN. A frame received on p1 which is passed up to the bridge has the skb->offload_fwd_mark flag set to true, indicating that the switch has dealt with forwarding the frame out ports p2 and p3 as needed. This flag instructs the software bridge it does not need to pass the frame back down again. However, the flag is not getting reset when the frame is passed upwards. As a result br1 sees the flag, wrongly interprets it, and fails to forward the frame to wlan0. When passing a frame upwards, clear the flag. This is the Rx equivalent of br_switchdev_frame_unmark() in br_dev_xmit(). Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20220518005840.771575-1-andrew@lunn.ch Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
ec7328b5 |
|
16-Mar-2022 |
Tobias Waldekranz <tobias@waldekranz.com> |
net: bridge: mst: Multiple Spanning Tree (MST) mode Allow the user to switch from the current per-VLAN STP mode to an MST mode. Up to this point, per-VLAN STP states where always isolated from each other. This is in contrast to the MSTP standard (802.1Q-2018, Clause 13.5), where VLANs are grouped into MST instances (MSTIs), and the state is managed on a per-MSTI level, rather that at the per-VLAN level. Perhaps due to the prevalence of the standard, many switching ASICs are built after the same model. Therefore, add a corresponding MST mode to the bridge, which we can later add offloading support for in a straight-forward way. For now, all VLANs are fixed to MSTI 0, also called the Common Spanning Tree (CST). That is, all VLANs will follow the port-global state. Upcoming changes will make this actually useful by allowing VLANs to be mapped to arbitrary MSTIs and allow individual MSTI states to be changed. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
a21d9a67 |
|
23-Feb-2022 |
Hans Schultz <schultz.hans@gmail.com> |
net: bridge: Add support for bridge port in locked mode In a 802.1X scenario, clients connected to a bridge port shall not be allowed to have traffic forwarded until fully authenticated. A static fdb entry of the clients MAC address for the bridge port unlocks the client and allows bidirectional communication. This scenario is facilitated with setting the bridge port in locked mode, which is also supported by various switchcore chipsets. Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a37c5c26 |
|
23-Aug-2021 |
Kangmin Park <l4stpr0gr4m@gmail.com> |
net: bridge: change return type of br_handle_ingress_vlan_tunnel br_handle_ingress_vlan_tunnel() is only referenced in br_handle_frame(). If br_handle_ingress_vlan_tunnel() is called and return non-zero value, goto drop in br_handle_frame(). But, br_handle_ingress_vlan_tunnel() always return 0. So, the routines that check the return value and goto drop has no meaning. Therefore, change return type of br_handle_ingress_vlan_tunnel() to void and remove if statement of br_handle_frame(). Signed-off-by: Kangmin Park <l4stpr0gr4m@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20210823102118.17966-1-l4stpr0gr4m@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
f4b7002a |
|
19-Jul-2021 |
Nikolay Aleksandrov <nikolay@nvidia.com> |
net: bridge: add vlan mcast snooping knob Add a global knob that controls if vlan multicast snooping is enabled. The proper contexts (vlan or bridge-wide) will be chosen based on the knob when processing packets and changing bridge device state. Note that vlans have their individual mcast snooping enabled by default, but this knob is needed to turn on bridge vlan snooping. It is disabled by default. To enable the knob vlan filtering must also be enabled, it doesn't make sense to have vlan mcast snooping without vlan filtering since that would lead to inconsistencies. Disabling vlan filtering will also automatically disable vlan mcast snooping. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
adc47037 |
|
19-Jul-2021 |
Nikolay Aleksandrov <nikolay@nvidia.com> |
net: bridge: multicast: use multicast contexts instead of bridge or port Pass multicast context pointers to multicast functions instead of bridge/port. This would make it easier later to switch these contexts to their per-vlan versions. The patch is basically search and replace, no functional changes. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1a3065a2 |
|
13-May-2021 |
Linus Lüssing <linus.luessing@c0d3.blue> |
net: bridge: mcast: prepare is-router function for mcast router split In preparation for the upcoming split of multicast router state into their IPv4 and IPv6 variants make br_multicast_is_router() protocol family aware. Note that for now br_ip6_multicast_is_router() uses the currently still common ip4_mc_router_timer for now. It will be renamed to ip6_mc_router_timer later when the split is performed. While at it also renames the "1" and "2" constants in br_multicast_is_router() to the MDB_RTR_TYPE_TEMP_QUERY and MDB_RTR_TYPE_PERM enums. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ecd1c6a5 |
|
09-Mar-2021 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
net: bridge: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
efb5b338 |
|
07-Jan-2021 |
Menglong Dong <dong.menglong@zte.com.cn> |
net: bridge: fix misspellings using codespell tool Some typos are found out by codespell tool: $ codespell ./net/bridge/ ./net/bridge/br_stp.c:604: permanant ==> permanent ./net/bridge/br_stp.c:605: persistance ==> persistence ./net/bridge/br.c:125: underlaying ==> underlying ./net/bridge/br_input.c:43: modue ==> mode ./net/bridge/br_mrp.c:828: Determin ==> Determine ./net/bridge/br_mrp.c:848: Determin ==> Determine ./net/bridge/br_mrp.c:897: Determin ==> Determine Fix typos found by codespell. Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210108025332.52480-1-dong.menglong@zte.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
7609ecb2 |
|
19-Nov-2020 |
Heiner Kallweit <hkallweit1@gmail.com> |
net: bridge: switch to net core statistics counters handling Use netdev->tstats instead of a member of net_bridge for storing a pointer to the per-cpu counters. This allows us to use core functionality for statistics handling. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/9bad2be2-fd84-7c6e-912f-cee433787018@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
955062b0 |
|
28-Oct-2020 |
Nikolay Aleksandrov <nikolay@nvidia.com> |
net: bridge: mcast: add support for raw L2 multicast groups Extend the bridge multicast control and data path to configure routes for L2 (non-IP) multicast groups. The uapi struct br_mdb_entry union u is extended with another variant, mac_addr, which does not change the structure size, and which is valid when the proto field is zero. To be compatible with the forwarding code that is already in place, which acts as an IGMP/MLD snooping bridge with querier capabilities, we need to declare that for L2 MDB entries (for which there exists no such thing as IGMP/MLD snooping/querying), that there is always a querier. Otherwise, these entries would be flooded to all bridge ports and not just to those that are members of the L2 multicast group. Needless to say, only permanent L2 multicast groups can be installed on a bridge port. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20201028233831.610076-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
90c628dd |
|
27-Oct-2020 |
Henrik Bjoernlund <henrik.bjoernlund@microchip.com> |
net: bridge: extend the process of special frames This patch extends the processing of frames in the bridge. Currently MRP frames needs special processing and the current implementation doesn't allow a nice way to process different frame types. Therefore try to improve this by adding a list that contains frame types that need special processing. This list is iterated for each input frame and if there is a match based on frame type then these functions will be called and decide what to do with the frame. It can process the frame then the bridge doesn't need to do anything or don't process so then the bridge will do normal forwarding. Signed-off-by: Henrik Bjoernlund <henrik.bjoernlund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
9eb8eff0 |
|
10-May-2020 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: bridge: allow enslaving some DSA master network devices Commit 8db0a2ee2c63 ("net: bridge: reject DSA-enabled master netdevices as bridge members") added a special check in br_if.c in order to check for a DSA master network device with a tagging protocol configured. This was done because back then, such devices, once enslaved in a bridge would become inoperative and would not pass DSA tagged traffic anymore due to br_handle_frame returning RX_HANDLER_CONSUMED. But right now we have valid use cases which do require bridging of DSA masters. One such example is when the DSA master ports are DSA switch ports themselves (in a disjoint tree setup). This should be completely equivalent, functionally speaking, from having multiple DSA switches hanging off of the ports of a switchdev driver. So we should allow the enslaving of DSA tagged master network devices. Instead of the regular br_handle_frame(), install a new function br_handle_frame_dummy() on these DSA masters, which returns RX_HANDLER_PASS in order to call into the DSA specific tagging protocol handlers, and lift the restriction from br_add_if. Suggested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
65369933 |
|
26-Apr-2020 |
Horatiu Vultur <horatiu.vultur@microchip.com> |
bridge: mrp: Integrate MRP into the bridge To integrate MRP into the bridge, the bridge needs to do the following: - detect if the MRP frame was received on MRP ring port in that case it would be processed otherwise just forward it as usual. - enable parsing of MRP - before whenever the bridge was set up, it would set all the ports in forwarding state. Add an extra check to not set ports in forwarding state if the port is an MRP ring port. The reason of this change is that if the MRP instance initially sets the port in blocked state by setting the bridge up it would overwrite this setting. Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a580c76d |
|
24-Jan-2020 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: vlan: add per-vlan state The first per-vlan option added is state, it is needed for EVPN and for per-vlan STP. The state allows to control the forwarding on per-vlan basis. The vlan state is considered only if the port state is forwarding in order to avoid conflicts and be consistent. br_allowed_egress is called only when the state is forwarding, but the ingress case is a bit more complicated due to the fact that we may have the transition between port:BR_STATE_FORWARDING -> vlan:BR_STATE_LEARNING which should still allow the bridge to learn from the packet after vlan filtering and it will be dropped after that. Also to optimize the pvid state check we keep a copy in the vlan group to avoid one lookup. The state members are modified with *_ONCE() to annotate the lockless access. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5d1fcaf3 |
|
04-Nov-2019 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: fdb: eliminate extra port state tests from fast-path When commit df1c0b8468b3 ("[BRIDGE]: Packets leaking out of disabled/blocked ports.") introduced the port state tests in br_fdb_update() it was to avoid learning/refreshing from STP BPDUs, it was also used to avoid learning/refreshing from user-space with NTF_USE. Those two tests are done for every packet entering the bridge if it's learning, but for the fast-path we already have them checked in br_handle_frame() and is unnecessary to do it again. Thus push the checks to the unlikely cases and drop them from br_fdb_update(), the new nbp_state_should_learn() helper is used to determine if the port state allows br_fdb_update() to be called. The two places which need to do it manually are: - user-space add call with NTF_USE set - link-local packet learning done in __br_handle_local_finish() Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
be0c5677 |
|
01-Nov-2019 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: fdb: br_fdb_update can take flags directly If we modify br_fdb_update() to take flags directly we can get rid of one test and one atomic bitop in the learning path. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6869c3b0 |
|
29-Oct-2019 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: fdb: convert is_local to bitops The patch adds a new fdb flags field in the hole between the two cache lines and uses it to convert is_local to bitops. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0d9cb300 |
|
02-Jul-2019 |
Florian Westphal <fw@strlen.de> |
netfilter: nf_queue: remove unused hook entries pointer Its not used anywhere, so remove this. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
3d26eb8a |
|
02-Jul-2019 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: don't cache ether dest pointer on input We would cache ether dst pointer on input in br_handle_frame_finish but after the neigh suppress code that could lead to a stale pointer since both ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to always reload it after the suppress code so there's no point in having it cached just retrieve it directly. Fixes: 057658cb33fbf ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports") Fixes: ed842faeb2bd ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2874c5fd |
|
27-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
3b2e2904 |
|
11-Apr-2019 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: fix per-port af_packet sockets When the commit below was introduced it changed two visible things: - the skb was no longer passed through the protocol handlers with the original device - the skb was passed up the stack with skb->dev = bridge The first change broke af_packet sockets on bridge ports. For example we use them for hostapd which listens for ETH_P_PAE packets on the ports. We discussed two possible fixes: - create a clone and pass it through NF_HOOK(), act on the original skb based on the result - somehow signal to the caller from the okfn() that it was called, meaning the skb is ok to be passed, which this patch is trying to implement via returning 1 from the bridge link-local okfn() Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and drop/error would return < 0 thus the okfn() is called only when the return was 1, so we signal to the caller that it was called by preserving the return value from nf_hook(). Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
dc2f4189 |
|
12-Apr-2019 |
Stephen Rothwell <sfr@canb.auug.org.au> |
bridge: only include nf_queue.h if needed After merging the netfilter-next tree, today's linux-next build (powerpc ppc44x_defconfig) failed like this: In file included from net/bridge/br_input.c:19: include/net/netfilter/nf_queue.h:16:23: error: field 'state' has incomplete type struct nf_hook_state state; ^~~~~ Fixes: 971502d77faa ("bridge: netfilter: unroll NF_HOOK helper in bridge input path") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
223fd0ad |
|
11-Apr-2019 |
Florian Westphal <fw@strlen.de> |
bridge: broute: make broute a real ebtables table This makes broute a normal ebtables table, hooking at PREROUTING. The broute hook is removed. It uses skb->cb to signal to bridge rx handler that the skb should be routed instead of being bridged. This change is backwards compatible with ebtables as no userspace visible parts are changed. This means we can also remove the !ops test in ebt_register_table, it was only there for broute table sake. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
971502d7 |
|
11-Apr-2019 |
Florian Westphal <fw@strlen.de> |
bridge: netfilter: unroll NF_HOOK helper in bridge input path Replace NF_HOOK() based invocation of the netfilter hooks with a private copy of nf_hook_slow(). This copy has one difference: it can return the rx handler value expected by the stack, i.e. RX_HANDLER_CONSUMED or RX_HANDLER_PASS. This is needed by the next patch to invoke the ebtables "broute" table via the standard netfilter hooks rather than the custom "br_should_route_hook" indirection that is used now. When the skb is to be "brouted", we must return RX_HANDLER_PASS from the bridge rx input handler, but there is no way to indicate this via NF_HOOK(), unless perhaps by some hack such as exposing bridge_cb in the netfilter core or a percpu flag. text data bss dec filename 3369 56 0 3425 net/bridge/br_input.o.before 3458 40 0 3498 net/bridge/br_input.o.after This allows removal of the "br_should_route_hook" in the next patch. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f12064d1 |
|
11-Apr-2019 |
Florian Westphal <fw@strlen.de> |
bridge: reduce size of input cb to 16 bytes Reduce size of br_input_skb_cb from 24 to 16 bytes by using bitfield for those values that can only be 0 or 1. igmp is the igmp type value, so it needs to be at least u8. Furthermore, the bridge currently relies on step-by-step initialization of br_input_skb_cb fields as the skb passes through the stack. Explicitly zero out the bridge input cb instead, this avoids having to review/validate that no BR_INPUT_SKB_CB(skb)->foo test can see a 'random' value from previous protocol cb. AFAICS all current fields are always set up before they are read again, so this is not a bug fix. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
70e4272b |
|
23-Nov-2018 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: add no_linklocal_learn bool option Use the new boolopt API to add an option which disables learning from link-local packets. The default is kept as before and learning is enabled. This is a simple map from a boolopt bit to a bridge private flag that is tested before learning. v2: pass NULL for extack via sysfs Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c69c2cd4 |
|
26-Sep-2018 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: convert neigh_suppress_enabled option to a bit Convert the neigh_suppress_enabled option to a bit. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7d850abd |
|
24-May-2018 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: add support for port isolation This patch adds support for a new port flag - BR_ISOLATED. If it is set then isolated ports cannot communicate between each other, but they can still communicate with non-isolated ports. The same can be achieved via ACLs but they can't scale with large number of ports and also the complexity of the rules grows. This feature can be used to achieve isolated vlan functionality (similar to pvlan) as well, though currently it will be port-wide (for all vlans on the port). The new test in should_deliver uses data that is already cache hot and the new boolean is used to avoid an additional source port test in should_deliver. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ff0fd34e |
|
09-Nov-2017 |
Andrew Lunn <andrew@lunn.ch> |
net: bridge: Rename mglist to host_joined The boolean mglist indicates the host has joined a particular multicast group on the bridge interface. It is badly named, obscuring what is means. Rename it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ed842fae |
|
06-Oct-2017 |
Roopa Prabhu <roopa@cumulusnetworks.com> |
bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports This patch avoids flooding and proxies ndisc packets for BR_NEIGH_SUPPRESS ports. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
057658cb |
|
06-Oct-2017 |
Roopa Prabhu <roopa@cumulusnetworks.com> |
bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports This patch avoids flooding and proxies arp packets for BR_NEIGH_SUPPRESS ports. Moves existing br_do_proxy_arp to br_do_proxy_suppress_arp to support both proxy arp and neigh suppress. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5af48b59 |
|
27-Sep-2017 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: add per-port group_fwd_mask with less restrictions We need to be able to transparently forward most link-local frames via tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a mask which restricts the forwarding of STP and LACP, but we need to be able to forward these over tunnels and control that forwarding on a per-port basis thus add a new per-port group_fwd_mask option which only disallows mac pause frames to be forwarded (they're always dropped anyway). The patch does not change the current default situation - all of the others are still restricted unless configured for forwarding. We have successfully tested this patch with LACP and STP forwarding over VxLAN and qinq tunnels. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
31a4562d |
|
13-Jul-2017 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: fix dest lookup when vlan proto doesn't match With 802.1ad support the vlan_ingress code started checking for vlan protocol mismatch which causes the current tag to be inserted and the bridge vlan protocol & pvid to be set. The vlan tag insertion changes the skb mac_header and thus the lookup mac dest pointer which was loaded prior to calling br_allowed_ingress in br_handle_frame_finish is VLAN_HLEN bytes off now, pointing to the last two bytes of the destination mac and the first four of the source mac causing lookups to always fail and broadcasting all such packets to all ports. Same thing happens for locally originated packets when passing via br_dev_xmit. So load the dest pointer after the vlan checks and possible skb change. Fixes: 8580e2117c06 ("bridge: Prepare for 802.1ad vlan filtering support") Reported-by: Anitha Narasimha Murthy <anitha@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a13b2082 |
|
13-Mar-2017 |
Florian Westphal <fw@strlen.de> |
bridge: drop netfilter fake rtable unconditionally Andreas reports kernel oops during rmmod of the br_netfilter module. Hannes debugged the oops down to a NULL rt6info->rt6i_indev. Problem is that br_netfilter has the nasty concept of adding a fake rtable to skb->dst; this happens in a br_netfilter prerouting hook. A second hook (in bridge LOCAL_IN) is supposed to remove these again before the skb is handed up the stack. However, on module unload hooks get unregistered which means an skb could traverse the prerouting hook that attaches the fake_rtable, while the 'fake rtable remove' hook gets removed from the hooklist immediately after. Fixes: 34666d467cbf1e2e3c7 ("netfilter: bridge: move br_netfilter out of the core") Reported-by: Andreas Karis <akaris@redhat.com> Debugged-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bfd0aeac |
|
13-Feb-2017 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
bridge: fdb: converge fdb searching functions into one Before this patch we had 3 different fdb searching functions which was confusing. This patch reduces all of them to one - fdb_find_rcu(), and two flavors: br_fdb_find() which requires hash_lock and br_fdb_find_rcu which requires RCU. This makes it clear what needs to be used, we also remove two abusers of __br_fdb_get which called it under hash_lock and replace them with br_fdb_find(). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ca6d4480 |
|
07-Feb-2017 |
stephen hemminger <stephen@networkplumber.org> |
bridge: avoid unnecessary read of jiffies Jiffies is volatile so read it once. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
83a718d6 |
|
04-Feb-2017 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
bridge: fdb: write to used and updated at most once per jiffy Writing once per jiffy is enough to limit the bridge's false sharing. After this change the bridge doesn't show up in the local load HitM stats. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
11538d03 |
|
31-Jan-2017 |
Roopa Prabhu <roopa@cumulusnetworks.com> |
bridge: vlan dst_metadata hooks in ingress and egress paths - ingress hook: - if port is a tunnel port, use tunnel info in attached dst_metadata to map it to a local vlan - egress hook: - if port is a tunnel port, use tunnel info attached to vlan to set dst_metadata on the skb CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8addd5e7 |
|
31-Aug-2016 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: change unicast boolean to exact pkt_type Remove the unicast flag and introduce an exact pkt_type. That would help us for the upcoming per-port multicast flood flag and also slightly reduce the tests in the input fast path. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
85a3d4a9 |
|
30-Aug-2016 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: don't increment tx_dropped in br_do_proxy_arp pskb_may_pull may fail due to various reasons (e.g. alloc failure), but the skb isn't changed/dropped and processing continues so we shouldn't increment tx_dropped. CC: Kyeyoon Park <kyeyoonp@codeaurora.org> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6bc506b4 |
|
25-Aug-2016 |
Ido Schimmel <idosch@mellanox.com> |
bridge: switchdev: Add forward mark support for stacked devices switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of port netdevs so that packets being flooded by the device won't be flooded twice. It works by assigning a unique identifier (the ifindex of the first bridge port) to bridge ports sharing the same parent ID. This prevents packets from being flooded twice by the same switch, but will flood packets through bridge ports belonging to a different switch. This method is problematic when stacked devices are taken into account, such as VLANs. In such cases, a physical port netdev can have upper devices being members in two different bridges, thus requiring two different 'offload_fwd_mark's to be configured on the port netdev, which is impossible. The main problem is that packet and netdev marking is performed at the physical netdev level, whereas flooding occurs between bridge ports, which are not necessarily port netdevs. Instead, packet and netdev marking should really be done in the bridge driver with the switch driver only telling it which packets it already forwarded. The bridge driver will mark such packets using the mark assigned to the ingress bridge port and will prevent the packet from being forwarded through any bridge port sharing the same mark (i.e. having the same parent ID). Remove the current switchdev 'offload_fwd_mark' implementation and instead implement the proposed method. In addition, make rocker - the sole user of the mark - use the proposed method. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
baedbe55 |
|
22-Jul-2016 |
Ido Schimmel <idosch@mellanox.com> |
bridge: Fix incorrect re-injection of LLDP packets Commit 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") caused LLDP packets arriving through a bridge port to be re-injected to the Rx path with skb->dev set to the bridge device, but this breaks the lldpad daemon. The lldpad daemon opens a packet socket with protocol set to ETH_P_LLDP for any valid device on the system, which doesn't not include soft devices such as bridge and VLAN. Since packet sockets (ptype_base) are processed in the Rx path after the Rx handler, LLDP packets with skb->dev set to the bridge device never reach the lldpad daemon. Fix this by making the bridge's Rx handler re-inject LLDP packets with RX_HANDLER_PASS, which effectively restores the behaviour prior to the mentioned commit. This means netfilter will never receive LLDP packets coming through a bridge port, as I don't see a way in which we can have okfn() consume the packet without breaking existing behaviour. I've already carried out a similar fix for STP packets in commit 56fae404fb2c ("bridge: Fix incorrect re-injection of STP packets"). Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Cc: Florian Westphal <fw@strlen.de> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
37b090e6 |
|
13-Jul-2016 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: remove _deliver functions and consolidate forward code Before this patch we had two flavors of most forwarding functions - _forward and _deliver, the difference being that the latter are used when the packets are locally originated. Instead of all this function pointer passing and code duplication, we can just pass a boolean noting that the packet was locally originated and use that to perform the necessary checks in __br_forward. This gives a minor performance improvement but more importantly consolidates the forwarding paths. Also add a kernel doc comment to explain the exported br_forward()'s arguments. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b35c5f63 |
|
13-Jul-2016 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: drop skb2/skb0 variables and use a local_rcv boolean Currently if the packet is going to be received locally we set skb0 or sometimes called skb2 variables to the original skb. This can get confusing and also we can avoid one conditional on the fast path by simply using a boolean and passing it around. Thanks to Roopa for the name suggestion. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e151aab9 |
|
13-Jul-2016 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: rearrange flood vs unicast receive paths This patch removes one conditional from the unicast path by using the fact that skb is NULL only when the packet is multicast or is local. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
46c0772d |
|
13-Jul-2016 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: minor style adjustments in br_handle_frame_finish Trivial style changes in br_handle_frame_finish. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a65056ec |
|
06-Jul-2016 |
Nikolay Aleksandrov <razor@blackwall.org> |
net: bridge: extend MLD/IGMP query stats As was suggested this patch adds support for the different versions of MLD and IGMP query types. Since the user visible structure is still in net-next we can augment it instead of adding netlink attributes. The distinction between the different IGMP/MLD query types is done as suggested in Section 7.1, RFC 3376 [1] and Section 8.1, RFC 3810 [2] based on query payload size and code for IGMP. Since all IGMP packets go through multicast_rcv() and it uses ip_mc_check_igmp/ipv6_mc_check_mld we can be sure that at least the ip/ipv6 header can be directly used. [1] https://tools.ietf.org/html/rfc3376#section-7 [2] https://tools.ietf.org/html/rfc3810#section-8.1 Suggested-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1080ab95 |
|
28-Jun-2016 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
net: bridge: add support for IGMP/MLD stats and export them via netlink This patch adds stats support for the currently used IGMP/MLD types by the bridge. The stats are per-port (plus one stat per-bridge) and per-direction (RX/TX). The stats are exported via netlink via the new linkxstats API (RTM_GETSTATS). In order to minimize the performance impact, a new option is used to enable/disable the stats - multicast_stats_enabled, similar to the recent vlan stats. Also in order to avoid multiple IGMP/MLD type lookups and checks, we make use of the current "igmp" member of the bridge private skb->cb region to record the type on Rx (both host-generated and external packets pass by multicast_rcv()). We can do that since the igmp member was used as a boolean and all the valid IGMP/MLD types are positive values. The normal bridge fast-path is not affected at all, the only affected paths are the flooding ones and since we make use of the IGMP/MLD type, we can quickly determine if the packet should be counted using cache-hot data (cb's igmp member). We add counters for: * IGMP Queries * IGMP Leaves * IGMP v1/v2/v3 reports * MLD Queries * MLD Leaves * MLD v1/v2 reports These are invaluable when monitoring or debugging complex multicast setups with bridges. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
56fae404 |
|
06-Jun-2016 |
Ido Schimmel <idosch@mellanox.com> |
bridge: Fix incorrect re-injection of STP packets Commit 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") fixed incorrect usage of NF_HOOK's return value by consuming packets in okfn via br_pass_frame_up(). However, this function re-injects packets to the Rx path with skb->dev set to the bridge device, which breaks kernel's STP, as all STP packets appear to originate from the bridge device itself. Instead, if STP is enabled and bridge isn't a 802.1ad bridge, then learn packet's SMAC and inject it back to the Rx path for further processing by the packet handlers. The patch also makes netfilter's behavior consistent with regards to packets destined to the Bridge Group Address, as no hook registered at LOCAL_IN will ever be called, regardless if STP is enabled or not. Cc: Florian Westphal <fw@strlen.de> Cc: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8626c56c |
|
12-Mar-2016 |
Florian Westphal <fw@strlen.de> |
bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict Zefir Kurtisi reported kernel panic with an openwrt specific patch. However, it turns out that mainline has a similar bug waiting to happen. Once NF_HOOK() returns the skb is in undefined state and must not be used. Moreover, the okfn must consume the skb to support async processing (NF_QUEUE). Current okfn in this spot doesn't consume it and caller assumes that NF_HOOK return value tells us if skb was freed or not, but thats wrong. It "works" because no in-tree user registers a NFPROTO_BRIDGE hook at LOCAL_IN that returns STOLEN or NF_QUEUE verdicts. Once we add NF_QUEUE support for nftables bridge this will break -- NF_QUEUE holds the skb for async processing, caller will erronoulsy return RX_HANDLER_PASS and on reinject netfilter will access free'd skb. Fix this by pushing skb up the stack in the okfn instead. NB: It also seems dubious to use LOCAL_IN while bypassing PRE_ROUTING completely in this case but this is how its been forever so it seems preferable to not change this. Cc: Felix Fietkau <nbd@openwrt.org> Cc: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
907b1e6e |
|
12-Oct-2015 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
bridge: vlan: use proper rcu for the vlgrp member The bridge and port's vlgrp member is already used in RCU way, currently we rely on the fact that it cannot disappear while the port exists but that is error-prone and we might miss places with improper locking (either RCU or RTNL must be held to walk the vlan_list). So make it official and use RCU for vlgrp to catch offenders. Introduce proper vlgrp accessors and use them consistently throughout the code. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
77751ee8 |
|
30-Sep-2015 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
bridge: vlan: move pvid inside net_bridge_vlan_group One obvious way to converge more code (which was also used by the previous vlan code) is to move pvid inside net_bridge_vlan_group. This allows us to simplify some and remove other port-specific functions. Also gives us the ability to simply pass the vlan group and use all of the contained information. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2594e906 |
|
25-Sep-2015 |
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> |
bridge: vlan: add per-vlan struct and move to rhashtables This patch changes the bridge vlan implementation to use rhashtables instead of bitmaps. The main motivation behind this change is that we need extensible per-vlan structures (both per-port and global) so more advanced features can be introduced and the vlan support can be extended. I've tried to break this up but the moment net_port_vlans is changed and the whole API goes away, thus this is a larger patch. A few short goals of this patch are: - Extensible per-vlan structs stored in rhashtables and a sorted list - Keep user-visible behaviour (compressed vlans etc) - Keep fastpath ingress/egress logic the same (optimizations to come later) Here's a brief list of some of the new features we'd like to introduce: - per-vlan counters - vlan ingress/egress mapping - per-vlan igmp configuration - vlan priorities - avoid fdb entries replication (e.g. local fdb scaling issues) The structure is kept single for both global and per-port entries so to avoid code duplication where possible and also because we'll soon introduce "port0 / aka bridge as port" which should simplify things further (thanks to Vlad for the suggestion!). Now we have per-vlan global rhashtable (bridge-wide) and per-vlan port rhashtable, if an entry is added to a port it'll get a pointer to its global context so it can be quickly accessed later. There's also a sorted vlan list which is used for stable walks and some user-visible behaviour such as the vlan ranges, also for error paths. VLANs are stored in a "vlan group" which currently contains the rhashtable, sorted vlan list and the number of "real" vlan entries. A good side-effect of this change is that it resembles how hw keeps per-vlan data. One important note after this change is that if a VLAN is being looked up in the bridge's rhashtable for filtering purposes (or to check if it's an existing usable entry, not just a global context) then the new helper br_vlan_should_use() needs to be used if the vlan is found. In case the lookup is done only with a port's vlan group, then this check can be skipped. Things tested so far: - basic vlan ingress/egress - pvids - untagged vlans - undef CONFIG_BRIDGE_VLAN_FILTERING - adding/deleting vlans in different scenarios (with/without global ctx, while transmitting traffic, in ranges etc) - loading/removing the module while having/adding/deleting vlans - extracting bridge vlan information (user ABI), compressed requests - adding/deleting fdbs on vlans - bridge mac change, promisc mode - default pvid change - kmemleak ON during the whole time Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0c4b51f0 |
|
15-Sep-2015 |
Eric W. Biederman <ebiederm@xmission.com> |
netfilter: Pass net into okfn This is immediately motivated by the bridge code that chains functions that call into netfilter. Without passing net into the okfns the bridge code would need to guess about the best expression for the network namespace to process packets in. As net is frequently one of the first things computed in continuation functions after netfilter has done it's job passing in the desired network namespace is in many cases a code simplification. To support this change the function dst_output_okfn is introduced to simplify passing dst_output as an okfn. For the moment dst_output_okfn just silently drops the struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
29a26a56 |
|
15-Sep-2015 |
Eric W. Biederman <ebiederm@xmission.com> |
netfilter: Pass struct net into the netfilter hooks Pass a network namespace parameter into the netfilter hooks. At the call site of the netfilter hooks the path a packet is taking through the network stack is well known which allows the network namespace to be easily and reliabily. This allows the replacement of magic code like "dev_net(state->in?:state->out)" that appears at the start of most netfilter hooks with "state->net". In almost all cases the network namespace passed in is derived from the first network device passed in, guaranteeing those paths will not see any changes in practice. The exceptions are: xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm) ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp) ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp) ipv4/raw.c:raw_send_hdrinc() sock_net(sk) ipv6/ip6_output.c:ip6_xmit() sock_net(sk) ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev) ipv6/raw.c:raw6_send_hdrinc() sock_net(sk) br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev In all cases these exceptions seem to be a better expression for the network namespace the packet is being processed in then the historic "dev_net(in?in:out)". I am documenting them in case something odd pops up and someone starts trying to track down what happened. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
04eb4489 |
|
15-Sep-2015 |
Eric W. Biederman <ebiederm@xmission.com> |
bridge: Add br_netif_receive_skb remove netif_receive_skb_sk netif_receive_skb_sk is only called once in the bridge code, replace it with a bridge specific function that calls netif_receive_skb. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7026b1dd |
|
05-Apr-2015 |
David Miller <davem@davemloft.net> |
netfilter: Pass socket pointer down through okfn(). On the output paths in particular, we have to sometimes deal with two socket contexts. First, and usually skb->sk, is the local socket that generated the frame. And second, is potentially the socket used to control a tunneling socket, such as one the encapsulates using UDP. We do not want to disassociate skb->sk when encapsulating in order to fix this, because that would break socket memory accounting. The most extreme case where this can cause huge problems is an AF_PACKET socket transmitting over a vxlan device. We hit code paths doing checks that assume they are dealing with an ipv4 socket, but are actually operating upon the AF_PACKET one. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
842a9ae0 |
|
03-Mar-2015 |
Jouni Malinen <jouni@codeaurora.org> |
bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi This extends the design in commit 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP") with optional set of rules that are needed to meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The previously added BR_PROXYARP behavior is left as-is and a new BR_PROXYARP_WIFI alternative is added so that this behavior can be configured from user space when required. In addition, this enables proxyarp functionality for unicast ARP requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible to use unicast as well as broadcast for these frames. The key differences in functionality: BR_PROXYARP: - uses the flag on the bridge port on which the request frame was received to determine whether to reply - block bridge port flooding completely on ports that enable proxy ARP BR_PROXYARP_WIFI: - uses the flag on the bridge port to which the target device of the request belongs - block bridge port flooding selectively based on whether the proxyarp functionality replied Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d92cfdbb |
|
13-Jan-2015 |
Arnd Bergmann <arnd@arndb.de> |
bridge: only provide proxy ARP when CONFIG_INET is enabled When IPV4 support is disabled, we cannot call arp_send from the bridge code, which would result in a kernel link error: net/built-in.o: In function `br_handle_frame_finish': :(.text+0x59914): undefined reference to `arp_send' :(.text+0x59a50): undefined reference to `arp_tbl' This makes the newly added proxy ARP support in the bridge code depend on the CONFIG_INET symbol and lets the compiler optimize the code out to avoid the link error. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP") Cc: Kyeyoon Park <kyeyoonp@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
95850116 |
|
23-Oct-2014 |
Kyeyoon Park <kyeyoonp@codeaurora.org> |
bridge: Add support for IEEE 802.11 Proxy ARP This feature is defined in IEEE Std 802.11-2012, 10.23.13. It allows the AP devices to keep track of the hardware-address-to-IP-address mapping of the mobile devices within the WLAN network. The AP will learn this mapping via observing DHCP, ARP, and NS/NA frames. When a request for such information is made (i.e. ARP request, Neighbor Solicitation), the AP will respond on behalf of the associated mobile device. In the process of doing so, the AP will drop the multicast request frame that was intended to go out to the wireless medium. It was recommended at the LKS workshop to do this implementation in the bridge layer. vxlan.c is already doing something very similar. The DHCP snooping code will be added to the userspace application (hostapd) per the recommendation. This RFC commit is only for IPv4. A similar approach in the bridge layer will be taken for IPv6 as well. Signed-off-by: Kyeyoon Park <kyeyoonp@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
34666d46 |
|
18-Sep-2014 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: bridge: move br_netfilter out of the core Jesper reported that br_netfilter always registers the hooks since this is part of the bridge core. This harms performance for people that don't need this. This patch modularizes br_netfilter so it can be rmmod'ed, thus, the hooks can be unregistered. I think the bridge netfilter should have been a separated module since the beginning, Patrick agreed on that. Note that this is breaking compatibility for users that expect that bridge netfilter is going to be available after explicitly 'modprobe bridge' or via automatic load through brctl. However, the damage can be easily undone by modprobing br_netfilter. The bridge core also spots a message to provide a clue to people that didn't notice that this has been deprecated. On top of that, the plan is that nftables will not rely on this software layer, but integrate the connection tracking into the bridge layer to enable stateful filtering and NAT, which is was bridge netfilter users seem to require. This patch still keeps the fake_dst_ops in the bridge core, since this is required by when the bridge port is initialized. So we can safely modprobe/rmmod br_netfilter anytime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Florian Westphal <fw@strlen.de>
|
#
f2808d22 |
|
10-Jun-2014 |
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> |
bridge: Prepare for forwarding another bridge group addresses If a bridge is an 802.1ad bridge, it must forward another bridge group addresses (the Nearest Customer Bridge group addresses). (For details, see IEEE 802.1Q-2011 8.6.3.) As user might not want group_fwd_mask to be modified by enabling 802.1ad, introduce a new mask, group_fwd_mask_required, which indicates addresses the bridge wants to forward. This will be set by enabling 802.1ad. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e0d7968a |
|
26-May-2014 |
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> |
bridge: Prevent insertion of FDB entry with disallowed vlan br_handle_local_finish() is allowing us to insert an FDB entry with disallowed vlan. For example, when port 1 and 2 are communicating in vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can interfere with their communication by spoofed src mac address with vlan id 10. Note: Even if it is judged that a frame should not be learned, it should not be dropped because it is destined for not forwarding layer but higher layer. See IEEE 802.1Q-2011 8.13.10. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
eb707618 |
|
09-Apr-2014 |
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> |
bridge: Fix double free and memory leak around br_allowed_ingress br_allowed_ingress() has two problems. 1. If br_allowed_ingress() is called by br_handle_frame_finish() and vlan_untag() in br_allowed_ingress() fails, skb will be freed by both vlan_untag() and br_handle_frame_finish(). 2. If br_allowed_ingress() is called by br_dev_xmit() and br_allowed_ingress() fails, the skb will not be freed. Fix these two problems by freeing the skb in br_allowed_ingress() if it fails. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fc92f745 |
|
27-Mar-2014 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: Fix crash with vlan filtering and tcpdump When the vlan filtering is enabled on the bridge, but the filter is not configured on the bridge device itself, running tcpdump on the bridge device will result in a an Oops with NULL pointer dereference. The reason is that br_pass_frame_up() will bypass the vlan check because promisc flag is set. It will then try to get the table pointer and process the packet based on the table. Since the table pointer is NULL, we oops. Catch this special condition in br_handle_vlan(). Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a5642ab4 |
|
07-Feb-2014 |
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> |
bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr br_fdb_changeaddr() assumes that there is at most one local entry per port per vlan. It used to be true, but since commit 36fd2b63e3b4 ("bridge: allow creating/deleting fdb entries via netlink"), it has not been so. Therefore, the function might fail to search a correct previous address to be deleted and delete an arbitrary local entry if user has added local entries manually. Example of problematic case: ip link set eth0 address ee:ff:12:34:56:78 brctl addif br0 eth0 bridge fdb add 12:34:56:78:90:ab dev eth0 master ip link set eth0 address aa:bb:cc:dd:ee:ff Then, the address 12:34:56:78:90:ab might be deleted instead of ee:ff:12:34:56:78, the original mac address of eth0. Address this issue by introducing a new flag, added_by_user, to struct net_bridge_fdb_entry. Note that br_fdb_delete_by_port() has to set added_by_user to 0 in cases like: ip link set eth0 address 12:34:56:78:90:ab ip link set eth1 address aa:bb:cc:dd:ee:ff brctl addif br0 eth0 bridge fdb add aa:bb:cc:dd:ee:ff dev eth0 master brctl addif br0 eth1 brctl delif br0 eth0 In this case, kernel should delete the user-added entry aa:bb:cc:dd:ee:ff, but it also should have been added by "brctl addif br0 eth1" originally, so we don't delete it and treat it a new kernel-created entry. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8f84985f |
|
03-Jan-2014 |
Li RongQing <roy.qing.li@gmail.com> |
net: unify the pcpu_tstats and br_cpu_netstats as one They are same, so unify them as one, pcpu_sw_netstats. Define pcpu_sw_netstat in netdevice.h, remove pcpu_tstats from if_tunnel and remove br_cpu_netstats from br_private.h Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
06499098 |
|
28-Oct-2013 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: pass correct vlan id to multicast code Currently multicast code attempts to extrace the vlan id from the skb even when vlan filtering is disabled. This can lead to mdb entries being created with the wrong vlan id. Pass the already extracted vlan id to the multicast filtering code to make the correct id is used in creation as well as lookup. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
cc0fdd80 |
|
30-Aug-2013 |
Linus Lüssing <linus.luessing@c0d3.blue> |
bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones Currently we would still potentially suffer multicast packet loss if there is just either an IGMP or an MLD querier: For the former case, we would possibly drop IPv6 multicast packets, for the latter IPv4 ones. This is because we are currently assuming that if either an IGMP or MLD querier is present that the other one is present, too. This patch makes the behaviour and fix added in "bridge: disable snooping if there is no querier" (b00589af3b04) to also work if there is either just an IGMP or an MLD querier on the link: It refines the deactivation of the snooping to be protocol specific by using separate timers for the snooped IGMP and MLD queries as well as separate timers for our internal IGMP and MLD queriers. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b00589af |
|
31-Jul-2013 |
Linus Lüssing <linus.luessing@c0d3.blue> |
bridge: disable snooping if there is no querier If there is no querier on a link then we won't get periodic reports and therefore won't be able to learn about multicast listeners behind ports, potentially leading to lost multicast packets, especially for multicast listeners that joined before the creation of the bridge. These lost multicast packets can appear since c5c23260594 ("bridge: Add multicast_querier toggle and disable queries by default") in particular. With this patch we are flooding multicast packets if our querier is disabled and if we didn't detect any other querier. A grace period of the Maximum Response Delay of the querier is added to give multicast responses enough time to arrive and to be learned from before disabling the flooding behaviour again. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
867a5943 |
|
05-Jun-2013 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: Add a flag to control unicast packet flood. Add a flag to control flood of unicast traffic. By default, flood is on and the bridge will flood unicast traffic if it doesn't know the destination. When the flag is turned off, unicast traffic without an FDB will not be forwarded to the specified port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9ba18891 |
|
05-Jun-2013 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: Add flag to control mac learning. Allow user to control whether mac learning is enabled on the port. By default, mac learning is enabled. Disabling mac learning will cause new dynamic FDB entries to not be created for a particular port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fbca58a2 |
|
06-Mar-2013 |
Cong Wang <amwang@redhat.com> |
bridge: add missing vid to br_mdb_get() Obviously, vid should be considered when searching for multicast group. Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Vlad Yasevich <vyasevich@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2ba071ec |
|
12-Feb-2013 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: Add vlan to unicast fdb entries This patch adds vlan to unicast fdb entries that are created for learned addresses (not the manually configured ones). It adds vlan id into the hash mix and uses vlan as an addditional parameter for an entry match. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
78851988 |
|
12-Feb-2013 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: Implement vlan ingress/egress policy with PVID. At ingress, any untagged traffic is assigned to the PVID. Any tagged traffic is filtered according to membership bitmap. At egress, if the vlan matches the PVID, the frame is sent untagged. Otherwise the frame is sent tagged. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
85f46c6b |
|
12-Feb-2013 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: Verify that a vlan is allowed to egress on given port When bridge forwards a frame, make sure that a frame is allowed to egress on that port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a37b85c9 |
|
12-Feb-2013 |
Vlad Yasevich <vyasevic@redhat.com> |
bridge: Validate that vlan is permitted on ingress When a frame arrives on a port or transmitted by the bridge, if we have VLANs configured, validate that a given VLAN is allowed to enter the bridge. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
46acc460 |
|
01-Nov-2012 |
Ben Hutchings <bhutchings@solarflare.com> |
eth: Make is_link_local() consistent with other address tests Function name should include '_ether_addr'. Return type should be bool. Parameter name should be 'addr' not 'dest' (also matching kernel-doc). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b3343a2a |
|
17-Sep-2012 |
John Fastabend <john.r.fastabend@intel.com> |
net, ixgbe: handle link local multicast addresses in SR-IOV mode In SR-IOV mode the PF driver acts as the uplink port and is used to send control packets e.g. lldpad, stp, etc. eth0.1 eth0.2 eth0 VF VF PF | | | <-- stand-in for uplink | | | -------------------------- | Embedded Switch | -------------------------- | MAC <-- uplink But the embedded switch is setup to forward multicast addresses to all interfaces both VFs and PF and onto the physical link. This results in reserved MAC addresses used by control protocols to be forwarded over the switch onto the VF. In the LLDP case the PF sends an LLDPDU and it is currently being forwarded to all the VFs who then see the PF as a peer. This is incorrect. This patch adds the multicast addresses to the RAR table in the hardware to prevent this behavior. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
#
9a7b6ef9 |
|
08-May-2012 |
Joe Perches <joe@perches.com> |
bridge: Convert compare_ether_addr to ether_addr_equal Use the new bool function ether_addr_equal to add some clarity and reduce the likelihood for misuse of compare_ether_addr for sorting. Done via cocci script: $ cat compare_ether_addr.cocci @@ expression a,b; @@ - !compare_ether_addr(a, b) + ether_addr_equal(a, b) @@ expression a,b; @@ - compare_ether_addr(a, b) + !ether_addr_equal(a, b) @@ expression a,b; @@ - !ether_addr_equal(a, b) == 0 + ether_addr_equal(a, b) @@ expression a,b; @@ - !ether_addr_equal(a, b) != 0 + !ether_addr_equal(a, b) @@ expression a,b; @@ - ether_addr_equal(a, b) == 0 + !ether_addr_equal(a, b) @@ expression a,b; @@ - ether_addr_equal(a, b) != 0 + ether_addr_equal(a, b) @@ expression a,b; @@ - !!ether_addr_equal(a, b) + ether_addr_equal(a, b) Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bc3b2d7f |
|
15-Jul-2011 |
Paul Gortmaker <paul.gortmaker@windriver.com> |
net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules These files are non modular, but need to export symbols using the macros now living in export.h -- call out the include so that things won't break when we remove the implicit presence of module.h from everywhere. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
#
515853cc |
|
03-Oct-2011 |
stephen hemminger <shemminger@vyatta.com> |
bridge: allow forwarding some link local frames This is based on an earlier patch by Nick Carter with comments by David Lamparter but with some refinements. Thanks for their patience this is a confusing area with overlap of standards, user requirements, and compatibility with earlier releases. It adds a new sysfs attribute /sys/class/net/brX/bridge/group_fwd_mask that controls forwarding of frames with address of: 01-80-C2-00-00-0X The default setting has no forwarding to retain compatibility. One change from earlier releases is that forwarding of group addresses is not dependent on STP being enabled or disabled. This choice was made based on interpretation of tie 802.1 standards. I expect complaints will arise because of this, but better to follow the standard than continue acting incorrectly by default. The filtering mask is writeable, but only values that don't forward known control frames are allowed. It intentionally blocks attempts to filter control protocols. For example: writing a 8 allows forwarding 802.1X PAE addresses which is the most common request. Reported-by: David Lamparter <equinox@diac24.net> Original-patch-by: Nick Carter <ncarter100@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Benjamin Poirier <benjamin.poirier@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
44661462 |
|
05-Jul-2011 |
Herbert Xu <herbert@gondor.apana.org.au> |
bridge: Always flood broadcast packets As is_multicast_ether_addr returns true on broadcast packets as well, we need to explicitly exclude broadcast packets so that they're always flooded. This wasn't an issue before as broadcast packets were considered to be an unregistered multicast group, which were always flooded. However, as we now only flood such packets to router ports, this is no longer acceptable. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f01cb5fb |
|
21-Apr-2011 |
David S. Miller <davem@davemloft.net> |
Revert "bridge: Forward reserved group addresses if !STP" This reverts commit 1e253c3b8a1aeed51eef6fc366812f219b97de65. It breaks 802.3ad bonding inside of a bridge. The commit was meant to support transport bridging, and specifically virtual machines bridged to an ethernet interface connected to a switch port wiht 802.1x enabled. But this isn't the way to do it, it breaks too many other things. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7cd8861a |
|
04-Apr-2011 |
stephen hemminger <shemminger@vyatta.com> |
bridge: track last used time in forwarding table Adds tracking the last used time in forwarding table. Rename ageing_timer to updated to better describe it. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8a4eb573 |
|
11-Mar-2011 |
Jiri Pirko <jpirko@redhat.com> |
net: introduce rx_handler results and logic around that This patch allows rx_handlers to better signalize what to do next to it's caller. That makes skb->deliver_no_wcard no longer needed. kernel-doc for rx_handler_result is taken from Nicolas' patch. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8a870178 |
|
12-Feb-2011 |
Herbert Xu <herbert@gondor.apana.org.au> |
bridge: Replace mp->mglist hlist with a bool As it turns out we never need to walk through the list of multicast groups subscribed by the bridge interface itself (the only time we'd want to do that is when we shut down the bridge, in which case we simply walk through all multicast groups), we don't really need to keep an hlist for mp->mglist. This means that we can replace it with just a single bit to indicate whether the bridge interface is subscribed to a group. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a386f990 |
|
14-Nov-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
bridge: add proper RCU annotation to should_route_hook Add br_should_route_hook_t typedef, this is the only way we can get a clean RCU implementation for function pointer. Move route_hook to location where it is used. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1e253c3b |
|
18-Oct-2010 |
Benjamin Poirier <benjamin.poirier@polymtl.ca> |
bridge: Forward reserved group addresses if !STP Make all frames sent to reserved group MAC addresses (01:80:c2:00:00:00 to 01:80:c2:00:00:0f) be forwarded if STP is disabled. This enables forwarding EAPOL frames, among other things. Signed-off-by: Benjamin Poirier <benjamin.poirier@polymtl.ca> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c2368e79 |
|
22-Aug-2010 |
Simon Horman <horms@verge.net.au> |
bridge: is PACKET_LOOPBACK unlikely()? While looking at using netdev_rx_handler_register for openvswitch Jesse Gross suggested that an unlikely() might be worthwhile in that code. I'm interested to see if its appropriate for the bridge code. Cc: Jesse Gross <jesse@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
eeaf61d8 |
|
27-Jul-2010 |
stephen hemminger <shemminger@vyatta.com> |
bridge: add rcu_read_lock on transmit Long ago, when bridge was converted to RCU, rcu lock was equivalent to having preempt disabled. RCU has changed a lot since then and bridge code was still assuming the since transmit was called with bottom half disabled, it was RCU safe. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
406818ff |
|
23-Jun-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
bridge: 64bit rx/tx counters Use u64_stats_sync infrastructure to provide 64bit rx/tx counters even on 32bit hosts. It is safe to use a single u64_stats_sync for rx and tx, because BH is disabled on both, and we use per_cpu data. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f350a0a8 |
|
15-Jun-2010 |
Jiri Pirko <jpirko@redhat.com> |
bridge: use rx_handler_data pointer to store net_bridge_port pointer Register net_bridge_port pointer as rx_handler data pointer. As br_port is removed from struct net_device, another netdev priv_flag is added to indicate the device serves as a bridge port. Also rcuized pointers are now correctly dereferenced in br_fdb.c and in netfilter parts. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ab95bfe0 |
|
01-Jun-2010 |
Jiri Pirko <jpirko@redhat.com> |
net: replace hooks in __netif_receive_skb V5 What this patch does is it removes two receive frame hooks (for bridge and for macvlan) from __netif_receive_skb. These are replaced them with a single hook for both. It only supports one hook per device because it makes no sense to do bridging and macvlan on the same device. Then a network driver (of virtual netdev like macvlan or bridge) can register an rx_handler for needed net device. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5a0e3ad6 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
#
713aefa3 |
|
22-Mar-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: bridge: use NFPROTO values for NF_HOOK invocation The first argument to NF_HOOK* is an nfproto since quite some time. Commit v2.6.27-2457-gfdc9314 was the first to practically start using the new names. Do that now for the remaining NF_HOOK calls. The semantic patch used was: // <smpl> @@ @@ (NF_HOOK |NF_HOOK_THRESH )( -PF_BRIDGE, +NFPROTO_BRIDGE, ...) @@ @@ NF_HOOK( -PF_INET6, +NFPROTO_IPV6, ...) @@ @@ NF_HOOK( -PF_INET, +NFPROTO_IPV4, ...) // </smpl> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
14bb4789 |
|
02-Mar-2010 |
stephen hemminger <shemminger@vyatta.com> |
bridge: per-cpu packet statistics (v3) The shared packet statistics are a potential source of slow down on bridged traffic. Convert to per-cpu array, but only keep those statistics which change per-packet. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
32dec5dd |
|
15-Mar-2010 |
YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> |
bridge br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only without IGMP snooping. Without CONFIG_BRIDGE_IGMP_SNOOPING, BR_INPUT_SKB_CB(skb)->mrouters_only is not appropriately initialized, so we can see garbage. A clear option to fix this is to set it even without that config, but we cannot optimize out the branch. Let's introduce a macro that returns value of mrouters_only and let it return 0 without CONFIG_BRIDGE_IGMP_SNOOPING. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7f7708f0 |
|
16-Mar-2010 |
Michael Braun <michael-dev@fami-braun.de> |
bridge: Fix br_forward crash in promiscuous mode From: Michael Braun <michael-dev@fami-braun.de> bridge: Fix br_forward crash in promiscuous mode It's a linux-next kernel from 2010-03-12 on an x86 system and it OOPs in the bridge module in br_pass_frame_up (called by br_handle_frame_finish) because brdev cannot be dereferenced (its set to a non-null value). Adding some BUG_ON statements revealed that BR_INPUT_SKB_CB(skb)->brdev == br-dev (as set in br_handle_frame_finish first) only holds until br_forward is called. The next call to br_pass_frame_up then fails. Digging deeper it seems that br_forward either frees the skb or passes it to NF_HOOK which will in turn take care of freeing the skb. The same is holds for br_pass_frame_ip. So it seems as if two independent skb allocations are required. As far as I can see, commit b33084be192ee1e347d98bb5c9e38a53d98d35e2 ("bridge: Avoid unnecessary clone on forward path") removed skb duplication and so likely causes this crash. This crash does not happen on 2.6.33. I've therefore modified br_forward the same way br_flood has been modified so that the skb is not freed if skb0 is going to be used and I can confirm that the attached patch resolves the issue for me. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c4fcb78c |
|
27-Feb-2010 |
Herbert Xu <herbert@gondor.apana.org.au> |
bridge: Add multicast data-path hooks This patch finally hooks up the multicast snooping module to the data path. In particular, all multicast packets passing through the bridge are fed into the module and switched by it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b33084be |
|
27-Feb-2010 |
Herbert Xu <herbert@gondor.apana.org.au> |
bridge: Avoid unnecessary clone on forward path When the packet is delivered to the local bridge device we may end up cloning it unnecessarily if no bridge port can receive the packet in br_flood. This patch avoids this by moving the skb_clone into br_flood. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
68b7c895 |
|
27-Feb-2010 |
Herbert Xu <herbert@gondor.apana.org.au> |
bridge: Allow tail-call on br_pass_frame_up This patch allows tail-call on the call to br_pass_frame_up in br_handle_frame_finish. This is now possible because of the previous patch to call br_pass_frame_up last. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
87557c18 |
|
27-Feb-2010 |
Herbert Xu <herbert@gondor.apana.org.au> |
bridge: Do br_pass_frame_up after other ports At the moment we deliver to the local bridge port via the function br_pass_frame_up before all other ports. There is no requirement for this. For the purpose of IGMP snooping, it would be more convenient if we did the local port last. Therefore this patch rearranges the bridge input processing so that the local bridge port gets to see the packet last (if at all). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a598f6ae |
|
15-May-2009 |
Stephen Hemminger <shemminger@vyatta.com> |
bridge: relay bridge multicast pkgs if !STP Currently the bridge catches all STP packets; even if STP is turned off. This prevents other systems (which do have STP turned on) from being able to detect loops in the network. With this patch, if STP is off, then any packet sent to the STP multicast group address is forwarded to all ports. Based on earlier patch by Joakim Tjernlund with changes to go through forwarding (not local chain), and optimization that only last octet needs to be checked. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
43aa1920 |
|
17-Jun-2008 |
Stephen Hemminger <shemminger@vyatta.com> |
bridge: handle process all link-local frames Any frame addressed to link-local addresses should be processed by local receive path. The earlier code would process them only if STP was enabled. Since there are other frames like LACP for bonding, we should always process them. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0b040829 |
|
10-Jun-2008 |
Adrian Bunk <bunk@kernel.org> |
net: remove CVS keywords This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a339f1c8 |
|
21-May-2008 |
Pavel Emelyanov <xemul@openvz.org> |
bridge: Use on-device stats instead of private ones. Even though bridges require 6 fields from struct net_device_stats, the on-device stats are always there, so we may just use them. The br_dev_get_stats is no longer required after this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
91c5ec3e |
|
11-Dec-2007 |
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> |
[BRIDGE]: Use cpu_to_be16() where appropriate. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
82de382c |
|
29-Nov-2007 |
Pavel Emelyanov <xemul@openvz.org> |
[BRIDGE]: Properly dereference the br_should_route_hook This hook is protected with the RCU, so simple if (br_should_route_hook) br_should_route_hook(...) is not enough on some architectures. Use the rcu_dereference/rcu_assign_pointer in this case. Fixed Stephen's comment concerning using the typeof(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
#
3db05fea |
|
15-Oct-2007 |
Herbert Xu <herbert@gondor.apana.org.au> |
[NETFILTER]: Replace sk_buff ** with sk_buff * With all the users of the double pointers removed, this patch mops up by finally replacing all occurances of sk_buff ** in the netfilter API by sk_buff *. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7b995651 |
|
14-Oct-2007 |
Herbert Xu <herbert@gondor.apana.org.au> |
[BRIDGE]: Unshare skb upon entry Due to the special location of the bridging hook, it should never see a shared packet anyway (certainly not with any in-kernel code). So it makes sense to unshare the skb there if necessary as that will greatly simplify the code below it (in particular, netfilter). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e081e1e3 |
|
16-Sep-2007 |
Herbert Xu <herbert@gondor.apana.org.au> |
[BRIDGE]: Kill clone argument to br_flood_* The clone argument is only used by one caller and that caller can clone the packet itself. This patch moves the clone call into the caller and kills the clone argument. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
df1c0b84 |
|
30-Aug-2007 |
Stephen Hemminger <shemminger@linux-foundation.org> |
[BRIDGE]: Packets leaking out of disabled/blocked ports. This patch fixes some packet leakage in bridge. The bridging code was allowing forward table entries to be generated even if a device was being blocked. The fix is to not add forwarding database entries unless the port is active. The bug arose as part of the conversion to processing STP frames through normal receive path (in 2.6.17). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
582ee43d |
|
26-Jul-2007 |
Al Viro <viro@ftp.linux.org.uk> |
net/* misc endianness annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
c2886d62 |
|
25-Apr-2007 |
Stephen Hemminger <shemminger@linux-foundation.org> |
[BRIDGE]: if no STP then forward all BPDUs If a bridge is not running STP, then it has no way to detect a cycle in the network. But if it is not running STP and some other machine or device is running STP, then if STP BPDU's get forwarded to it can detect the cycle. This is how the old 2.4 and early 2.6 code worked. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2111f8b9 |
|
25-Apr-2007 |
Stephen Hemminger <shemminger@linux-foundation.org> |
[BRIDGE]: drop PAUSE frames Pause frames should never make it out of the network device into the stack. But if a device was misconfigured, it might happen. So drop pause frames in bridge. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
83aa0938 |
|
25-Apr-2007 |
Stephen Hemminger <shemminger@linux-foundation.org> |
[BRIDGE]: don't change packet type The change to forward STP bpdu's (for usermode STP) through normal path, changed the packet type in the process. Since link local stuff is multicast, it should stay pkt_type = PACKET_MULTICAST. The code was probably copy/pasted incorrectly from the bridge pseudo-device receive path. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
467aea0d |
|
21-Mar-2007 |
Stephen Hemminger <shemminger@linux-foundation.org> |
bridge: don't route packets while learning While in the STP learning state, don't route packets; wait until forwarding delay has expired. The purpose of the forwarding delay is to detect loops in the network, and if a brouter started up and started forwarding, it could cause a flood. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
|
#
6229e362 |
|
21-Mar-2007 |
Stephen Hemminger <shemminger@osdl.org> |
bridge: eliminate call by reference Change the bridging hook to be simple function with return value rather than modifying the skb argument. This could generate better code and is cleaner. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
|
#
fd74e6cc |
|
12-Mar-2007 |
Stephen Hemminger <shemminger@linux-foundation.org> |
[BRIDGE]: faster compare for link local addresses Use logic operations rather than memcmp() to compare destination address with link local multicast addresses. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9d6f229f |
|
09-Feb-2007 |
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> |
[NET] BRIDGE: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1c29fc49 |
|
05-May-2006 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: keep track of received multicast packets It makes sense to add this simple statistic to keep track of received multicast packets. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b7595b49 |
|
10-Apr-2006 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: receive link-local on disabled ports. This change allows link local packets (like 802.3ad and Spanning Tree Protocol) to be processed even when the bridge is not using the port. It fixes the chicken-egg problem for bridging a bonded device, and may also fix problems with spanning tree failover. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b3e83d6d |
|
21-Mar-2006 |
Andrew Morton <akpm@osdl.org> |
[BRIDGE]: Remove duplicate const from is_link_local() argument type. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fda93d92 |
|
20-Mar-2006 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: allow show/store of group multicast address Bridge's communicate with each other using Spanning Tree Protocol over a standard multicast address. There are times when testing or layering bridges over existing topologies or tunnels, when it is useful to use alternative multicast addresses for STP packets. The 802.1d standard has some unused addresses, that can be used for this. This patch is restrictive in that it only allows one of the possible addresses in the standard. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
cf0f02d0 |
|
20-Mar-2006 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: use llc for receiving STP packets Use LLC for the receive path of Spanning Tree Protocol packets. This allows link local multicast packets to be received by other protocols (if they care), and uses the existing LLC code to get STP packets back into bridge code. The bridge multicast address is also checked, so bridges using other link local multicast addresses are ignored. This allows for use of different multicast addresses to define separate STP domains. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d5513a7d |
|
20-Mar-2006 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: optimize frame pass up The netfilter hook that is used to receive frames doesn't need to be a stub. It is only called in two ways, both of which ignore the return value. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b3f1be4b |
|
09-Feb-2006 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: fix for RCU and deadlock on device removal Change Bridge receive path to correctly handle RCU removal of device from bridge. Also fixes deadlock between carrier_check and del_nbp. This replaces the previous deleted flag fix. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
dbbc0988 |
|
06-Jan-2006 |
Kris Katterjohn <kjak@users.sourceforge.net> |
[NET]: Use newer is_multicast_ether_addr() in some files This uses is_multicast_ether_addr() because it has recently been changed to do the same thing these seperate tests are doing. Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0e5eabac |
|
21-Dec-2005 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: filter packets in learning state While in the learning state, run filters but drop the result. This prevents us from acquiring bad fdb entries in learning state. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6ede2463 |
|
25-Oct-2005 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: Use ether_compare Use compare_ether_addr in bridge code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
|
#
18b8afc7 |
|
21-Jun-2005 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Kill nf_debug Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7ce54e3f |
|
29-May-2005 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: receive path optimization This improves the bridge local receive path by avoiding going through another softirq. The bridge receive path is already being called from a netif_receive_skb() there is no point in going through another receiveq round trip. Recursion is limited because bridge can never be a port of a bridge so handle_bridge() always returns. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
85967bb4 |
|
29-May-2005 |
Stephen Hemminger <shemminger@osdl.org> |
[BRIDGE]: prevent bad forwarding table updates Avoid poisoning of the bridge forwarding table by frames that have been dropped by filtering. This prevents spoofed source addresses on hostile side of bridge from causing packet leakage, a small but possible security risk. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1da177e4 |
|
16-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|