#
d8a6213d |
|
05-Apr-2024 |
Eric Dumazet <edumazet@google.com> |
geneve: fix header validation in geneve[6]_xmit_skb syzbot is able to trigger an uninit-value in geneve_xmit() [1] Problem : While most ip tunnel helpers (like ip_tunnel_get_dsfield()) uses skb_protocol(skb, true), pskb_inet_may_pull() is only using skb->protocol. If anything else than ETH_P_IPV6 or ETH_P_IP is found in skb->protocol, pskb_inet_may_pull() does nothing at all. If a vlan tag was provided by the caller (af_packet in the syzbot case), the network header might not point to the correct location, and skb linear part could be smaller than expected. Add skb_vlan_inet_prepare() to perform a complete mac validation. Use this in geneve for the moment, I suspect we need to adopt this more broadly. v4 - Jakub reported v3 broke l2_tos_ttl_inherit.sh selftest - Only call __vlan_get_protocol() for vlan types. Link: https://lore.kernel.org/netdev/20240404100035.3270a7d5@kernel.org/ v2,v3 - Addressed Sabrina comments on v1 and v2 Link: https://lore.kernel.org/netdev/Zg1l9L2BNoZWZDZG@hog/ [1] BUG: KMSAN: uninit-value in geneve_xmit_skb drivers/net/geneve.c:910 [inline] BUG: KMSAN: uninit-value in geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030 geneve_xmit_skb drivers/net/geneve.c:910 [inline] geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030 __netdev_start_xmit include/linux/netdevice.h:4903 [inline] netdev_start_xmit include/linux/netdevice.h:4917 [inline] xmit_one net/core/dev.c:3531 [inline] dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547 __dev_queue_xmit+0x348d/0x52c0 net/core/dev.c:4335 dev_queue_xmit include/linux/netdevice.h:3091 [inline] packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3081 [inline] packet_sendmsg+0x8bb0/0x9ef0 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 __sys_sendto+0x685/0x830 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was created at: slab_post_alloc_hook mm/slub.c:3804 [inline] slab_alloc_node mm/slub.c:3845 [inline] kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577 __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1318 [inline] alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504 sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795 packet_alloc_skb net/packet/af_packet.c:2930 [inline] packet_snd net/packet/af_packet.c:3024 [inline] packet_sendmsg+0x722d/0x9ef0 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 __sys_sendto+0x685/0x830 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x125/0x1d0 net/socket.c:2199 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 CPU: 0 PID: 5033 Comm: syz-executor346 Not tainted 6.9.0-rc1-syzkaller-00005-g928a87efa423 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 Fixes: d13f048dd40e ("net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb") Reported-by: syzbot+9ee20ec1de7b3168db09@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/000000000000d19c3a06152f9ee4@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Phillip Potter <phil@philpotter.co.uk> Cc: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
771d791d |
|
05-Mar-2024 |
Breno Leitao <leitao@debian.org> |
net: geneve: Remove generic .ndo_get_stats64 Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is doing some custom stats collection, it does not need to set .ndo_get_stats64. Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64 function pointer. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240305172911.502058-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
f5f07d06 |
|
05-Mar-2024 |
Breno Leitao <leitao@debian.org> |
net: geneve: Leverage core stats allocator With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver. With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now. Remove the allocation in the geneve driver and leverage the network core allocation instead. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240305172911.502058-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
93e16ea0 |
|
01-Mar-2024 |
Eric Dumazet <edumazet@google.com> |
net: gro: rename skb_gro_header_hard() skb_gro_header_hard() is renamed to skb_gro_may_pull() to match the convention used by common helpers like pskb_may_pull(). This means the condition is inverted: if (skb_gro_header_hard(skb, hlen)) slow_path(); becomes: if (!skb_gro_may_pull(skb, hlen)) slow_path(); Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
e443f3ac |
|
17-Feb-2024 |
Ricardo B. Marliere <ricardo@marliere.net> |
net: geneve: constify the struct device_type usage Since commit aed65af1cc2f ("drivers: make device_type const"), the driver core can properly handle constant struct device_type. Move the geneve_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0bef5120 |
|
12-Feb-2024 |
Eric Dumazet <edumazet@google.com> |
net: add netdev_lockdep_set_classes() to virtual drivers Based on a syzbot report, it appears many virtual drivers do not yet use netdev_lockdep_set_classes(), triggerring lockdep false positives. WARNING: possible recursive locking detected 6.8.0-rc4-next-20240212-syzkaller #0 Not tainted syz-executor.0/19016 is trying to acquire lock: ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline] ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340 but task is already holding lock: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline] ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340 other info that might help us debug this: Possible unsafe locking scenario: CPU0 lock(_xmit_ETHER#2); lock(_xmit_ETHER#2); *** DEADLOCK *** May be due to missing lock nesting notation 9 locks held by syz-executor.0/19016: #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline] #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x82c/0x1040 net/core/rtnetlink.c:6603 #1: ffffc90000a08c00 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x600 kernel/time/timer.c:1697 #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline] #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline] #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228 #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline] #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline] #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284 #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline] #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline] #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline] #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325 #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline] #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340 #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline] #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline] #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228 #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline] #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline] #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284 #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline] #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline] #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline] #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325 stack backtrace: CPU: 1 PID: 19016 Comm: syz-executor.0 Not tainted 6.8.0-rc4-next-20240212-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 check_deadlock kernel/locking/lockdep.c:3062 [inline] validate_chain+0x15c1/0x58e0 kernel/locking/lockdep.c:3856 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __netif_tx_lock include/linux/netdevice.h:4452 [inline] sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340 __dev_xmit_skb net/core/dev.c:3784 [inline] __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325 neigh_output include/net/neighbour.h:542 [inline] ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235 iptunnel_xmit+0x540/0x9b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x20ee/0x2960 net/ipv4/ip_tunnel.c:831 erspan_xmit+0x9de/0x1460 net/ipv4/ip_gre.c:720 __netdev_start_xmit include/linux/netdevice.h:4989 [inline] netdev_start_xmit include/linux/netdevice.h:5003 [inline] xmit_one net/core/dev.c:3555 [inline] dev_hard_start_xmit+0x242/0x770 net/core/dev.c:3571 sch_direct_xmit+0x2b6/0x5f0 net/sched/sch_generic.c:342 __dev_xmit_skb net/core/dev.c:3784 [inline] __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325 neigh_output include/net/neighbour.h:542 [inline] ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235 igmpv3_send_cr net/ipv4/igmp.c:723 [inline] igmp_ifc_timer_expire+0xb71/0xd90 net/ipv4/igmp.c:813 call_timer_fn+0x17e/0x600 kernel/time/timer.c:1700 expire_timers kernel/time/timer.c:1751 [inline] __run_timers+0x621/0x830 kernel/time/timer.c:2038 run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2051 __do_softirq+0x2bc/0x943 kernel/softirq.c:554 invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633 irq_exit_rcu+0x9/0x30 kernel/softirq.c:645 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1076 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1076 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 RIP: 0010:resched_offsets_ok kernel/sched/core.c:10127 [inline] RIP: 0010:__might_resched+0x16f/0x780 kernel/sched/core.c:10142 Code: 00 4c 89 e8 48 c1 e8 03 48 ba 00 00 00 00 00 fc ff df 48 89 44 24 38 0f b6 04 10 84 c0 0f 85 87 04 00 00 41 8b 45 00 c1 e0 08 <01> d8 44 39 e0 0f 85 d6 00 00 00 44 89 64 24 1c 48 8d bc 24 a0 00 RSP: 0018:ffffc9000ee069e0 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8880296a9e00 RDX: dffffc0000000000 RSI: ffff8880296a9e00 RDI: ffffffff8bfe8fa0 RBP: ffffc9000ee06b00 R08: ffffffff82326877 R09: 1ffff11002b5ad1b R10: dffffc0000000000 R11: ffffed1002b5ad1c R12: 0000000000000000 R13: ffff8880296aa23c R14: 000000000000062a R15: 1ffff92001dc0d44 down_write+0x19/0x50 kernel/locking/rwsem.c:1578 kernfs_activate fs/kernfs/dir.c:1403 [inline] kernfs_add_one+0x4af/0x8b0 fs/kernfs/dir.c:819 __kernfs_create_file+0x22e/0x2e0 fs/kernfs/file.c:1056 sysfs_add_file_mode_ns+0x24a/0x310 fs/sysfs/file.c:307 create_files fs/sysfs/group.c:64 [inline] internal_create_group+0x4f4/0xf20 fs/sysfs/group.c:152 internal_create_groups fs/sysfs/group.c:192 [inline] sysfs_create_groups+0x56/0x120 fs/sysfs/group.c:218 create_dir lib/kobject.c:78 [inline] kobject_add_internal+0x472/0x8d0 lib/kobject.c:240 kobject_add_varg lib/kobject.c:374 [inline] kobject_init_and_add+0x124/0x190 lib/kobject.c:457 netdev_queue_add_kobject net/core/net-sysfs.c:1706 [inline] netdev_queue_update_kobjects+0x1f3/0x480 net/core/net-sysfs.c:1758 register_queue_kobjects net/core/net-sysfs.c:1819 [inline] netdev_register_kobject+0x265/0x310 net/core/net-sysfs.c:2059 register_netdevice+0x1191/0x19c0 net/core/dev.c:10298 bond_newlink+0x3b/0x90 drivers/net/bonding/bond_netlink.c:576 rtnl_newlink_create net/core/rtnetlink.c:3506 [inline] __rtnl_newlink net/core/rtnetlink.c:3726 [inline] rtnl_newlink+0x158f/0x20a0 net/core/rtnetlink.c:3739 rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6606 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543 netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline] netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367 netlink_sendmsg+0xa3c/0xd70 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x221/0x270 net/socket.c:745 __sys_sendto+0x3a4/0x4f0 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0xde/0x100 net/socket.c:2199 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x6d/0x75 RIP: 0033:0x7fc3fa87fa9c Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240212140700.2795436-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
f4b57b9d |
|
06-Feb-2024 |
Eric Dumazet <edumazet@google.com> |
geneve: use exit_batch_rtnl() method exit_batch_rtnl() is called while RTNL is held, and devices to be unregistered can be queued in the dev_kill_list. This saves one rtnl_lock()/rtnl_unlock() pair, and one unregister_netdevice_many() call. Note: it should be possible to remove the synchronize_net() call from geneve_sock_release() in a future patch. v4: move WARN_ON_ONCE(!list_empty(&gn->sock_list)) into geneve_exit_net(), after devices have been unregistered. (Antoine Tenart feedback) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Antoine Tenart <atenart@kernel.org> Link: https://lore.kernel.org/r/20240206144313.2050392-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
1ca1ba46 |
|
29-Feb-2024 |
Eric Dumazet <edumazet@google.com> |
geneve: make sure to pull inner header in geneve_rx() syzbot triggered a bug in geneve_rx() [1] Issue is similar to the one I fixed in commit 8d975c15c0cd ("ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()") We have to save skb->network_header in a temporary variable in order to be able to recompute the network_header pointer after a pskb_inet_may_pull() call. pskb_inet_may_pull() makes sure the needed headers are in skb->head. [1] BUG: KMSAN: uninit-value in IP_ECN_decapsulate include/net/inet_ecn.h:302 [inline] BUG: KMSAN: uninit-value in geneve_rx drivers/net/geneve.c:279 [inline] BUG: KMSAN: uninit-value in geneve_udp_encap_recv+0x36f9/0x3c10 drivers/net/geneve.c:391 IP_ECN_decapsulate include/net/inet_ecn.h:302 [inline] geneve_rx drivers/net/geneve.c:279 [inline] geneve_udp_encap_recv+0x36f9/0x3c10 drivers/net/geneve.c:391 udp_queue_rcv_one_skb+0x1d39/0x1f20 net/ipv4/udp.c:2108 udp_queue_rcv_skb+0x6ae/0x6e0 net/ipv4/udp.c:2186 udp_unicast_rcv_skb+0x184/0x4b0 net/ipv4/udp.c:2346 __udp4_lib_rcv+0x1c6b/0x3010 net/ipv4/udp.c:2422 udp_rcv+0x7d/0xa0 net/ipv4/udp.c:2604 ip_protocol_deliver_rcu+0x264/0x1300 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x2b8/0x440 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:314 [inline] ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:461 [inline] ip_rcv_finish net/ipv4/ip_input.c:449 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] ip_rcv+0x46f/0x760 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core net/core/dev.c:5534 [inline] __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5648 process_backlog+0x480/0x8b0 net/core/dev.c:5976 __napi_poll+0xe3/0x980 net/core/dev.c:6576 napi_poll net/core/dev.c:6645 [inline] net_rx_action+0x8b8/0x1870 net/core/dev.c:6778 __do_softirq+0x1b7/0x7c5 kernel/softirq.c:553 do_softirq+0x9a/0xf0 kernel/softirq.c:454 __local_bh_enable_ip+0x9b/0xa0 kernel/softirq.c:381 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:820 [inline] __dev_queue_xmit+0x2768/0x51c0 net/core/dev.c:4378 dev_queue_xmit include/linux/netdevice.h:3171 [inline] packet_xmit+0x9c/0x6b0 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3081 [inline] packet_sendmsg+0x8aef/0x9f10 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] __sys_sendto+0x735/0xa10 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x125/0x1c0 net/socket.c:2199 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: slab_post_alloc_hook mm/slub.c:3819 [inline] slab_alloc_node mm/slub.c:3860 [inline] kmem_cache_alloc_node+0x5cb/0xbc0 mm/slub.c:3903 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x352/0x790 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1296 [inline] alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6394 sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2783 packet_alloc_skb net/packet/af_packet.c:2930 [inline] packet_snd net/packet/af_packet.c:3024 [inline] packet_sendmsg+0x70c2/0x9f10 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] __sys_sendto+0x735/0xa10 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0x125/0x1c0 net/socket.c:2199 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels") Reported-and-tested-by: syzbot+6a1423ff3f97159aae64@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c72a657b |
|
04-Jan-2024 |
Eric Dumazet <edumazet@google.com> |
geneve: use DEV_STATS_INC() geneve updates dev->stats fields locklessly. Adopt DEV_STATS_INC() to avoid races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240104163633.2070538-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
69d72587 |
|
20-Oct-2023 |
Beniamino Galvani <b.galvani@gmail.com> |
geneve: use generic function for tunnel IPv6 route lookup The route lookup can be done now via generic function udp_tunnel6_dst_lookup() to replace the custom implementation in geneve_get_v6_dst(). This is similar to what already done for IPv4 in commit daa2ba7ed1d1 ("geneve: use generic function for tunnel IPv4 route lookup"). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
daa2ba7e |
|
16-Oct-2023 |
Beniamino Galvani <b.galvani@gmail.com> |
geneve: use generic function for tunnel IPv4 route lookup The route lookup can be done now via generic function udp_tunnel_dst_lookup() to replace the custom implementation in geneve_get_v4_rt(). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
60a77d11 |
|
16-Oct-2023 |
Beniamino Galvani <b.galvani@gmail.com> |
geneve: add dsfield helper function Add a helper function to compute the tos/dsfield. In this way, we can factor out some duplicate code. Also, the helper will be called from more places in the next commit. Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
251d5a28 |
|
19-Mar-2023 |
Josef Miegl <josef@miegl.cz> |
net: geneve: accept every ethertype The Geneve encapsulation, as defined in RFC 8926, has a Protocol Type field, which states the Ethertype of the payload appearing after the Geneve header. Commit 435fe1c0c1f7 ("net: geneve: support IPv4/IPv6 as inner protocol") introduced a new IFLA_GENEVE_INNER_PROTO_INHERIT flag that allowed the use of other Ethertypes than Ethernet. However, it did not get rid of a restriction that prohibits receiving payloads other than Ethernet, instead the commit white-listed additional Ethertypes, IPv4 and IPv6. This patch removes this restriction, making it possible to receive any Ethertype as a payload, if the IFLA_GENEVE_INNER_PROTO_INHERIT flag is set. The restriction was set in place back in commit 0b5e8b8eeae4 ("net: Add Geneve tunneling protocol driver"), which implemented a protocol layer driver for Geneve to be used with Open vSwitch. The relevant discussion about introducing the Ethertype white-list can be found here: https://lore.kernel.org/netdev/CAEP_g=_1q3ACX5NTHxLDnysL+dTMUVzdLpgw1apLKEdDSWPztw@mail.gmail.com/ <quote> >> + if (unlikely(geneveh->proto_type != htons(ETH_P_TEB))) > > Why? I thought the point of geneve carrying protocol field was to > allow protocols other than Ethernet... is this temporary maybe? Yes, it is temporary. Currently OVS only handles Ethernet packets but this restriction can be lifted once we have a consumer that is capable of handling other protocols. </quote> This white-list was then ported to a generic Geneve netdevice in commit 371bd1061d29 ("geneve: Consolidate Geneve functionality in single module."). Preserving the Ethertype white-list at this point made sense, as the Geneve device could send out only Ethernet payloads anyways. However, now that the Geneve netdevice supports encapsulating other payloads with IFLA_GENEVE_INNER_PROTO_INHERIT and we have a consumer capable of other protocols, it seems appropriate to lift the restriction and allow any Geneve payload to be received. Signed-off-by: Josef Miegl <josef@miegl.cz> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20230319220954.21834-1-josef@miegl.cz Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
45ef71d1 |
|
12-Mar-2023 |
Josef Miegl <josef@miegl.cz> |
net: geneve: set IFF_POINTOPOINT with IFLA_GENEVE_INNER_PROTO_INHERIT The GENEVE tunnel used with IFLA_GENEVE_INNER_PROTO_INHERIT is point-to-point, so set IFF_POINTOPOINT to reflect that. Signed-off-by: Josef Miegl <josef@miegl.cz> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1d997f10 |
|
28-Oct-2022 |
Hangbin Liu <liuhangbin@gmail.com> |
rtnetlink: pass netlink message header and portid to rtnl_configure_link() This patch pass netlink message header and portid to rtnl_configure_link() All the functions in this call chain need to add the parameters so we can use them in the last call rtnl_notify(), and notify the userspace about the new link info if NLM_F_ECHO flag is set. - rtnl_configure_link() - __dev_notify_flags() - rtmsg_ifinfo() - rtmsg_ifinfo_event() - rtmsg_ifinfo_build_skb() - rtmsg_ifinfo_send() - rtnl_notify() Also move __dev_notify_flags() declaration to net/core/dev.h, as Jakub suggested. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
fb3ceec1 |
|
30-Aug-2022 |
Wolfram Sang <wsa+renesas@sang-engineering.com> |
net: move from strlcpy with unused retval to strscpy Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220830201457.7984-1-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
35ffb665 |
|
23-Aug-2022 |
Richard Gobert <richardbgobert@gmail.com> |
net: gro: skb_gro_header helper function Introduce a simple helper function to replace a common pattern. When accessing the GRO header, we fetch the pointer from frag0, then test its validity and fetch it from the skb when necessary. This leads to the pattern skb_gro_header_fast -> skb_gro_header_hard -> skb_gro_header_slow recurring many times throughout GRO code. This patch replaces these patterns with a single inlined function call, improving code readability. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220823071034.GA56142@debian Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
ca2bb695 |
|
05-Aug-2022 |
Matthias May <matthias.may@westermo.com> |
geneve: do not use RT_TOS for IPv6 flowlabel According to Guillaume Nault RT_TOS should never be used for IPv6. Quote: RT_TOS() is an old macro used to interprete IPv4 TOS as described in the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4 code, although, given the current state of the code, most of the existing calls have no consequence. But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS" field to be interpreted the RFC 1349 way. There's no historical compatibility to worry about. Fixes: 3a56f86f1be6 ("geneve: handle ipv6 priority like ipv4 tos") Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Matthias May <matthias.may@westermo.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
b4ab94d6 |
|
05-Aug-2022 |
Matthias May <matthias.may@westermo.com> |
geneve: fix TOS inheriting for ipv4 The current code retrieves the TOS field after the lookup on the ipv4 routing table. The routing process currently only allows routing based on the original 3 TOS bits, and not on the full 6 DSCP bits. As a result the retrieved TOS is cut to the 3 bits. However for inheriting purposes the full 6 bits should be used. Extract the full 6 bits before the route lookup and use that instead of the cut off 3 TOS bits. Fixes: e305ac6cf5a1 ("geneve: Add support to collect tunnel metadata.") Signed-off-by: Matthias May <matthias.may@westermo.com> Acked-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/20220805190006.8078-1-matthias.may@westermo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
861396ac |
|
25-Jul-2022 |
Paul Chaignon <paul@isovalent.com> |
geneve: Use ip_tunnel_key flow flags in route lookups Use the new ip_tunnel_key field with the flow flags in the IPv4 route lookups for the encapsulated packet. This will be used by the bpf_skb_set_tunnel_key helper in the subsequent commit. Signed-off-by: Paul Chaignon <paul@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/fcc2e0eea01e8ea465a180126366ec20596ba530.1658759380.git.paul@isovalent.com
|
#
f623f83a |
|
13-Apr-2022 |
Paolo Abeni <pabeni@redhat.com> |
geneve: avoid indirect calls in GRO path, when possible In the most common setups, the geneve tunnels use an inner ethernet encapsulation. In the GRO path, when such condition is true, we can call directly the relevant GRO helper and avoid a few indirect calls. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
36c2e31a |
|
21-Mar-2022 |
Eyal Birger <eyal.birger@gmail.com> |
net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT Add missing netlink attribute policy and size calculation. Also enable strict validation from this new attribute onwards. Fixes: 435fe1c0c1f7 ("net: geneve: support IPv4/IPv6 as inner protocol") Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20220322043954.3042468-1-eyal.birger@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
435fe1c0 |
|
16-Mar-2022 |
Eyal Birger <eyal.birger@gmail.com> |
net: geneve: support IPv4/IPv6 as inner protocol This patch adds support for encapsulating IPv4/IPv6 within GENEVE. In order to use this, a new IFLA_GENEVE_INNER_PROTO_INHERIT flag needs to be provided at device creation. This property cannot be changed for the time being. In case IP traffic is received on a non-tun device the drop count is increased. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20220316061557.431872-1-eyal.birger@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
baebdf48 |
|
11-Feb-2022 |
Sebastian Andrzej Siewior <bigeasy@linutronix.de> |
net: dev: Makes sure netif_rx() can be invoked in any context. Dave suggested a while ago (eleven years by now) "Let's make netif_rx() work in all contexts and get rid of netif_rx_ni()". Eric agreed and pointed out that modern devices should use netif_receive_skb() to avoid the overhead. In the meantime someone added another variant, netif_rx_any_context(), which behaves as suggested. netif_rx() must be invoked with disabled bottom halves to ensure that pending softirqs, which were raised within the function, are handled. netif_rx_ni() can be invoked only from process context (bottom halves must be enabled) because the function handles pending softirqs without checking if bottom halves were disabled or not. netif_rx_any_context() invokes on the former functions by checking in_interrupts(). netif_rx() could be taught to handle both cases (disabled and enabled bottom halves) by simply disabling bottom halves while invoking netif_rx_internal(). The local_bh_enable() invocation will then invoke pending softirqs only if the BH-disable counter drops to zero. Eric is concerned about the overhead of BH-disable+enable especially in regard to the loopback driver. As critical as this driver is, it will receive a shortcut to avoid the additional overhead which is not needed. Add a local_bh_disable() section in netif_rx() to ensure softirqs are handled if needed. Provide __netif_rx() which does not disable BH and has a lockdep assert to ensure that interrupts are disabled. Use this shortcut in the loopback driver and in drivers/net/*.c. Make netif_rx_ni() and netif_rx_any_context() invoke netif_rx() so they can be removed once they are no more users left. Link: https://lkml.kernel.org/r/20100415.020246.218622820.davem@davemloft.net Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
627b94f7 |
|
23-Nov-2021 |
Eric Dumazet <edumazet@google.com> |
gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers All gro_complete() handlers are called from napi_gro_complete() while rcu_read_lock() has been called. There is no point stacking more rcu_read_lock() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
fc1ca334 |
|
23-Nov-2021 |
Eric Dumazet <edumazet@google.com> |
gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers All gro_receive() handlers are called from dev_gro_receive() while rcu_read_lock() has been called. There is no point stacking more rcu_read_lock() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
4721031c |
|
15-Nov-2021 |
Eric Dumazet <edumazet@google.com> |
net: move gro definitions to include/net/gro.h include/linux/netdevice.h became too big, move gro stuff into include/net/gro.h Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d13f048d |
|
22-Apr-2021 |
Phillip Potter <phil@philpotter.co.uk> |
net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb Modify the header size check in geneve6_xmit_skb and geneve_xmit_skb to use pskb_inet_may_pull rather than pskb_network_may_pull. This fixes two kernel selftest failures introduced by the commit introducing the checks: IPv4 over geneve6: PMTU exceptions IPv4 over geneve6: PMTU exceptions - nexthop objects It does this by correctly accounting for the fact that IPv4 packets may transit over geneve IPv6 tunnels (and vice versa), and still fixes the uninit-value bug fixed by the original commit. Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: 6628ddfec758 ("net: geneve: check skb is large enough for IPv4/IPv6 header") Suggested-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
61630c4f |
|
29-Mar-2021 |
Paolo Abeni <pabeni@redhat.com> |
geneve: allow UDP L4 GRO passthrou Similar to the previous commit, let even geneve passthrou the L4 GRO packets Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6628ddfe |
|
10-Apr-2021 |
Phillip Potter <phil@philpotter.co.uk> |
net: geneve: check skb is large enough for IPv4/IPv6 header Check within geneve_xmit_skb/geneve6_xmit_skb that sk_buff structure is large enough to include IPv4 or IPv6 header, and reject if not. The geneve_xmit_skb portion and overall idea was contributed by Eric Dumazet. Fixes a KMSAN-found uninit-value bug reported by syzbot at: https://syzkaller.appspot.com/bug?id=abe95dc3e3e9667fc23b8d81f29ecad95c6f106f Suggested-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+2e406a9ac75bb71d4b7a@syzkaller.appspotmail.com Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
68c1a943e |
|
25-Mar-2021 |
Antoine Tenart <atenart@kernel.org> |
geneve: do not modify the shared tunnel info when PMTU triggers an ICMP reply When the interface is part of a bridge or an Open vSwitch port and a packet exceed a PMTU estimate, an ICMP reply is sent to the sender. When using the external mode (collect metadata) the source and destination addresses are reversed, so that Open vSwitch can match the packet against an existing (reverse) flow. But inverting the source and destination addresses in the shared ip_tunnel_info will make following packets of the flow to use a wrong destination address (packets will be tunnelled to itself), if the flow isn't updated. Which happens with Open vSwitch, until the flow times out. Fixes this by uncloning the skb's ip_tunnel_info before inverting its source and destination addresses, so that the modification will only be made for the PTMU packet, not the following ones. Fixes: c1a800e88dbf ("geneve: Support for PMTU discovery on directly bridged links") Tested-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
18423e1a |
|
15-Jan-2021 |
Xin Long <lucien.xin@gmail.com> |
geneve: add NETIF_F_FRAGLIST flag for dev features Some protocol HW GSO requires fraglist supported by the device, like SCTP. Without NETIF_F_FRAGLIST set in the dev features of geneve, it would have to do SW GSO before the packets enter the driver, even when the geneve dev and lower dev (like veth) both have the feature of NETIF_F_GSO_SCTP. So this patch is to add it for geneve. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
dedc33e7 |
|
06-Jan-2021 |
Jakub Kicinski <kuba@kernel.org> |
udp_tunnel: remove REGISTER/UNREGISTER handling from tunnel drivers udp_tunnel_nic handles REGISTER and UNREGISTER event, now that all drivers use that infra we can drop the event handling in the tunnel drivers. Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
c02bd115 |
|
09-Dec-2020 |
Jakub Kicinski <kuba@kernel.org> |
Revert "geneve: pull IP header before ECN decapsulation" This reverts commit 4179b00c04d1 ("geneve: pull IP header before ECN decapsulation"). Eric says: "network header should have been pulled already before hitting geneve_rx()". Let's revert the syzbot fix since it's causing more harm than good, and revisit. Suggested-by: Eric Dumazet <edumazet@google.com> Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: 4179b00c04d1 ("geneve: pull IP header before ECN decapsulation") Link: https://bugzilla.kernel.org/show_bug.cgi?id=210569 Link: https://lore.kernel.org/netdev/CANn89iJVWfb=2i7oU1=D55rOyQnBbbikf+Mc6XHMkY7YX-yGEw@mail.gmail.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
4179b00c |
|
01-Dec-2020 |
Eric Dumazet <edumazet@google.com> |
geneve: pull IP header before ECN decapsulation IP_ECN_decapsulate() and IP6_ECN_decapsulate() assume IP header is already pulled. geneve does not ensure this yet. Fixing this generically in IP_ECN_decapsulate() and IP6_ECN_decapsulate() is not possible, since callers pass a pointer that might be freed by pskb_may_pull() syzbot reported : BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:238 [inline] BUG: KMSAN: uninit-value in INET_ECN_decapsulate+0x345/0x1db0 include/net/inet_ecn.h:260 CPU: 1 PID: 8941 Comm: syz-executor.0 Not tainted 5.10.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197 __INET_ECN_decapsulate include/net/inet_ecn.h:238 [inline] INET_ECN_decapsulate+0x345/0x1db0 include/net/inet_ecn.h:260 geneve_rx+0x2103/0x2980 include/net/inet_ecn.h:306 geneve_udp_encap_recv+0x105c/0x1340 drivers/net/geneve.c:377 udp_queue_rcv_one_skb+0x193a/0x1af0 net/ipv4/udp.c:2093 udp_queue_rcv_skb+0x282/0x1050 net/ipv4/udp.c:2167 udp_unicast_rcv_skb net/ipv4/udp.c:2325 [inline] __udp4_lib_rcv+0x399d/0x5880 net/ipv4/udp.c:2394 udp_rcv+0x5c/0x70 net/ipv4/udp.c:2564 ip_protocol_deliver_rcu+0x572/0xc50 net/ipv4/ip_input.c:204 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_local_deliver+0x583/0x8d0 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:449 [inline] ip_rcv_finish net/ipv4/ip_input.c:428 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_rcv+0x5c3/0x840 net/ipv4/ip_input.c:539 __netif_receive_skb_one_core net/core/dev.c:5315 [inline] __netif_receive_skb+0x1ec/0x640 net/core/dev.c:5429 process_backlog+0x523/0xc10 net/core/dev.c:6319 napi_poll+0x420/0x1010 net/core/dev.c:6763 net_rx_action+0x35c/0xd40 net/core/dev.c:6833 __do_softirq+0x1a9/0x6fa kernel/softirq.c:298 asm_call_irq_on_stack+0xf/0x20 </IRQ> __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline] do_softirq_own_stack+0x6e/0x90 arch/x86/kernel/irq_64.c:77 do_softirq kernel/softirq.c:343 [inline] __local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:195 local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32 rcu_read_unlock_bh include/linux/rcupdate.h:730 [inline] __dev_queue_xmit+0x3a9b/0x4520 net/core/dev.c:4167 dev_queue_xmit+0x4b/0x60 net/core/dev.c:4173 packet_snd net/packet/af_packet.c:2992 [inline] packet_sendmsg+0x86f9/0x99d0 net/packet/af_packet.c:3017 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] __sys_sendto+0x9dc/0xc80 net/socket.c:1992 __do_sys_sendto net/socket.c:2004 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2000 __x64_sys_sendto+0x6e/0x90 net/socket.c:2000 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20201201090507.4137906-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
cc69837f |
|
20-Nov-2020 |
Jakub Kicinski <kuba@kernel.org> |
net: don't include ethtool.h from netdevice.h linux/netdevice.h is included in very many places, touching any of its dependecies causes large incremental builds. Drop the linux/ethtool.h include, linux/netdevice.h just needs a forward declaration of struct ethtool_ops. Fix all the places which made use of this implicit include. Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
9c2e14b4 |
|
10-Nov-2020 |
Yi-Hung Wei <yihung.wei@gmail.com> |
ip_tunnels: Set tunnel option flag when tunnel metadata is present Currently, we may set the tunnel option flag when the size of metadata is zero. For example, we set TUNNEL_GENEVE_OPT in the receive function no matter the geneve option is present or not. As this may result in issues on the tunnel flags consumers, this patch fixes the issue. Related discussion: * https://lore.kernel.org/netdev/1604448694-19351-1-git-send-email-yihung.wei@gmail.com/T/#u Fixes: 256c87c17c53 ("net: check tunnel option type in tunnel flags") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Link: https://lore.kernel.org/r/1605053800-74072-1-git-send-email-yihung.wei@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
b220a4a7 |
|
07-Nov-2020 |
Heiner Kallweit <hkallweit1@gmail.com> |
net: switch to dev_get_tstats64 Replace ip_tunnel_get_stats64() with the new identical core function dev_get_tstats64(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
1e84527b |
|
05-Oct-2020 |
Fabian Frederick <fabf@skynet.be> |
geneve: use dev_sw_netstats_rx_add() use new helper for netstats settings Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
34beb215 |
|
16-Sep-2020 |
Mark Gray <mark.d.gray@redhat.com> |
geneve: add transport ports in route lookup for geneve This patch adds transport ports information for route lookup so that IPsec can select Geneve tunnel traffic to do encryption. This is needed for OVS/OVN IPsec with encrypted Geneve tunnels. This can be tested by configuring a host-host VPN using an IKE daemon and specifying port numbers. For example, for an Openswan-type configuration, the following parameters should be configured on both hosts and IPsec set up as-per normal: $ cat /etc/ipsec.conf conn in ... left=$IP1 right=$IP2 ... leftprotoport=udp/6081 rightprotoport=udp ... conn out ... left=$IP1 right=$IP2 ... leftprotoport=udp rightprotoport=udp/6081 ... The tunnel can then be setup using "ip" on both hosts (but changing the relevant IP addresses): $ ip link add tun type geneve id 1000 remote $IP2 $ ip addr add 192.168.0.1/24 dev tun $ ip link set tun up This can then be tested by pinging from $IP1: $ ping 192.168.0.2 Without this patch the traffic is unencrypted on the wire. Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels") Signed-off-by: Qiuyu Xiao <qiuyu.xiao.qyx@gmail.com> Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c1a800e8 |
|
03-Aug-2020 |
Stefano Brivio <sbrivio@redhat.com> |
geneve: Support for PMTU discovery on directly bridged links If the interface is a bridge or Open vSwitch port, and we can't forward a packet because it exceeds the local PMTU estimate, trigger an ICMP or ICMPv6 reply to the sender, using the same interface to forward it back. If metadata collection is enabled, set destination and source addresses for the flow as if we were receiving the packet, so that Open vSwitch can match the ICMP error against the existing association. v2: Use netif_is_any_bridge_port() (David Ahern) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4cb47a86 |
|
03-Aug-2020 |
Stefano Brivio <sbrivio@redhat.com> |
tunnels: PMTU discovery support for directly bridged IP packets It's currently possible to bridge Ethernet tunnels carrying IP packets directly to external interfaces without assigning them addresses and routes on the bridged network itself: this is the case for UDP tunnels bridged with a standard bridge or by Open vSwitch. PMTU discovery is currently broken with those configurations, because the encapsulation effectively decreases the MTU of the link, and while we are able to account for this using PMTU discovery on the lower layer, we don't have a way to relay ICMP or ICMPv6 messages needed by the sender, because we don't have valid routes to it. On the other hand, as a tunnel endpoint, we can't fragment packets as a general approach: this is for instance clearly forbidden for VXLAN by RFC 7348, section 4.3: VTEPs MUST NOT fragment VXLAN packets. Intermediate routers may fragment encapsulated VXLAN packets due to the larger frame size. The destination VTEP MAY silently discard such VXLAN fragments. The same paragraph recommends that the MTU over the physical network accomodates for encapsulations, but this isn't a practical option for complex topologies, especially for typical Open vSwitch use cases. Further, it states that: Other techniques like Path MTU discovery (see [RFC1191] and [RFC1981]) MAY be used to address this requirement as well. Now, PMTU discovery already works for routed interfaces, we get route exceptions created by the encapsulation device as they receive ICMP Fragmentation Needed and ICMPv6 Packet Too Big messages, and we already rebuild those messages with the appropriate MTU and route them back to the sender. Add the missing bits for bridged cases: - checks in skb_tunnel_check_pmtu() to understand if it's appropriate to trigger a reply according to RFC 1122 section 3.2.2 for ICMP and RFC 4443 section 2.4 for ICMPv6. This function is already called by UDP tunnels - a new function generating those ICMP or ICMPv6 replies. We can't reuse icmp_send() and icmp6_send() as we don't see the sender as a valid destination. This doesn't need to be generic, as we don't cover any other type of ICMP errors given that we only provide an encapsulation function to the sender While at it, make the MTU check in skb_tunnel_check_pmtu() accurate: we might receive GSO buffers here, and the passed headroom already includes the inner MAC length, so we don't have to account for it a second time (that would imply three MAC headers on the wire, but there are just two). This issue became visible while bridging IPv6 packets with 4500 bytes of payload over GENEVE using IPv4 with a PMTU of 4000. Given the 50 bytes of encapsulation headroom, we would advertise MTU as 3950, and we would reject fragmented IPv6 datagrams of 3958 bytes size on the wire. We're exclusively dealing with network MTU here, though, so we could get Ethernet frames up to 3964 octets in that case. v2: - moved skb_tunnel_check_pmtu() to ip_tunnel_core.c (David Ahern) - split IPv4/IPv6 functions (David Ahern) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
32818c07 |
|
22-Jul-2020 |
Cong Wang <xiyou.wangcong@gmail.com> |
geneve: fix an uninitialized value in geneve_changelink() geneve_nl2info() sets 'df' conditionally, so we have to initialize it by copying the value from existing geneve device in geneve_changelink(). Fixes: 56c09de347e4 ("geneve: allow changing DF behavior after creation") Reported-by: syzbot+7ebc2e088af5e4c0c9fa@syzkaller.appspotmail.com Cc: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
cc4e3835 |
|
09-Jul-2020 |
Jakub Kicinski <kuba@kernel.org> |
udp_tunnel: add central NIC RX port offload infrastructure Cater to devices which: (a) may want to sleep in the callbacks; (b) only have IPv4 support; (c) need all the programming to happen while the netdev is up. Drivers attach UDP tunnel offload info struct to their netdevs, where they declare how many UDP ports of various tunnel types they support. Core takes care of tracking which ports to offload. Use a fixed-size array since this matches what almost all drivers do, and avoids a complexity and uncertainty around memory allocations in an atomic context. Make sure that tunnel drivers don't try to replay the ports when new NIC netdev is registered. Automatic replays would mess up reference counting, and will be removed completely once all drivers are converted. v4: - use a #define NULL to avoid build issues with CONFIG_INET=n. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9e06e859 |
|
06-Jul-2020 |
Sabrina Dubroca <sd@queasysnail.net> |
geneve: move all configuration under struct geneve_config This patch adds a new structure geneve_config and moves the per-device configuration attributes to it, like we already have in VXLAN with struct vxlan_config. This ends up being pretty invasive since those attributes are used everywhere. This allows us to clean up the argument lists for geneve_configure (4 arguments instead of 8) and geneve_nl2info (5 instead of 9). This also reduces the copy-paste of code setting those attributes between geneve_configure and geneve_changelink to a single memcpy, which would have avoided the bug fixed in commit 56c09de347e4 ("geneve: allow changing DF behavior after creation"). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
56c09de3 |
|
17-Jun-2020 |
Sabrina Dubroca <sd@queasysnail.net> |
geneve: allow changing DF behavior after creation Currently, trying to change the DF parameter of a geneve device does nothing: # ip -d link show geneve1 14: geneve1: <snip> link/ether <snip> geneve id 1 remote 10.0.0.1 ttl auto df set dstport 6081 <snip> # ip link set geneve1 type geneve id 1 df unset # ip -d link show geneve1 14: geneve1: <snip> link/ether <snip> geneve id 1 remote 10.0.0.1 ttl auto df set dstport 6081 <snip> We just need to update the value in geneve_changelink. Fixes: a025fb5f49ad ("geneve: Allow configuration of DF behaviour") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9d149045 |
|
03-Jun-2020 |
Jiri Benc <jbenc@redhat.com> |
geneve: change from tx_error to tx_dropped on missing metadata If the geneve interface is in collect_md (external) mode, it can't send any packets submitted directly to its net interface, as such packets won't have metadata attached. This is expected. However, the kernel itself sends some packets to the interface, most notably, IPv6 DAD, IPv6 multicast listener reports, etc. This is not wrong, as tunnel metadata can be specified in routing table (although technically, that has never worked for IPv6, but hopefully will be fixed eventually) and then the interface must correctly participate in IPv6 housekeeping. The problem is that any such attempt increases the tx_error counter. Just bringing up a geneve interface with IPv6 enabled is enough to see a number of tx_errors. That causes confusion among users, prompting them to find a network error where there is none. Change the counter used to tx_dropped. That better conveys the meaning (there's nothing wrong going on, just some packets are getting dropped) and hopefully will make admins panic less. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9a7b5b50 |
|
22-Apr-2020 |
Sabrina Dubroca <sd@queasysnail.net> |
geneve: use the correct nlattr array in NL_SET_ERR_MSG_ATTR IFLA_GENEVE_* attributes are in the data array, which is correctly used when fetching the value, but not when setting the extended ack. Because IFLA_GENEVE_MAX < IFLA_MAX, we avoid out of bounds array accesses, but we don't provide a pointer to the invalid attribute to userspace. Fixes: a025fb5f49ad ("geneve: Allow configuration of DF behaviour") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0fda7600 |
|
14-Mar-2020 |
Florian Westphal <fw@strlen.de> |
geneve: move debug check after netdev unregister The debug check must be done after unregister_netdevice_many() call -- the list_del() for this is done inside .ndo_stop. Fixes: 2843a25348f8 ("geneve: speedup geneve tunnels dismantle") Reported-and-tested-by: <syzbot+68a8ed58e3d17c700de5@syzkaller.appspotmail.com> Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c593642c |
|
09-Dec-2019 |
Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> |
treewide: Use sizeof_field() macro Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF(). This patch is generated using following script: EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h" git grep -l -e "\bFIELD_SIZEOF\b" | while read file; do if [[ "$file" =~ $EXCLUDE_FILES ]]; then continue fi sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file; done Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David Miller <davem@davemloft.net> # for net
|
#
6c8991f4 |
|
04-Dec-2019 |
Sabrina Dubroca <sd@queasysnail.net> |
net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
842841ec |
|
02-Sep-2019 |
Dave Taht <dave.taht@gmail.com> |
Convert usage of IN_MULTICAST to ipv4_is_multicast IN_MULTICAST's primary intent is as a uapi macro. Elsewhere in the kernel we use ipv4_is_multicast consistently. This patch unifies linux's multicast checks to use that function rather than this macro. Signed-off-by: Dave Taht <dave.taht@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d2912cb1 |
|
04-Jun-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
eccc73a6 |
|
10-Jun-2019 |
Stefano Brivio <sbrivio@redhat.com> |
geneve: Don't assume linear buffers in error handler In commit a07966447f39 ("geneve: ICMP error lookup handler") I wrongly assumed buffers from icmp_socket_deliver() would be linear. This is not the case: icmp_socket_deliver() only guarantees we have 8 bytes of linear data. Eric fixed this same issue for fou and fou6 in commits 26fc181e6cac ("fou, fou6: do not assume linear skbs") and 5355ed6388e2 ("fou, fou6: avoid uninit-value in gue_err() and gue6_err()"). Use pskb_may_pull() instead of checking skb->len, and take into account the fact we later access the GENEVE header with udp_hdr(), so we also need to sum skb_transport_header() here. Reported-by: Guillaume Nault <gnault@redhat.com> Fixes: a07966447f39 ("geneve: ICMP error lookup handler") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3616d08b |
|
22-Mar-2019 |
David Ahern <dsahern@gmail.com> |
ipv6: Move ipv6 stubs to a separate header file The number of stubs is growing and has nothing to do with addrconf. Move the definition of the stubs to a separate header file and update users. In the move, drop the vxlan specific comment before ipv6_stub. Code move only; no functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
974eff2b |
|
21-Mar-2019 |
Moshe Shemesh <moshe@mellanox.com> |
net: Move the definition of the default Geneve udp port to public header file Move the definition of the default Geneve udp port from the geneve source to the header file, so we can re-use it from drivers. Modify existing drivers to use it. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: John Hurley <john.hurley@netronome.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
cf1c9ccb |
|
28-Feb-2019 |
Jiri Benc <jbenc@redhat.com> |
geneve: correctly handle ipv6.disable module parameter When IPv6 is compiled but disabled at runtime, geneve_sock_add returns -EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole operation of bringing up the tunnel. Ignore failure of IPv6 socket creation for metadata based tunnels caused by IPv6 not being available. This is the same fix as what commit d074bf960044 ("vxlan: correctly handle ipv6.disable module parameter") is doing for vxlan. Note there's also commit c0a47e44c098 ("geneve: should not call rt6_lookup() when ipv6 was disabled") which fixes a similar issue but for regular tunnels, while this patch is needed for metadata based tunnels. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c0a47e44 |
|
07-Feb-2019 |
Hangbin Liu <liuhangbin@gmail.com> |
geneve: should not call rt6_lookup() when ipv6 was disabled When we add a new GENEVE device with IPv6 remote, checking only for IS_ENABLED(CONFIG_IPV6) is not enough as we may disable IPv6 in the kernel command line (ipv6.disable=1), and calling rt6_lookup() would cause a NULL pointer dereference. v2: - don't mix declarations and code (reported by Stefano Brivio, Eric Dumazet) - there's no need to use in6_dev_get() as we only need to check that idev exists (reported by David Ahern). This is under RTNL, so we can simply use __in6_dev_get() instead (Stefano, Eric). Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: c40e89fd358e9 ("geneve: configure MTU based on a lower device") Cc: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8a962c4a |
|
16-Nov-2018 |
Nathan Chancellor <nathan@kernel.org> |
geneve: Initialize addr6 with memset Clang warns: drivers/net/geneve.c:428:29: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] struct in6_addr addr6 = { 0 }; ^ {} Rather than trying to appease the various compilers that support the kernel, use memset, which is unambiguous. Fixes: a07966447f39 ("geneve: ICMP error lookup handler") Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a025fb5f |
|
07-Nov-2018 |
Stefano Brivio <sbrivio@redhat.com> |
geneve: Allow configuration of DF behaviour draft-ietf-nvo3-geneve-08 says: It is strongly RECOMMENDED that Path MTU Discovery ([RFC1191], [RFC1981]) be used by setting the DF bit in the IP header when Geneve packets are transmitted over IPv4 (this is the default with IPv6). Now that ICMP error handling is working for GENEVE, we can comply with this recommendation. Make this configurable, though, to avoid breaking existing setups. By default, DF won't be set. It can be set or inherited from inner IPv4 packets. If it's configured to be inherited and we are encapsulating IPv6, it will be set. This only applies to non-lwt tunnels: if an external control plane is used, tunnel key will still control the DF flag. v2: - DF behaviour configuration only applies for non-lwt tunnels, apply DF setting only if (!geneve->collect_md) in geneve_xmit_skb() (Stephen Hemminger) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a0796644 |
|
07-Nov-2018 |
Stefano Brivio <sbrivio@redhat.com> |
geneve: ICMP error lookup handler Export an encap_err_lookup() operation to match an ICMP error against a valid VNI. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d0522f1c |
|
06-Nov-2018 |
David Ahern <dsahern@gmail.com> |
net: Add extack argument to rtnl_create_link Add extack arg to rtnl_create_link and add messages for invalid number of Tx or Rx queues. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6b4f92af |
|
12-Oct-2018 |
Stefano Brivio <sbrivio@redhat.com> |
geneve, vxlan: Don't set exceptions if skb->len < mtu We shouldn't abuse exceptions: if the destination MTU is already higher than what we're transmitting, no exception should be created. Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path") Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7463e4f9 |
|
12-Oct-2018 |
Stefano Brivio <sbrivio@redhat.com> |
geneve, vxlan: Don't check skb_dst() twice Commit f15ca723c1eb ("net: don't call update_pmtu unconditionally") avoids that we try updating PMTU for a non-existent destination, but didn't clean up cases where the check was already explicit. Drop those redundant checks. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a97d97ba |
|
29-Sep-2018 |
Hangbin Liu <liuhangbin@gmail.com> |
geneve: allow to clear ttl inherit As Michal remaind, we should allow to clear ttl inherit. Then we will have three states: 1. set the flag, and do ttl inherit. 2. do not set the flag, use configured ttl value, or default ttl (0) if not set. 3. disable ttl inherit, use previous configured ttl value, or default ttl (0). Fixes: 52d0d404d39dd ("geneve: add ttl inherit support") CC: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
52d0d404 |
|
11-Sep-2018 |
Hangbin Liu <liuhangbin@gmail.com> |
geneve: add ttl inherit support Similar with commit 72f6d71e491e6 ("vxlan: add ttl inherit support"), currently ttl == 0 means "use whatever default value" on geneve instead of inherit inner ttl. To respect compatibility with old behavior, let's add a new IFLA_GENEVE_TTL_INHERIT for geneve ttl inherit support. Reported-by: Jianlin Shi <jishi@redhat.com> Suggested-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
603d4cf8 |
|
30-Jun-2018 |
Sabrina Dubroca <sd@queasysnail.net> |
net: fix use-after-free in GRO with ESP Since the addition of GRO for ESP, gro_receive can consume the skb and return -EINPROGRESS. In that case, the lower layer GRO handler cannot touch the skb anymore. Commit 5f114163f2f5 ("net: Add a skb_gro_flush_final helper.") converted some of the gro_receive handlers that can lead to ESP's gro_receive so that they wouldn't access the skb when -EINPROGRESS is returned, but missed other spots, mainly in tunneling protocols. This patch finishes the conversion to using skb_gro_flush_final(), and adds a new helper, skb_gro_flush_final_remcsum(), used in VXLAN and GUE. Fixes: 5f114163f2f5 ("net: Add a skb_gro_flush_final helper.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
256c87c1 |
|
26-Jun-2018 |
Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> |
net: check tunnel option type in tunnel flags Check the tunnel option type stored in tunnel flags when creating options for tunnels. Thereby ensuring we do not set geneve, vxlan or erspan tunnel options on interfaces that are not associated with them. Make sure all users of the infrastructure set correct flags, for the BPF helper we have to set all bits to keep backward compatibility. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d4546c25 |
|
23-Jun-2018 |
David Miller <davem@davemloft.net> |
net: Convert GRO SKB handling to list_head. Manage pending per-NAPI GRO packets via list_head. Return an SKB pointer from the GRO receive handlers. When GRO receive handlers return non-NULL, it means that this SKB needs to be completed at this time and removed from the NAPI queue. Several operations are greatly simplified by this transformation, especially timing out the oldest SKB in the list when gro_count exceeds MAX_GRO_SKBS, and napi_gro_flush() which walks the queue in reverse order. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c40e89fd |
|
19-Apr-2018 |
Alexey Kodanev <alexey.kodanev@oracle.com> |
geneve: configure MTU based on a lower device Currently, on a new link creation or when 'remote' address parameter is updated, an MTU is not changed and always equals 1500. When a lower device has a larger MTU, it might not be efficient, e.g. for UDP, and requires the manual MTU adjustments to match the MTU of the lower device. This patch tries to automate this process, finds a lower device using the 'remote' address parameter, then uses its MTU to tune GENEVE's MTU: * on a new link creation * when 'remote' parameter is changed Also with this patch, the MTU from a user, on a new link creation, is passed to geneve_change_mtu() where it is verified, and MTU adjustments with a lower device is skipped in that case. Prior that change, it was possible to set the invalid MTU values on a new link creation. Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
321acc1c |
|
19-Apr-2018 |
Alexey Kodanev <alexey.kodanev@oracle.com> |
geneve: check MTU for a minimum in geneve_change_mtu() geneve_change_mtu() will be used not only as ndo_change_mtu() callback, but also to verify a user specified MTU on a new link creation in the next patch. Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5edbea69 |
|
19-Apr-2018 |
Alexey Kodanev <alexey.kodanev@oracle.com> |
geneve: cleanup hard coded value for Ethernet header length Use ETH_HLEN instead and introduce two new macros: GENEVE_IPV4_HLEN and GENEVE_IPV6_HLEN that include Ethernet header length, corresponded IP header length and GENEVE_BASE_HLEN. Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4c52a889 |
|
19-Apr-2018 |
Alexey Kodanev <alexey.kodanev@oracle.com> |
geneve: remove white-space before '#if IS_ENABLED(CONFIG_IPV6)' Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2f635cee |
|
27-Mar-2018 |
Kirill Tkhai <ktkhai@virtuozzo.com> |
net: Drop pernet_operations::async Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f60f3346 |
|
26-Feb-2018 |
Kirill Tkhai <ktkhai@virtuozzo.com> |
net: Convert geneve_net_ops These pernet_operations are similar to bond_net_ops. Exit method unregisters all net geneve devices, and it looks like another pernet_operations are not interested in foreign net geneve list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f15ca723 |
|
25-Jan-2018 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
net: don't call update_pmtu unconditionally Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to: "BUG: unable to handle kernel NULL pointer dereference at (null)" Let's add a helper to check if update_pmtu is available before calling it. Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path") Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") CC: Roman Kapl <code@rkapl.cz> CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
52a589d5 |
|
24-Dec-2017 |
Xin Long <lucien.xin@gmail.com> |
geneve: update skb dst pmtu on tx path Commit a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") has fixed a performance issue caused by the change of lower dev's mtu for vxlan. The same thing needs to be done for geneve as well. Note that geneve cannot adjust it's mtu according to lower dev's mtu when creating it. The performance is very low later when netperfing over it without fixing the mtu manually. This patch could also avoid this issue. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2843a253 |
|
16-Dec-2017 |
Haishuang Yan <yanhaishuang@cmss.chinamobile.com> |
geneve: speedup geneve tunnels dismantle Since we now hold RTNL lock in geneve_exit_net, it's better batch them to speedup geneve tunnel dismantle. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f9094b76 |
|
22-Nov-2017 |
Hangbin Liu <liuhangbin@gmail.com> |
geneve: only configure or fill UDP_ZERO_CSUM6_RX/TX info when CONFIG_IPV6 Stefano pointed that configure or show UDP_ZERO_CSUM6_RX/TX info doesn't make sense if we haven't enabled CONFIG_IPV6. Fix it by adding if IS_ENABLED(CONFIG_IPV6) check. Fixes: abe492b4f50c ("geneve: UDP checksum configuration via netlink") Fixes: fd7eafd02121 ("geneve: fix fill_info when link down") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fd7eafd0 |
|
14-Nov-2017 |
Hangbin Liu <liuhangbin@gmail.com> |
geneve: fix fill_info when link down geneve->sock4/6 were added with geneve_open and released with geneve_stop. So when geneve link down, we will not able to show remote address and checksum info after commit 11387fe4a98 ("geneve: fix fill_info when using collect_metadata"). Fix this by avoid passing *_REMOTE{,6} for COLLECT_METADATA since they are mutually exclusive, and always show UDP_ZERO_CSUM6_RX info. Fixes: 11387fe4a98 ("geneve: fix fill_info when using collect_metadata") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ab384b63 |
|
12-Nov-2017 |
Vasily Averin <vvs@virtuozzo.com> |
geneve: exit_net cleanup check added Be sure that sock_list initialized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3fa5f11d |
|
20-Oct-2017 |
Stefano Brivio <sbrivio@redhat.com> |
geneve: Get rid of is_all_zero(), streamline is_tnl_info_zero() No need to re-invent memchr_inv() with !is_all_zero(). While at it, replace conditional and return clauses with a single return clause in is_tnl_info_zero(). Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
772e97b5 |
|
19-Oct-2017 |
Stefano Brivio <sbrivio@redhat.com> |
geneve: Fix function matching VNI and tunnel ID on big-endian On big-endian machines, functions converting between tunnel ID and VNI use the three LSBs of tunnel ID storage to map VNI. The comparison function eq_tun_id_and_vni(), on the other hand, attempted to map the VNI from the three MSBs. Fix it by using the same check implemented on LE, which maps VNI from the three LSBs of tunnel ID. Fixes: 2e0b26e10352 ("geneve: Optimize geneve device lookup.") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c5ebc440 |
|
09-Aug-2017 |
Girish Moodalbail <girish.moodalbail@oracle.com> |
geneve: use netlink_ext_ack for error reporting in rtnl operations Add extack error messages for failure paths while creating/modifying geneve devices. Once extack support is added to iproute2, more meaningful and helpful error messages will be displayed making it easy for users to discern what went wrong. Before: ======= $ ip link add gen1 address 0:1:2:3:4:5:6 type geneve id 200 \ remote 192.168.13.2 RTNETLINK answers: Invalid argument After: ====== $ ip link add gen1 address 0:1:2:3:4:5:6 type geneve id 200 \ remote 192.168.13.2 Error: Provided link layer address is not Ethernet Also, netdev_dbg() calls used to log errors associated with Netlink request have been removed. Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
04db70d9 |
|
08-Aug-2017 |
Girish Moodalbail <girish.moodalbail@oracle.com> |
geneve: maximum value of VNI cannot be used Geneve's Virtual Network Identifier (VNI) is 24 bit long, so the range of values for it would be from 0 to 16777215 (2^24 -1). However, one cannot create a geneve device with VNI set to 16777215. This patch fixes this issue. Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
04584957 |
|
20-Jul-2017 |
Sabrina Dubroca <sd@queasysnail.net> |
geneve/vxlan: offload ports on register/unregister events This improves consistency of handling when moving a netdev to another netns. Most drivers currently do a full reset when the device goes up, so that will flush the offload state anyway. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2d2b13fc |
|
20-Jul-2017 |
Sabrina Dubroca <sd@queasysnail.net> |
geneve/vxlan: add support for NETDEV_UDP_TUNNEL_DROP_INFO Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5b861f6b |
|
20-Jul-2017 |
Girish Moodalbail <girish.moodalbail@oracle.com> |
geneve: add rtnl changelink support This patch adds changelink rtnl operation support for geneve devices and the code changes involve: - added geneve_quiesce() which quiesces the geneve device data path for both TX and RX. This lets us perform the changelink operation atomically w.r.t data path. Also added geneve_unquiesce() to reverse the operation of geneve_quiesce(). - refactor geneve_newlink into geneve_nl2info to be used by both geneve_newlink and geneve_changelink - geneve_nl2info takes a changelink boolean argument to isolate changelink checks. - Allow changing only a few attributes (ttl, tos, and remote tunnel endpoint IP address (within the same address family)): - return -EOPNOTSUPP for attributes that cannot be changed for now. Incremental patches can make the non-supported one available in the future if needed. Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4b4c21fa |
|
02-Jul-2017 |
Jiri Benc <jbenc@redhat.com> |
geneve: fix hlist corruption It's not a good idea to add the same hlist_node to two different hash lists. This leads to various hard to debug memory corruptions. Fixes: 8ed66f0e8235 ("geneve: implement support for IPv6-based tunnels") Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a8b8a889 |
|
25-Jun-2017 |
Matthias Schiffer <mschiffer@universe-factory.net> |
net: add netlink_ext_ack argument to rtnl_link_ops.validate Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7a3f4a18 |
|
25-Jun-2017 |
Matthias Schiffer <mschiffer@universe-factory.net> |
net: add netlink_ext_ack argument to rtnl_link_ops.newlink Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d58ff351 |
|
16-Jun-2017 |
Johannes Berg <johannes.berg@intel.com> |
networking: make skb_push & __skb_push return void pointers It seems like a historic accident that these return unsigned char *, and in many places that means casts are required, more often than not. Make these functions return void * and remove all the casts across the tree, adding a (u8 *) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - *(fn(SKB, LEN)) + *(u8 *)fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) @@ expression SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - fn(SKB, LEN)[0] + *(u8 *)fn(SKB, LEN) Note that the last part there converts from push(...)[0] to the more idiomatic *(u8 *)push(...). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fe741e23 |
|
08-Jun-2017 |
Girish Moodalbail <girish.moodalbail@oracle.com> |
geneve: add missing rx stats accounting There are few places on the receive path where packet drops and packet errors were not accounted for. This patch fixes that issue. Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
cf124db5 |
|
07-May-2017 |
David S. Miller <davem@davemloft.net> |
net: Fix inconsistent teardown and release of private netdev state. Network devices can allocate reasources and private memory using netdev_ops->ndo_init(). However, the release of these resources can occur in one of two different places. Either netdev_ops->ndo_uninit() or netdev->destructor(). The decision of which operation frees the resources depends upon whether it is necessary for all netdev refs to be released before it is safe to perform the freeing. netdev_ops->ndo_uninit() presumably can occur right after the NETDEV_UNREGISTER notifier completes and the unicast and multicast address lists are flushed. netdev->destructor(), on the other hand, does not run until the netdev references all go away. Further complicating the situation is that netdev->destructor() almost universally does also a free_netdev(). This creates a problem for the logic in register_netdevice(). Because all callers of register_netdevice() manage the freeing of the netdev, and invoke free_netdev(dev) if register_netdevice() fails. If netdev_ops->ndo_init() succeeds, but something else fails inside of register_netdevice(), it does call ndo_ops->ndo_uninit(). But it is not able to invoke netdev->destructor(). This is because netdev->destructor() will do a free_netdev() and then the caller of register_netdevice() will do the same. However, this means that the resources that would normally be released by netdev->destructor() will not be. Over the years drivers have added local hacks to deal with this, by invoking their destructor parts by hand when register_netdevice() fails. Many drivers do not try to deal with this, and instead we have leaks. Let's close this hole by formalizing the distinction between what private things need to be freed up by netdev->destructor() and whether the driver needs unregister_netdevice() to perform the free_netdev(). netdev->priv_destructor() performs all actions to free up the private resources that used to be freed by netdev->destructor(), except for free_netdev(). netdev->needs_free_netdev is a boolean that indicates whether free_netdev() should be done at the end of unregister_netdevice(). Now, register_netdevice() can sanely release all resources after ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit() and netdev->priv_destructor(). And at the end of unregister_netdevice(), we invoke netdev->priv_destructor() and optionally call free_netdev(). Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9a1c44d9 |
|
02-Jun-2017 |
Eric Garver <e@erig.me> |
geneve: fix needed_headroom and max_mtu for collect_metadata Since commit 9b4437a5b870 ("geneve: Unify LWT and netdev handling.") when using COLLECT_METADATA geneve devices are created with too small of a needed_headroom and too large of a max_mtu. This is because ip_tunnel_info_af() is not valid with the device level info when using COLLECT_METADATA and we mistakenly fall into the IPv4 case. For COLLECT_METADATA, always use the worst case of ipv6 since both sockets are created. Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.") Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
11387fe4 |
|
23-May-2017 |
Eric Garver <e@erig.me> |
geneve: fix fill_info when using collect_metadata Since 9b4437a5b870 ("geneve: Unify LWT and netdev handling.") fill_info does not return UDP_ZERO_CSUM6_RX when using COLLECT_METADATA. This is because it uses ip_tunnel_info_af() with the device level info, which is not valid for COLLECT_METADATA. Fix by checking for the presence of the actual sockets. Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.") Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5e0740c4 |
|
27-Apr-2017 |
Girish Moodalbail <girish.moodalbail@oracle.com> |
geneve: fix incorrect setting of UDP checksum flag Creating a geneve link with 'udpcsum' set results in a creation of link for which UDP checksum will NOT be computed on outbound packets, as can be seen below. 11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0 geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum Similarly, creating a link with 'noudpcsum' set results in a creation of link for which UDP checksum will be computed on outbound packets. Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.") Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a717e3f7 |
|
24-Feb-2017 |
Jakub Kicinski <kuba@kernel.org> |
geneve: lock RCU on TX path There is no guarantees that callers of the TX path will hold the RCU lock. Grab it explicitly. Fixes: fceb9c3e3825 ("geneve: avoid using stale geneve socket.") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5b010147 |
|
02-Dec-2016 |
Sabrina Dubroca <sd@queasysnail.net> |
geneve: avoid use-after-free of skb->data geneve{,6}_build_skb can end up doing a pskb_expand_head(), which makes the ip_hdr(skb) reference we stashed earlier stale. Since it's only needed as an argument to ip_tunnel_ecn_encap(), move this directly in the function call. Fixes: 08399efc6319 ("geneve: ensure ECN info is handled properly in all tx/rx paths") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
31ac1c19 |
|
27-Nov-2016 |
Haishuang Yan <yanhaishuang@cmss.chinamobile.com> |
geneve: fix ip_hdr_len reserved for geneve6 tunnel. It shold reserved sizeof(ipv6hdr) for geneve in ipv6 tunnel. Fixes: c3ef5aa5e5 ('geneve: Merge ipv4 and ipv6 geneve_build_skb()') Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2e0b26e1 |
|
21-Nov-2016 |
pravin shelar <pshelar@ovn.org> |
geneve: Optimize geneve device lookup. Rather than comparing 64-bit tunnel-id, compare tunnel vni which is 24-bit id. This also save conversion from vni to tunnel id on each tunnel packet receive. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bcceeec3 |
|
21-Nov-2016 |
pravin shelar <pshelar@ovn.org> |
geneve: Remove redundant socket checks. Geneve already has check for device socket in route lookup function. So no need to check it in xmit function. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c3ef5aa5 |
|
21-Nov-2016 |
pravin shelar <pshelar@ovn.org> |
geneve: Merge ipv4 and ipv6 geneve_build_skb() There are minimal difference in building Geneve header between ipv4 and ipv6 geneve tunnels. Following patch refactors code to unify it. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9b4437a5 |
|
21-Nov-2016 |
pravin shelar <pshelar@ovn.org> |
geneve: Unify LWT and netdev handling. Current geneve implementation has two separate cases to handle. 1. netdev xmit 2. LWT xmit. In case of netdev, geneve configuration is stored in various struct geneve_dev members. For example geneve_addr, ttl, tos, label, flags, dst_cache, etc. For LWT ip_tunnel_info is passed to the device in ip_tunnel_info. Following patch uses ip_tunnel_info struct to store almost all of configuration of a geneve netdevice. This allows us to unify most of geneve driver code around ip_tunnel_info struct. This dramatically simplify geneve code, since it does not need to handle two different configuration cases. Removes duplicate code, single code path can handle either type of geneve devices. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c7d03a00 |
|
16-Nov-2016 |
Alexey Dobriyan <adobriyan@gmail.com> |
netns: make struct pernet_operations::id unsigned int Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long *p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void *net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fceb9c3e |
|
28-Oct-2016 |
pravin shelar <pshelar@ovn.org> |
geneve: avoid using stale geneve socket. This patch is similar to earlier vxlan patch. Geneve device close operation frees geneve socket. This operation can race with geneve-xmit function which dereferences geneve socket. Following patch uses RCU mechanism to avoid this situation. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
91572088 |
|
20-Oct-2016 |
Jarod Wilson <jarod@redhat.com> |
net: use core MTU range checking in core net infra geneve: - Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu - This one isn't quite as straight-forward as others, could use some closer inspection and testing macvlan: - set min/max_mtu tun: - set min/max_mtu, remove tun_net_change_mtu vxlan: - Merge __vxlan_change_mtu back into vxlan_change_mtu - Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function - This one is also not as straight-forward and could use closer inspection and testing from vxlan folks bridge: - set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function openvswitch: - set min/max_mtu, remove internal_dev_change_mtu - note: max_mtu wasn't checked previously, it's been set to 65535, which is the largest possible size supported sch_teql: - set min/max_mtu (note: max_mtu previously unchecked, used max of 65535) macsec: - min_mtu = 0, max_mtu = 65535 macvlan: - min_mtu = 0, max_mtu = 65535 ntb_netdev: - min_mtu = 0, max_mtu = 65535 veth: - min_mtu = 68, max_mtu = 65535 8021q: - min_mtu = 0, max_mtu = 65535 CC: netdev@vger.kernel.org CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Tom Herbert <tom@herbertland.com> CC: Daniel Borkmann <daniel@iogearbox.net> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Paolo Abeni <pabeni@redhat.com> CC: Jiri Benc <jbenc@redhat.com> CC: WANG Cong <xiyou.wangcong@gmail.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Pravin B Shelar <pshelar@ovn.org> CC: Sabrina Dubroca <sd@queasysnail.net> CC: Patrick McHardy <kaber@trash.net> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Pravin Shelar <pshelar@nicira.com> CC: Maxim Krasnyansky <maxk@qti.qualcomm.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fcd91dd4 |
|
20-Oct-2016 |
Sabrina Dubroca <sd@queasysnail.net> |
net: add recursion limit to GRO Currently, GRO can do unlimited recursion through the gro_receive handlers. This was fixed for tunneling protocols by limiting tunnel GRO to one level with encap_mark, but both VLAN and TEB still have this problem. Thus, the kernel is vulnerable to a stack overflow, if we receive a packet composed entirely of VLAN headers. This patch adds a recursion counter to the GRO layer to prevent stack overflow. When a gro_receive function hits the recursion limit, GRO is aborted for this skb and it is processed normally. This recursion counter is put in the GRO CB, but could be turned into a percpu counter if we run out of space in the CB. Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report. Fixes: CVE-2016-7039 Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.") Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Jiri Benc <jbenc@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e5de25dc |
|
11-Jul-2016 |
Sabrina Dubroca <sd@queasysnail.net> |
drivers/net: fixup comments after "Future-proof tunnel offload handlers" Some comments weren't updated to reflect the renaming of ndo's and the change of arguments. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d5d5e8d5 |
|
02-Jul-2016 |
Haishuang Yan <yanhaishuang@cmss.chinamobile.com> |
geneve: fix max_mtu setting For ipv6+udp+geneve encapsulation data, the max_mtu should subtract sizeof(ipv6hdr), instead of sizeof(iphdr). Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
efeb2267 |
|
21-Jun-2016 |
Haishuang Yan <yanhaishuang@cmss.chinamobile.com> |
geneve: fix tx_errors statistics Tx errors present summation of errors encountered while transmitting packets. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7c46a640 |
|
16-Jun-2016 |
Alexander Duyck <aduyck@mirantis.com> |
net: Merge VXLAN and GENEVE push notifiers into a single notifier This patch merges the notifiers for VXLAN and GENEVE into a single UDP tunnel notifier. The idea is that we will want to only have to make one notifier call to receive the list of ports for VXLAN and GENEVE tunnels that need to be offloaded. In addition we add a new set of ndo functions named ndo_udp_tunnel_add and ndo_udp_tunnel_del that are meant to allow us to track the tunnel meta-data such as port and address family as tunnels are added and removed. The tunnel meta-data is now transported in a structure named udp_tunnel_info which for now carries the type, address family, and port number. In the future this could be updated so that we can include a tuple of values including things such as the destination IP address and other fields. I also ended up going with a naming scheme that consisted of using the prefix udp_tunnel on function names. I applied this to the notifier and ndo ops as well so that it hopefully points to the fact that these are primarily used in the udp_tunnel functions. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e7b3db5e |
|
16-Jun-2016 |
Alexander Duyck <aduyck@mirantis.com> |
net: Combine GENEVE and VXLAN port notifiers into single functions This patch merges the GENEVE and VXLAN code so that both functions pass through a shared code path. This way we can start the effort of using a single function on the network device drivers to handle both of these tunnel types. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
86a98057 |
|
16-Jun-2016 |
Alexander Duyck <aduyck@mirantis.com> |
vxlan/geneve: Include udp_tunnel.h in vxlan/geneve.h and fixup includes This patch makes it so that we add udp_tunnel.h to vxlan.h and geneve.h header files. This is useful as I plan to move the generic handlers for the port offloads into the udp_tunnel header file and leave the vxlan and geneve headers to be a bit more protocol specific. I also went through and cleaned out a number of redundant includes that where in the .h and .c files for these drivers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
41009481 |
|
13-Jun-2016 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ovs/geneve: fix rtnl notifications on iface deletion The function geneve_dev_create_fb() (only used by ovs) never calls rtnl_configure_link(). The consequence is that dev->rtnl_link_state is never set to RTNL_LINK_INITIALIZED. During the deletion phase, the function rollback_registered_many() sends a RTM_DELLINK only if dev->rtnl_link_state is set to RTNL_LINK_INITIALIZED. Fixes: e305ac6cf5a1 ("geneve: Add support to collect tunnel metadata.") CC: Pravin B Shelar <pshelar@nicira.com> CC: Jesse Gross <jesse@nicira.com> CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
106da663 |
|
13-Jun-2016 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ovs/gre,geneve: fix error path when creating an iface After ipgre_newlink()/geneve_configure() call, the netdev is registered. Fixes: 7e059158d57b ("vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices") CC: David Wragg <david@weave.works> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e5aed006 |
|
19-May-2016 |
Hannes Frederic Sowa <hannes@stressinduktion.org> |
udp: prevent skbs lingering in tunnel socket queues In case we find a socket with encapsulation enabled we should call the encap_recv function even if just a udp header without payload is available. The callbacks are responsible for correctly verifying and dropping the packets. Also, in case the header validation fails for geneve and vxlan we shouldn't put the skb back into the socket queue, no one will pick them up there. Instead we can simply discard them in the respective encap_recv functions. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
229740c6 |
|
03-May-2016 |
Jarno Rajahalme <jarno@ovn.org> |
udp_offload: Set encapsulation before inner completes. UDP tunnel segmentation code relies on the inner offsets being set for an UDP tunnel GSO packet, but the inner *_complete() functions will set the inner offsets only if 'encapsulation' is set before calling them. Currently, udp_gro_complete() sets 'encapsulation' only after the inner *_complete() functions are done. This causes the inner offsets having invalid values after udp_gro_complete() returns, which in turn will make it impossible to properly segment the packet in case it needs to be forwarded, which would be visible to the user either as invalid packets being sent or as packet loss. This patch fixes this by setting skb's 'encapsulation' in udp_gro_complete() before calling into the inner complete functions, and by making each possible UDP tunnel gro_complete() callback set the inner_mac_header to the beginning of the tunnel payload. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Reviewed-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
43b8448c |
|
03-May-2016 |
Jarno Rajahalme <jarno@ovn.org> |
udp_tunnel: Remove redundant udp_tunnel_gro_complete(). The setting of the UDP tunnel GSO type is already performed by udp[46]_gro_complete(). Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
681e683f |
|
18-Apr-2016 |
Hannes Frederic Sowa <hannes@stressinduktion.org> |
geneve: break dependency with netdev drivers Equivalent to "vxlan: break dependency with netdev drivers", don't autoload geneve module in case the driver is loaded. Instead make the coupling weaker by using netdevice notifiers as proxy. Cc: Jesse Gross <jesse@kernel.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1ba64fac |
|
19-Apr-2016 |
Dan Carpenter <dan.carpenter@oracle.com> |
geneve: testing the wrong variable in geneve6_build_skb() We intended to test "err" and not "skb". Fixes: aed069df099c ('ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skb') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
aed069df |
|
14-Apr-2016 |
Alexander Duyck <aduyck@mirantis.com> |
ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skb This patch updates the IP tunnel core function iptunnel_handle_offloads so that we return an int and do not free the skb inside the function. This actually allows us to clean up several paths in several tunnels so that we can free the skb at one point in the path without having to have a secondary path if we are supporting tunnel offloads. In addition it should resolve some double-free issues I have found in the tunnels paths as I believe it is possible for us to end up triggering such an event in the case of fou or gue. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4a0090a9 |
|
05-Apr-2016 |
Tom Herbert <tom@herbertland.com> |
geneve: change to use UDP socket GRO Adapt geneve_gro_receive, geneve_gro_complete to take a socket argument. Set these functions in tunnel_config. Don't set udp_offloads any more. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
95caf6f7 |
|
18-Mar-2016 |
Daniel Borkmann <daniel@iogearbox.net> |
geneve: fix populating tclass in geneve_get_v6_dst The struct flowi6's flowi6_tos is not used in IPv6 route lookup, the traffic class information is handled in the flowi6's flowlabel member instead. For example, for policy routing, fib6_rule_match() uses ip6_tclass() that is applied on the flowlabel for matching on tclass, which would currently not work as expected. Fixes: 3a56f86f1be6 ("geneve: handle ipv6 priority like ipv4 tos") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c194cf93 |
|
09-Mar-2016 |
Alexander Duyck <aduyck@mirantis.com> |
gro: Defer clearing of flush bit in tunnel paths This patch updates the GRO handlers for GRE, VXLAN, GENEVE, and FOU so that we do not clear the flush bit until after we have called the next level GRO handler. Previously this was being cleared before parsing through the list of frames, however this resulted in several paths where either the bit needed to be reset but wasn't as in the case of FOU, or cases where it was being set as in GENEVE. By just deferring the clearing of the bit until after the next level protocol has been parsed we can avoid any unnecessary bit twiddling and avoid bugs. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8eb3b995 |
|
08-Mar-2016 |
Daniel Borkmann <daniel@iogearbox.net> |
geneve: support setting IPv6 flow label This work adds support for setting the IPv6 flow label for geneve per device and through collect metadata (ip_tunnel_key) frontends. Also here, the geneve dst cache does not need any special considerations, for the cases where caches can be used, the label is static per cache. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
13461144 |
|
08-Mar-2016 |
Daniel Borkmann <daniel@iogearbox.net> |
ip_tunnel: add support for setting flow label via collect metadata This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label from call sites. Currently, there's no such option and it's always set to zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so that flow-based tunnels via collect metadata frontends can make use of it. vxlan and geneve will be converted to add flow label support separately. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
db3c6139 |
|
04-Mar-2016 |
Daniel Borkmann <daniel@iogearbox.net> |
bpf, vxlan, geneve, gre: fix usage of dst_cache on xmit The assumptions from commit 0c1d70af924b ("net: use dst_cache for vxlan device"), 468dfffcd762 ("geneve: add dst caching support") and 3c1cb4d2604c ("net/ipv4: add dst cache support for gre lwtunnels") on dst_cache usage when ip_tunnel_info is used is unfortunately not always valid as assumed. While it seems correct for ip_tunnel_info front-ends such as OVS, eBPF however can fill in ip_tunnel_info for consumers like vxlan, geneve or gre with different remote dsts, tos, etc, therefore they cannot be assumed as packet independent. Right now vxlan, geneve, gre would cache the dst for eBPF and every packet would reuse the same entry that was first created on the initial route lookup. eBPF doesn't store/cache the ip_tunnel_info, so each skb may have a different one. Fix it by adding a flag that checks the ip_tunnel_info. Also the !tos test in vxlan needs to be handeled differently in this context as it is currently inferred from ip_tunnel_info as well if present. ip_tunnel_dst_cache_usable() helper is added for the three tunnel cases, which checks if we can use dst cache. Fixes: 0c1d70af924b ("net: use dst_cache for vxlan device") Fixes: 468dfffcd762 ("geneve: add dst caching support") Fixes: 3c1cb4d2604c ("net/ipv4: add dst cache support for gre lwtunnels") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
14ca0751 |
|
04-Mar-2016 |
Daniel Borkmann <daniel@iogearbox.net> |
bpf: support for access to tunnel options After eBPF being able to programmatically access/manage tunnel key meta data via commit d3aa45ce6b94 ("bpf: add helpers to access tunnel metadata") and more recently also for IPv6 through c6c33454072f ("bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key"), this work adds two complementary helpers to generically access their auxiliary tunnel options. Geneve and vxlan support this facility. For geneve, TLVs can be pushed, and for the vxlan case its GBP extension. I.e. setting tunnel key for geneve case only makes sense, if we can also read/write TLVs into it. In the GBP case, it provides the flexibility to easily map the group policy ID in combination with other helpers or maps. I chose to model this as two separate helpers, bpf_skb_{set,get}_tunnel_opt(), for a couple of reasons. bpf_skb_{set,get}_tunnel_key() is already rather complex by itself, and there may be cases for tunnel key backends where tunnel options are not always needed. If we would have integrated this into bpf_skb_{set,get}_tunnel_key() nevertheless, we are very limited with remaining helper arguments, so keeping compatibility on structs in case of passing in a flat buffer gets more cumbersome. Separating both also allows for more flexibility and future extensibility, f.e. options could be fed directly from a map, etc. Moreover, change geneve's xmit path to test only for info->options_len instead of TUNNEL_GENEVE_OPT flag. This makes it more consistent with vxlan's xmit path and allows for avoiding to specify a protocol flag in the API on xmit, so it can be protocol agnostic. Having info->options_len is enough information that is needed. Tested with vxlan and geneve. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
14f1f724 |
|
19-Feb-2016 |
Alexander Duyck <aduyck@mirantis.com> |
GENEVE: Support outer IPv4 Tx checksums by default This change makes it so that if UDP CSUM is not specified we will default to enabling it. The main motivation behind this is the fact that with the use of outer checksum we can greatly improve the performance for GENEVE tunnels on hardware that doesn't know how to parse them. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c868ee70 |
|
17-Feb-2016 |
Paolo Abeni <pabeni@redhat.com> |
lwt: fix rx checksum setting for lwt devices tunneling over ipv6 the commit 35e2d1152b22 ("tunnels: Allow IPv6 UDP checksums to be correctly controlled.") changed the default xmit checksum setting for lwt vxlan/geneve ipv6 tunnels, so that now the checksum is not set into external UDP header. This commit changes the rx checksum setting for both lwt vxlan/geneve devices created by openvswitch accordingly, so that lwt over ipv6 tunnel pairs are again able to communicate with default values. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jiri Benc <jbenc@redhat.com> Acked-by: Jesse Gross <jesse@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fc41cdb3 |
|
17-Feb-2016 |
Jiri Benc <jbenc@redhat.com> |
geneve: clear IFF_TX_SKB_SHARING ether_setup sets IFF_TX_SKB_SHARING but this is not supported by geneve as it modifies the skb on xmit. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7f290c94 |
|
18-Feb-2016 |
Jiri Benc <jbenc@redhat.com> |
iptunnel: scrub packet in iptunnel_pull_header Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call skb_scrub_packet directly instead. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9fc47545 |
|
18-Feb-2016 |
Jiri Benc <jbenc@redhat.com> |
geneve: move geneve device lookup before iptunnel_pull_header This is in preparation for iptunnel_pull_header calling skb_scrub_packet. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1e9f12ec |
|
18-Feb-2016 |
Jiri Benc <jbenc@redhat.com> |
geneve: implement geneve_get_sk_family helper Similarly to the existing vxlan_get_sk_family. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
aeee0e66 |
|
18-Feb-2016 |
David Wragg <david@weave.works> |
geneve: Refine MTU limit Calculate the maximum MTU taking into account the size of headers involved in GENEVE encapsulation, as for other tunnel types. Changes in v3: - Correct comment style Changes in v2: - Conform more closely to ip_tunnel_change_mtu - Exclude GENEVE options from max MTU calculation Signed-off-by: David Wragg <david@weave.works> Acked-by: Jesse Gross <jesse@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
468dfffc |
|
12-Feb-2016 |
Paolo Abeni <pabeni@redhat.com> |
geneve: add dst caching support use generic dst implementation for both plain geneve devices and lwtunnels. In case of UDP traffic with datagram length below MTU this give about 2% performance increase for plain geneve tunnel over ipv4, about 65% performance increase for ipv6 tunnel. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7e059158 |
|
09-Feb-2016 |
David Wragg <david@weave.works> |
vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could transmit vxlan packets of any size, constrained only by the ability to send out the resulting packets. 4.3 introduced netdevs corresponding to tunnel vports. These netdevs have an MTU, which limits the size of a packet that can be successfully encapsulated. The default MTU values are low (1500 or less), which is awkwardly small in the context of physical networks supporting jumbo frames, and leads to a conspicuous change in behaviour for userspace. Instead, set the MTU on openvswitch-created netdevs to be the relevant maximum (i.e. the maximum IP packet size minus any relevant overhead), effectively restoring the behaviour prior to 4.3. Signed-off-by: David Wragg <david@weave.works> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
55e5bfb5 |
|
09-Feb-2016 |
David Wragg <david@weave.works> |
geneve: Relax MTU constraints Allow the MTU of geneve devices to be set to large values, in order to exploit underlying networks with larger frame sizes. GENEVE does not have a fixed encapsulation overhead (an openvswitch rule can add variable length options), so there is no relevant maximum MTU to enforce. A maximum of IP_MAX_MTU is used instead. Encapsulated packets that are too big for the underlying network will get dropped on the floor. Signed-off-by: David Wragg <david@weave.works> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
35e2d115 |
|
20-Jan-2016 |
Jesse Gross <jesse@kernel.org> |
tunnels: Allow IPv6 UDP checksums to be correctly controlled. When configuring checksums on UDP tunnels, the flags are different for IPv4 vs. IPv6 (and reversed). However, when lightweight tunnels are enabled the flags used are always the IPv4 versions, which are ignored in the IPv6 code paths. This uses the correct IPv6 flags, so checksums can be controlled appropriately. Fixes: a725e514 ("vxlan: metadata based tunneling for IPv6") Fixes: abe492b4 ("geneve: UDP checksum configuration via netlink") Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
787d7ac3 |
|
07-Jan-2016 |
Hannes Frederic Sowa <hannes@stressinduktion.org> |
udp: restrict offloads to one namespace udp tunnel offloads tend to aggregate datagrams based on inner headers. gro engine gets notified by tunnel implementations about possible offloads. The match is solely based on the port number. Imagine a tunnel bound to port 53, the offloading will look into all DNS packets and tries to aggregate them based on the inner data found within. This could lead to data corruption and malformed DNS packets. While this patch minimizes the problem and helps an administrator to find the issue by querying ip tunnel/fou, a better way would be to match on the specific destination ip address so if a user space socket is bound to the same address it will conflict. Cc: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
039f5062 |
|
24-Dec-2015 |
Pravin B Shelar <pshelar@nicira.com> |
ip_tunnel: Move stats update to iptunnel_xmit() By moving stats update into iptunnel_xmit(), we can simplify iptunnel_xmit() usage. With this change there is no need to call another function (iptunnel_xmit_stats()) to update stats in tunnel xmit code path. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
184fc8b5 |
|
23-Dec-2015 |
Paolo Abeni <pabeni@redhat.com> |
geneve: initialize needed_headroom Currently the needed_headroom field for the geneve device is left to the default value. This patch set it to space required for basic geneve encapsulation, so that we can avoid the skb head re-allocation on xmit. This give a 6% speedup for unsegment traffic on geneve tunnel. v1 -> v2: - add ETH_HLEN for the lower device to the needed headroom Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
05ca4029 |
|
14-Dec-2015 |
Singhai, Anjali <anjali.singhai@intel.com> |
geneve: Add geneve_get_rx_port support This patch adds an op that the drivers can call into to get existing geneve ports. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a8170d2b |
|
14-Dec-2015 |
Singhai, Anjali <anjali.singhai@intel.com> |
geneve: Add geneve udp port offload for ethernet devices Add ndo_ops to add/del UDP ports to a device that supports geneve offload. v2: Comment fix. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
abe492b4 |
|
10-Dec-2015 |
Tom Herbert <tom@herbertland.com> |
geneve: UDP checksum configuration via netlink Add support to enable and disable UDP checksums via netlink. This is similar to how VXLAN and GUE allow this. This includes support for enabling the UDP zero checksum (for both TX and RX). Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a322a1bc |
|
07-Dec-2015 |
Pravin B Shelar <pshelar@nicira.com> |
geneve: Fix IPv6 xmit stats update. Call to iptunnel_xmit_stats() is not required after udp-tunnel6-xmit. By calling iptunnel_xmit_stats() results in incorrect device stats. Following patch drops this call. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b8812fa8 |
|
27-Oct-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: add IPv6 bits to geneve_fill_metadata_dst Signed-off-by: John W. Linville <linville@tuxdriver.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3a56f86f |
|
26-Oct-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: handle ipv6 priority like ipv4 tos Other callers of udp_tunnel6_xmit_skb just pass 0 for the prio argument. Jesse Gross <jesse@nicira.com> suggested that prio is really the same as IPv4's tos and should be handled the same, so this is my interpretation of that suggestion. Signed-off-by: John W. Linville <linville@tuxdriver.com> Reported-by: Jesse Gross <jesse@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8ed66f0e |
|
26-Oct-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: implement support for IPv6-based tunnels NOTE: Link-local IPv6 addresses for remote endpoints are not supported, since the driver currently has no capacity for binding a geneve interface to a specific link. Signed-off-by: John W. Linville <linville@tuxdriver.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fc4099f1 |
|
22-Oct-2015 |
Pravin B Shelar <pshelar@nicira.com> |
openvswitch: Fix egress tunnel info. While transitioning to netdev based vport we broke OVS feature which allows user to retrieve tunnel packet egress information for lwtunnel devices. Following patch fixes it by introducing ndo operation to get the tunnel egress info. Same ndo operation can be used for lwtunnel devices and compat ovs-tnl-vport devices. So after adding such device operation we can remove similar operation from ovs-vport. Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e277de5f |
|
16-Oct-2015 |
Jesse Gross <jesse@nicira.com> |
tunnels: Don't require remote endpoint or ID during creation. Before lightweight tunnels existed, it really didn't make sense to create a tunnel that was not fully specified, such as without a destination IP address - the resulting packets would go nowhere. However, with lightweight tunnels, the opposite is true - it doesn't make sense to require this information when it will be provided later on by the route. This loosens the requirements for this information. An alternative would be to allow the relaxed version only when COLLECT_METADATA is enabled. However, since there are several variations on this theme (such as NBMA tunnels in GRE), just dropping the restrictions seems the most consistent across tunnels and with the existing configuration. CC: John Linville <linville@tuxdriver.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7bbe33ff |
|
22-Sep-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: use network byte order for destination port config parameter This is primarily for consistancy with vxlan and other tunnels which use network byte order for similar parameters. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
08399efc |
|
21-Sep-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: ensure ECN info is handled properly in all tx/rx paths Partially due to a pre-exising "thinko", the new metadata-based tx/rx paths were handling ECN propagation differently than the traditional tx/rx paths. This patch removes the "thinko" (involving multiple ip_hdr assignments) on the rx path and corrects the ECN handling on both the rx and tx paths. Signed-off-by: John W. Linville <linville@tuxdriver.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5eb8f289 |
|
18-Sep-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: remove vlan-related feature assignment The code handling vlan tag insertion was dropped in commit 371bd1061d29 ("geneve: Consolidate Geneve functionality in single module."). Now we need to drop the related vlan feature bits in the netdev structure. Signed-off-by: John W. Linville <linville@tuxdriver.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4c222798 |
|
30-Aug-2015 |
Pravin B Shelar <pshelar@nicira.com> |
ip-tunnel: Use API to access tunnel metadata options. Currently tun-info options pointer is used in few cases to pass options around. But tunnel options can be accessed using ip_tunnel_info_opts() API without using the pointer. Following patch removes the redundant pointer and consistently make use of API. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Reviewed-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8e816df8 |
|
28-Aug-2015 |
Jesse Gross <jesse@nicira.com> |
geneve: Use GRO cells infrastructure. Geneve can benefit from GRO at the device level in a manner similar to other tunnels, especially as hardware offloads are still emerging. After this patch, aggregated frames are seen on the tunnel interface. Single stream throughput nearly doubles in ideal circumstances (on old hardware). Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7f9562a1 |
|
28-Aug-2015 |
Jiri Benc <jbenc@redhat.com> |
ip_tunnels: record IP version in tunnel info There's currently nothing preventing directing packets with IPv6 encapsulation data to IPv4 tunnels (and vice versa). If this happens, IPv6 addresses are incorrectly interpreted as IPv4 ones. Track whether the given ip_tunnel_key contains IPv4 or IPv6 data. Store this in ip_tunnel_info. Reject packets at appropriate places if they are supposed to be encapsulated into an incompatible protocol. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
46fa062a |
|
28-Aug-2015 |
Jiri Benc <jbenc@redhat.com> |
ip_tunnels: convert the mode field of ip_tunnel_info to flags The mode field holds a single bit of information only (whether the ip_tunnel_info struct is for rx or tx). Change the mode field to bit flags. This allows more mode flags to be added. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
66d47003 |
|
27-Aug-2015 |
Pravin B Shelar <pshelar@nicira.com> |
geneve: Move device hash table to geneve socket. This change simplifies Geneve Tunnel hash table management. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Reviewed-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
371bd106 |
|
27-Aug-2015 |
Pravin B Shelar <pshelar@nicira.com> |
geneve: Consolidate Geneve functionality in single module. geneve_core module handles send and receive functionality. This way OVS could use the Geneve API. Now with use of tunnel meatadata mode OVS can directly use Geneve netdevice. So there is no need for separate module for Geneve. Following patch consolidates Geneve protocol processing in single module. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e305ac6c |
|
27-Aug-2015 |
Pravin B Shelar <pshelar@nicira.com> |
geneve: Add support to collect tunnel metadata. Following patch create new tunnel flag which enable tunnel metadata collection on given device. These devices can be used by tunnel metadata based routing or by OVS. Geneve Consolidation patch get rid of collect_md_tun to simplify tunnel lookup further. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
cd7918b3 |
|
27-Aug-2015 |
Pravin B Shelar <pshelar@nicira.com> |
geneve: Make dst-port configurable. Add netlink interface to configure Geneve UDP port number. So that user can configure it for a Gevene device. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
980c394c |
|
27-Aug-2015 |
Pravin B Shelar <pshelar@nicira.com> |
geneve: Use skb mark and protocol to lookup route. On packet transmit path geneve need to lookup route. Following patch improves route lookup using more parameters. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
87cd3dca |
|
27-Aug-2015 |
Pravin B Shelar <pshelar@nicira.com> |
geneve: Initialize ethernet address in device setup. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ed961ac2 |
|
18-Aug-2015 |
Phil Sutter <phil@nwl.cc> |
net: geneve: convert to using IFF_NO_QUEUE Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d8951125 |
|
01-Jun-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: allow user to specify TOS info for tunnel frames Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8760ce58 |
|
01-Jun-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: allow user to specify TTL for tunnel frames Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2d07dc79 |
|
12-May-2015 |
John W. Linville <linville@tuxdriver.com> |
geneve: add initial netdev driver for GENEVE tunnels This is an initial implementation of a netdev driver for GENEVE tunnels. This implementation uses a fixed UDP port, and only supports point-to-point links with specific partner endpoints. Only IPv4 links are supported at this time. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|