#
a3522a2e |
|
09-Feb-2024 |
Guillaume Nault <gnault@redhat.com> |
ipv4: Set the routing scope properly in ip_route_output_ports(). Set scope automatically in ip_route_output_ports() (using the socket SOCK_LOCALROUTE flag). This way, callers don't have to overload the tos with the RTO_ONLINK flag, like RT_CONN_FLAGS() does. For callers that don't pass a struct sock, this doesn't change anything as the scope is still set to RT_SCOPE_UNIVERSE when sk is NULL. Callers that passed a struct sock and used RT_CONN_FLAGS(sk) or RT_CONN_FLAGS_TOS(sk, tos) for the tos are modified to use ip_sock_tos(sk) and RT_TOS(tos) respectively, as overloading tos with the RTO_ONLINK flag now becomes unnecessary. In drivers/net/amt.c, all ip_route_output_ports() calls use a 0 tos parameter, ignoring the SOCK_LOCALROUTE flag of the socket. But the sk parameter is a kernel socket, which doesn't have any configuration path for setting SOCK_LOCALROUTE anyway. Therefore, ip_route_output_ports() will continue to initialise scope with RT_SCOPE_UNIVERSE and amt.c doesn't need to be modified. Also, remove RT_CONN_FLAGS() and RT_CONN_FLAGS_TOS() from route.h as these macros are now unused. The objective is to eventually remove RTO_ONLINK entirely to allow converting ->flowi4_tos to dscp_t. This will ensure proper isolation between the DSCP and ECN bits, thus minimising the risk of introducing bugs where TOS values interfere with ECN. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/dacfd2ab40685e20959ab7b53c427595ba229e7d.1707496938.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
c274af22 |
|
16-Aug-2023 |
Eric Dumazet <edumazet@google.com> |
inet: introduce inet->inet_flags Various inet fields are currently racy. do_ip_setsockopt() and do_ip_getsockopt() are mostly holding the socket lock, but some (fast) paths do not. Use a new inet->inet_flags to hold atomic bits in the series. Remove inet->cmsg_flags, and use instead 9 bits from inet_flags. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
dc97391e |
|
23-Jun-2023 |
David Howells <dhowells@redhat.com> |
sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) Remove ->sendpage() and ->sendpage_locked(). sendmsg() with MSG_SPLICE_PAGES should be used instead. This allows multiple pages and multipage folios to be passed through. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org cc: mptcp@lists.linux.dev cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
e1d001fa |
|
09-Jun-2023 |
Breno Leitao <leitao@debian.org> |
net: ioctl: Use kernel memory on protocol ioctl callbacks Most of the ioctls to net protocols operates directly on userspace argument (arg). Usually doing get_user()/put_user() directly in the ioctl callback. This is not flexible, because it is hard to reuse these functions without passing userspace buffers. Change the "struct proto" ioctls to avoid touching userspace memory and operate on kernel buffers, i.e., all protocol's ioctl callbacks is adapted to operate on a kernel memory other than on userspace (so, no more {put,get}_user() and friends being called in the ioctl callback). This changes the "struct proto" ioctl format in the following way: int (*ioctl)(struct sock *sk, int cmd, - unsigned long arg); + int *karg); (Important to say that this patch does not touch the "struct proto_ops" protocols) So, the "karg" argument, which is passed to the ioctl callback, is a pointer allocated to kernel space memory (inside a function wrapper). This buffer (karg) may contain input argument (copied from userspace in a prep function) and it might return a value/buffer, which is copied back to userspace if necessary. There is not one-size-fits-all format (that is I am using 'may' above), but basically, there are three type of ioctls: 1) Do not read from userspace, returns a result to userspace 2) Read an input parameter from userspace, and does not return anything to userspace 3) Read an input from userspace, and return a buffer to userspace. The default case (1) (where no input parameter is given, and an "int" is returned to userspace) encompasses more than 90% of the cases, but there are two other exceptions. Here is a list of exceptions: * Protocol RAW: * cmd = SIOCGETVIFCNT: * input and output = struct sioc_vif_req * cmd = SIOCGETSGCNT * input and output = struct sioc_sg_req * Explanation: for the SIOCGETVIFCNT case, userspace passes the input argument, which is struct sioc_vif_req. Then the callback populates the struct, which is copied back to userspace. * Protocol RAW6: * cmd = SIOCGETMIFCNT_IN6 * input and output = struct sioc_mif_req6 * cmd = SIOCGETSGCNT_IN6 * input and output = struct sioc_sg_req6 * Protocol PHONET: * cmd == SIOCPNADDRESOURCE | SIOCPNDELRESOURCE * input int (4 bytes) * Nothing is copied back to userspace. For the exception cases, functions sock_sk_ioctl_inout() will copy the userspace input, and copy it back to kernel space. The wrapper that prepare the buffer and put the buffer back to user is sk_ioctl(), so, instead of calling sk->sk_prot->ioctl(), the callee now calls sk_ioctl(), which will handle all cases. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230609152800.830401-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
154e07c1 |
|
30-Mar-2023 |
Andrea Righi <andrea.righi@canonical.com> |
l2tp: generate correct module alias strings Commit 65b32f801bfb ("uapi: move IPPROTO_L2TP to in.h") moved the definition of IPPROTO_L2TP from a define to an enum, but since __stringify doesn't work properly with enums, we ended up breaking the modalias strings for the l2tp modules: $ modinfo l2tp_ip l2tp_ip6 | grep alias alias: net-pf-2-proto-IPPROTO_L2TP alias: net-pf-2-proto-2-type-IPPROTO_L2TP alias: net-pf-10-proto-IPPROTO_L2TP alias: net-pf-10-proto-2-type-IPPROTO_L2TP Use the resolved number directly in MODULE_ALIAS_*() macros (as we already do with SOCK_DGRAM) to fix the alias strings: $ modinfo l2tp_ip l2tp_ip6 | grep alias alias: net-pf-2-proto-115 alias: net-pf-2-proto-115-type-2 alias: net-pf-10-proto-115 alias: net-pf-10-proto-115-type-2 Moreover, fix the ordering of the parameters passed to MODULE_ALIAS_NET_PF_PROTO_TYPE() by switching proto and type. Fixes: 65b32f801bfb ("uapi: move IPPROTO_L2TP to in.h") Link: https://lore.kernel.org/lkml/ZCQt7hmodtUaBlCP@righiandr-XPS-13-7390 Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ff009403 |
|
13-May-2022 |
Eric Dumazet <edumazet@google.com> |
l2tp: use add READ_ONCE() to fetch sk->sk_bound_dev_if Use READ_ONCE() in paths not holding the socket lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ec095263 |
|
11-Apr-2022 |
Oliver Hartkopp <socketcan@hartkopp.net> |
net: remove noblock parameter from recvmsg() entities The internal recvmsg() functions have two parameters 'flags' and 'noblock' that were merged inside skb_recv_datagram(). As a follow up patch to commit f4b41f062c42 ("net: remove noblock parameter from skb_recv_datagram()") this patch removes the separate 'noblock' parameter for recvmsg(). Analogue to the referenced patch for skb_recv_datagram() the 'flags' and 'noblock' parameters are unnecessarily split up with e.g. err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); or in err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); instead of simply using only flags all the time and check for MSG_DONTWAIT where needed (to preserve for the formerly separated no(n)block condition). Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
f4b41f06 |
|
04-Apr-2022 |
Oliver Hartkopp <socketcan@hartkopp.net> |
net: remove noblock parameter from skb_recv_datagram() skb_recv_datagram() has two parameters 'flags' and 'noblock' that are merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)' As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags' into 'flags' and 'noblock' with finally obsolete bit operations like this: skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc); And this is not even done consistently with the 'flags' parameter. This patch removes the obsolete and costly splitting into two parameters and only performs bit operations when really needed on the caller side. One missing conversion thankfully reported by kernel test robot. I missed to enable kunit tests to build the mctp code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7f553ff2 |
|
07-Jun-2021 |
Zheng Yongjun <zhengyongjun3@huawei.com> |
l2tp: Fix spelling mistakes Fix some spelling mistakes in comments: negociated ==> negotiated dont ==> don't Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5796254e |
|
17-May-2021 |
Yejune Deng <yejune.deng@gmail.com> |
net: Remove the member netns_ok Every protocol has the 'netns_ok' member and it is euqal to 1. The 'if (!prot->netns_ok)' always false in inet_add_protocol(). Signed-off-by: Yejune Deng <yejunedeng@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
45faeff1 |
|
03-Sep-2020 |
Tom Parkin <tparkin@katalix.com> |
l2tp: make magic feather checks more useful The l2tp tunnel and session structures contain a "magic feather" field which was originally intended to help trace lifetime bugs in the code. Since the introduction of the shared kernel refcount code in refcount.h, and l2tp's porting to those APIs, we are covered by the refcount code's checks and warnings. Duplicating those checks in the l2tp code isn't useful. However, magic feather checks are still useful to help to detect bugs stemming from misuse/trampling of the sk_user_data pointer in struct sock. The l2tp code makes extensive use of sk_user_data to stash pointers to the tunnel and session structures, and if another subsystem overwrites sk_user_data it's important to detect this. As such, rework l2tp's magic feather checks to focus on validating the tunnel and session data structures when they're extracted from sk_user_data. * Add a new accessor function l2tp_sk_to_tunnel which contains a magic feather check, and is used by l2tp_core and l2tp_ip[6] * Comment l2tp_udp_encap_recv which doesn't use this new accessor function because of the specific nature of the codepath it is called in * Drop l2tp_session_queue_purge's check on the session magic feather: it is called from code which is walking the tunnel session list, and hence doesn't need validation * Drop l2tp_session_free's check on the tunnel magic feather: the intention of this check is covered by refcount.h's reference count sanity checking * Add session magic validation in pppol2tp_ioctl. On failure return -EBADF, which mirrors the approach in pppol2tp_[sg]etsockopt. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
12923365 |
|
22-Aug-2020 |
Tom Parkin <tparkin@katalix.com> |
l2tp: don't log data frames l2tp had logging to trace data frame receipt and transmission, including code to dump packet contents. This was originally intended to aid debugging of core l2tp packet handling, but is of limited use now that code is stable. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ca7885db |
|
28-Jul-2020 |
Tom Parkin <tparkin@katalix.com> |
l2tp: tweak exports for l2tp_recv_common and l2tp_ioctl All of the l2tp subsystem's exported symbols are exported using EXPORT_SYMBOL_GPL, except for l2tp_recv_common and l2tp_ioctl. These functions alone are not useful without the rest of the l2tp infrastructure, so there's no practical benefit to these symbols using a different export policy. Change these exports to use EXPORT_SYMBOL_GPL for consistency with the rest of l2tp. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
95075150 |
|
24-Jul-2020 |
Tom Parkin <tparkin@katalix.com> |
l2tp: avoid multiple assignments checkpatch warns about multiple assignments. Update l2tp accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0febc7b3 |
|
22-Jul-2020 |
Tom Parkin <tparkin@katalix.com> |
l2tp: cleanup comparisons to NULL checkpatch warns about comparisons to NULL, e.g. CHECK: Comparison to NULL could be written "!rt" #474: FILE: net/l2tp/l2tp_ip.c:474: + if (rt == NULL) { These sort of comparisons are generally clearer and more readable the way checkpatch suggests, so update l2tp accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
20dcb110 |
|
22-Jul-2020 |
Tom Parkin <tparkin@katalix.com> |
l2tp: cleanup comments Modify some l2tp comments to better adhere to kernel coding style, as reported by checkpatch.pl. Add descriptive comments for the l2tp per-net spinlocks to document their use. Fix an incorrect comment in l2tp_recv_common: RFC2661 section 5.4 states that: "The LNS controls enabling and disabling of sequence numbers by sending a data message with or without sequence numbers present at any time during the life of a session." l2tp handles this correctly in l2tp_recv_common, but the comment around the code was incorrect and confusing. Fix up the comment accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b71a61cc |
|
22-Jul-2020 |
Tom Parkin <tparkin@katalix.com> |
l2tp: cleanup whitespace use Fix up various whitespace issues as reported by checkpatch.pl: * remove spaces around operators where appropriate, * add missing blank lines following declarations, * remove multiple blank lines, or trailing blank lines at the end of functions. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b6238c04 |
|
17-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
net/ipv4: remove compat_ip_{get,set}sockopt Handle the few cases that need special treatment in-line using in_compat_syscall(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8c918ffb |
|
17-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
net: remove compat_sock_common_{get,set}sockopt Add the compat handling to sock_common_{get,set}sockopt instead, keyed of in_compat_syscall(). This allow to remove the now unused ->compat_{get,set}sockopt methods from struct proto_ops. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
02c71b14 |
|
29-May-2020 |
Eric Dumazet <edumazet@google.com> |
l2tp: do not use inet_hash()/inet_unhash() syzbot recently found a way to crash the kernel [1] Issue here is that inet_hash() & inet_unhash() are currently only meant to be used by TCP & DCCP, since only these protocols provide the needed hashinfo pointer. L2TP uses a single list (instead of a hash table) This old bug became an issue after commit 610236587600 ("bpf: Add new cgroup attach type to enable sock modifications") since after this commit, sk_common_release() can be called while the L2TP socket is still considered 'hashed'. general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 7063 Comm: syz-executor654 Not tainted 5.7.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00 RSP: 0018:ffffc90001777d30 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242 RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008 RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1 R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0 R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00 FS: 0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: sk_common_release+0xba/0x370 net/core/sock.c:3210 inet_create net/ipv4/af_inet.c:390 [inline] inet_create+0x966/0xe00 net/ipv4/af_inet.c:248 __sock_create+0x3cb/0x730 net/socket.c:1428 sock_create net/socket.c:1479 [inline] __sys_socket+0xef/0x200 net/socket.c:1521 __do_sys_socket net/socket.c:1530 [inline] __se_sys_socket net/socket.c:1528 [inline] __x64_sys_socket+0x6f/0xb0 net/socket.c:1528 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x441e29 Code: e8 fc b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffdce184148 EFLAGS: 00000246 ORIG_RAX: 0000000000000029 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441e29 RDX: 0000000000000073 RSI: 0000000000000002 RDI: 0000000000000002 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000402c30 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace 23b6578228ce553e ]--- RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00 RSP: 0018:ffffc90001777d30 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242 RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008 RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1 R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0 R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00 FS: 0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Cc: Andrii Nakryiko <andriin@fb.com> Reported-by: syzbot+3610d489778b57cc8031@syzkaller.appspotmail.com
|
#
895b5c9f |
|
29-Sep-2019 |
Florian Westphal <fw@strlen.de> |
netfilter: drop bridge nf reset from nf_reset commit 174e23810cd31 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
2874c5fd |
|
27-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
c7cbdbf2 |
|
17-Apr-2019 |
Arnd Bergmann <arnd@arndb.de> |
net: rework SIOCGSTAMP ioctl handling The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many socket protocol handlers, and all of those end up calling the same sock_get_timestamp()/sock_get_timestampns() helper functions, which results in a lot of duplicate code. With the introduction of 64-bit time_t on 32-bit architectures, this gets worse, as we then need four different ioctl commands in each socket protocol implementation. To simplify that, let's add a new .gettstamp() operation in struct proto_ops, and move ioctl implementation into the common sock_ioctl()/compat_sock_ioctl_trans() functions that these all go through. We can reuse the sock_get_timestamp() implementation, but generalize it so it can deal with both native and compat mode, as well as timeval and timespec structures. Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4522a70d |
|
29-Jan-2019 |
Jacob Wen <jian.w.wen@oracle.com> |
l2tp: fix reading optional fields of L2TPv3 Use pskb_may_pull() to make sure the optional fields are in skb linear parts, so we can safely read them later. It's easy to reproduce the issue with a net driver that supports paged skb data. Just create a L2TPv3 over IP tunnel and then generates some network traffic. Once reproduced, rx err in /sys/kernel/debug/l2tp/tunnels will increase. Changes in v4: 1. s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/ 2. s/tunnel->version != L2TP_HDR_VER_2/tunnel->version == L2TP_HDR_VER_3/ 3. Add 'Fixes' in commit messages. Changes in v3: 1. To keep consistency, move the code out of l2tp_recv_common. 2. Use "net" instead of "net-next", since this is a bug fix. Changes in v2: 1. Only fix L2TPv3 to make code simple. To fix both L2TPv3 and L2TPv2, we'd better refactor l2tp_recv_common. It's complicated to do so. 2. Reloading pointers after pskb_may_pull Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support") Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support") Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Jacob Wen <jian.w.wen@oracle.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
01e28b92 |
|
10-Aug-2018 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: split l2tp_session_get() l2tp_session_get() is used for two different purposes. If 'tunnel' is NULL, the session is searched globally in the supplied network namespace. Otherwise it is searched exclusively in the tunnel context. Callers always know the context in which they need to search the session. But some of them do provide both a namespace and a tunnel, making the semantic of the call unclear. This patch defines l2tp_tunnel_get_session() for lookups done in a tunnel and restricts l2tp_session_get() to namespace searches. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2b139e6b |
|
25-Jul-2018 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: remove ->recv_payload_hook The tunnel reception hook is only used by l2tp_ppp for skipping PPP framing bytes. This is a session specific operation, but once a PPP session sets ->recv_payload_hook on its tunnel, all frames received by the tunnel will enter pppol2tp_recv_payload_hook(), including those targeted at Ethernet sessions (an L2TPv3 tunnel can multiplex PPP and Ethernet sessions). So this mechanism is wrong, and uselessly complex. Let's just move this functionality to the pppol2tp rx handler and drop ->recv_payload_hook. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a11e1d43 |
|
28-Jun-2018 |
Linus Torvalds <torvalds@linux-foundation.org> |
Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
db5051ea |
|
09-Apr-2018 |
Christoph Hellwig <hch@lst.de> |
net: convert datagram_poll users tp ->poll_mask Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d00fa9ad |
|
23-Feb-2018 |
James Chapman <jchapman@katalix.com> |
l2tp: fix races with tunnel socket close The tunnel socket tunnel->sock (struct sock) is accessed when preparing a new ppp session on a tunnel at pppol2tp_session_init. If the socket is closed by a thread while another is creating a new session, the threads race. In pppol2tp_connect, the tunnel object may be created if the pppol2tp socket is associated with the special session_id 0 and the tunnel socket is looked up using the provided fd. When handling this, pppol2tp_connect cannot sock_hold the tunnel socket to prevent it being destroyed during pppol2tp_connect since this may itself may race with the socket being destroyed. Doing sockfd_lookup in pppol2tp_connect isn't sufficient to prevent tunnel->sock going away either because a given tunnel socket fd may be reused between calls to pppol2tp_connect. Instead, have l2tp_tunnel_create sock_hold the tunnel socket before it does sockfd_put. This ensures that the tunnel's socket is always extant while the tunnel object exists. Hold a ref on the socket until the tunnel is destroyed and ensure that all tunnel destroy paths go through a common function (l2tp_tunnel_delete) since this will do the final sock_put to release the tunnel socket. Since the tunnel's socket is now guaranteed to exist if the tunnel exists, we no longer need to use sockfd_lookup via l2tp_sock_to_tunnel to derive the tunnel from the socket since this is always sk_user_data. Also, sessions no longer sock_hold the tunnel socket since sessions already hold a tunnel ref and the tunnel sock will not be freed until the tunnel is freed. Removing these sock_holds in l2tp_session_register avoids a possible sock leak in the pppol2tp_connect error path if l2tp_session_register succeeds but attaching a ppp channel fails. The pppol2tp_connect error path could have been fixed instead and have the sock ref dropped when the session is freed, but doing a sock_put of the tunnel socket when the session is freed would require a new session_free callback. It is simpler to just remove the sock_hold of the tunnel socket in l2tp_session_register, now that the tunnel socket lifetime is guaranteed. Finally, some init code in l2tp_tunnel_create is reordered to ensure that the new tunnel object's refcount is set and the tunnel socket ref is taken before the tunnel socket destructor callbacks are set. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 0 PID: 4360 Comm: syzbot_19c09769 Not tainted 4.16.0-rc2+ #34 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 RIP: 0010:pppol2tp_session_init+0x1d6/0x500 RSP: 0018:ffff88001377fb40 EFLAGS: 00010212 RAX: dffffc0000000000 RBX: ffff88001636a940 RCX: ffffffff84836c1d RDX: 0000000000000045 RSI: 0000000055976744 RDI: 0000000000000228 RBP: ffff88001377fb60 R08: ffffffff84836bc8 R09: 0000000000000002 R10: ffff88001377fab8 R11: 0000000000000001 R12: 0000000000000000 R13: ffff88001636aac8 R14: ffff8800160f81c0 R15: 1ffff100026eff76 FS: 00007ffb3ea66700(0000) GS:ffff88001a400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020e77000 CR3: 0000000016261000 CR4: 00000000000006f0 Call Trace: pppol2tp_connect+0xd18/0x13c0 ? pppol2tp_session_create+0x170/0x170 ? __might_fault+0x115/0x1d0 ? lock_downgrade+0x860/0x860 ? __might_fault+0xe5/0x1d0 ? security_socket_connect+0x8e/0xc0 SYSC_connect+0x1b6/0x310 ? SYSC_bind+0x280/0x280 ? __do_page_fault+0x5d1/0xca0 ? up_read+0x1f/0x40 ? __do_page_fault+0x3c8/0xca0 SyS_connect+0x29/0x30 ? SyS_accept+0x40/0x40 do_syscall_64+0x1e0/0x730 ? trace_hardirqs_off_thunk+0x1a/0x1c entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x7ffb3e376259 RSP: 002b:00007ffeda4f6508 EFLAGS: 00000202 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000020e77012 RCX: 00007ffb3e376259 RDX: 000000000000002e RSI: 0000000020e77000 RDI: 0000000000000004 RBP: 00007ffeda4f6540 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400b60 R13: 00007ffeda4f6660 R14: 0000000000000000 R15: 0000000000000000 Code: 80 3d b0 ff 06 02 00 0f 84 07 02 00 00 e8 13 d6 db fc 49 8d bc 24 28 02 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 f a 48 c1 ea 03 <80> 3c 02 00 0f 85 ed 02 00 00 4d 8b a4 24 28 02 00 00 e8 13 16 Fixes: 80d84ef3ff1dd ("l2tp: prevent l2tp_tunnel_delete racing with userspace close") Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9b2c45d4 |
|
12-Feb-2018 |
Denys Vlasenko <dvlasenk@redhat.com> |
net: make getname() functions return length rather than use int* parameter Changes since v1: Added changes in these files: drivers/infiniband/hw/usnic/usnic_transport.c drivers/staging/lustre/lnet/lnet/lib-socket.c drivers/target/iscsi/iscsi_target_login.c drivers/vhost/net.c fs/dlm/lowcomms.c fs/ocfs2/cluster/tcp.c security/tomoyo/network.c Before: All these functions either return a negative error indicator, or store length of sockaddr into "int *socklen" parameter and return zero on success. "int *socklen" parameter is awkward. For example, if caller does not care, it still needs to provide on-stack storage for the value it does not need. None of the many FOO_getname() functions of various protocols ever used old value of *socklen. They always just overwrite it. This change drops this parameter, and makes all these functions, on success, return length of sockaddr. It's always >= 0 and can be differentiated from an error. Tests in callers are changed from "if (err)" to "if (err < 0)", where needed. rpc_sockname() lost "int buflen" parameter, since its only use was to be passed to kernel_getsockname() as &buflen and subsequently not used in any way. Userspace API is not changed. text data bss dec hex filename 30108430 2633624 873672 33615726 200ef6e vmlinux.before.o 30108109 2633612 873672 33615393 200ee21 vmlinux.o Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: David S. Miller <davem@davemloft.net> CC: linux-kernel@vger.kernel.org CC: netdev@vger.kernel.org CC: linux-bluetooth@vger.kernel.org CC: linux-decnet-user@lists.sourceforge.net CC: linux-wireless@vger.kernel.org CC: linux-rdma@vger.kernel.org CC: linux-sctp@vger.kernel.org CC: linux-nfs@vger.kernel.org CC: linux-x25@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8f7dc9ae |
|
03-Nov-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: don't use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6 Using l2tp_tunnel_find() in l2tp_ip_recv() is wrong for two reasons: * It doesn't take a reference on the returned tunnel, which makes the call racy wrt. concurrent tunnel deletion. * The lookup is only based on the tunnel identifier, so it can return a tunnel that doesn't match the packet's addresses or protocol. For example, a packet sent to an L2TPv3 over IPv6 tunnel can be delivered to an L2TPv2 over UDPv4 tunnel. This is worse than a simple cross-talk: when delivering the packet to an L2TP over UDP tunnel, the corresponding socket is UDP, where ->sk_backlog_rcv() is NULL. Calling sk_receive_skb() will then crash the kernel by trying to execute this callback. And l2tp_tunnel_find() isn't even needed here. __l2tp_ip_bind_lookup() properly checks the socket binding and connection settings. It was used as a fallback mechanism for finding tunnels that didn't have their data path registered yet. But it's not limited to this case and can be used to replace l2tp_tunnel_find() in the general case. Fix l2tp_ip6 in the same way. Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support") Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a4346210 |
|
31-Oct-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: remove ->ref() and ->deref() The ->ref() and ->deref() callbacks are unused since PPP stopped using them in ee40fb2e1eb5 ("l2tp: protect sock pointer of struct pppol2tp_session with RCU"). We can thus remove them from struct l2tp_session and drop the do_ref parameter of l2tp_session_get*(). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
61b9a047 |
|
31-Mar-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: fix race in l2tp_recv_common() Taking a reference on sessions in l2tp_recv_common() is racy; this has to be done by the callers. To this end, a new function is required (l2tp_session_get()) to atomically lookup a session and take a reference on it. Callers then have to manually drop this reference. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
94d7ee0b |
|
29-Mar-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: hold tunnel socket when handling control frames in l2tp_ip and l2tp_ip6 The code following l2tp_tunnel_find() expects that a new reference is held on sk. Either sk_receive_skb() or the discard_put error path will drop a reference from the tunnel's socket. This issue exists in both l2tp_ip and l2tp_ip6. Fixes: a3c18422a4b4 ("l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv()") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
51fb60eb |
|
26-Feb-2017 |
Paul Hüber <phueber@kernsp.in> |
l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv l2tp_ip_backlog_recv may not return -1 if the packet gets dropped. The return value is passed up to ip_local_deliver_finish, which treats negative values as an IP protocol number for resubmission. Signed-off-by: Paul Hüber <phueber@kernsp.in> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
72fb96e7 |
|
09-Feb-2017 |
Eric Dumazet <edumazet@google.com> |
l2tp: do not use udp_ioctl() udp_ioctl(), as its name suggests, is used by UDP protocols, but is also used by L2TP :( L2TP should use its own handler, because it really does not look the same. SIOCINQ for instance should not assume UDP checksum or headers. Thanks to Andrey and syzkaller team for providing the report and a nice reproducer. While crashes only happen on recent kernels (after commit 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")), this probably needs to be backported to older kernels. Fixes: 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue") Fixes: 85584672012e ("udp: Fix udp_poll() and ioctl()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c5fdae04 |
|
06-Jan-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: rework socket comparison in __l2tp_ip*_bind_lookup() Split conditions, so that each test becomes clearer. Also, for l2tp_ip, check if "laddr" is 0. This prevents a socket from binding to the unspecified address when other sockets are already bound using the same device (if any), connection ID and namespace. Same thing for l2tp_ip6: add ipv6_addr_any(laddr) and ipv6_addr_any(raddr) tests to ensure that an IPv6 unspecified address passed as parameter is properly treated a wildcard. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
986f7cbc |
|
06-Jan-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: remove useless NULL check in __l2tp_ip*_bind_lookup() If "l2tp" was NULL, that'd mean "sk" is NULL too. This can't happen since "sk" is returned by sk_for_each_bound(). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bb39b0bd |
|
06-Jan-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: make __l2tp_ip*_bind_lookup() parameters 'const' Add const qualifier wherever possible for __l2tp_ip_bind_lookup() and __l2tp_ip6_bind_lookup(). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8cf2f704 |
|
06-Jan-2017 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: remove redundant addr_len check in l2tp_ip_bind() addr_len's value has already been verified at this point. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a9b2dff8 |
|
30-Dec-2016 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups For connected sockets, __l2tp_ip{,6}_bind_lookup() needs to check the remote IP when looking for a matching socket. Otherwise a connected socket can receive traffic not originating from its peer. Drop l2tp_ip_bind_lookup() and l2tp_ip6_bind_lookup() instead of updating their prototype, as these functions aren't used. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
df90e688 |
|
29-Nov-2016 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: fix lookup for sockets not bound to a device in l2tp_ip When looking up an l2tp socket, we must consider a null netdevice id as wild card. There are currently two problems caused by __l2tp_ip_bind_lookup() not considering 'dif' as wild card when set to 0: * A socket bound to a device (i.e. with sk->sk_bound_dev_if != 0) never receives any packet. Since __l2tp_ip_bind_lookup() is called with dif == 0 in l2tp_ip_recv(), sk->sk_bound_dev_if is always different from 'dif' so the socket doesn't match. * Two sockets, one bound to a device but not the other, can be bound to the same address. If the first socket binding to the address is the one that is also bound to a device, the second socket can bind to the same address without __l2tp_ip_bind_lookup() noticing the overlap. To fix this issue, we need to consider that any null device index, be it 'sk->sk_bound_dev_if' or 'dif', matches with any other value. We also need to pass the input device index to __l2tp_ip_bind_lookup() on reception so that sockets bound to a device never receive packets from other devices. This patch fixes l2tp_ip6 in the same way. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d5e3a190 |
|
29-Nov-2016 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind() It's not enough to check for sockets bound to same address at the beginning of l2tp_ip{,6}_bind(): even if no socket is found at that time, a socket with the same address could be bound before we take the l2tp lock again. This patch moves the lookup right before inserting the new socket, so that no change can ever happen to the list between address lookup and socket insertion. Care is taken to avoid side effects on the socket in case of failure. That is, modifications of the socket are done after the lookup, when binding is guaranteed to succeed, and before releasing the l2tp lock, so that concurrent lookups will always see fully initialised sockets. For l2tp_ip, 'ret' is set to -EINVAL before checking the SOCK_ZAPPED bit. Error code was mistakenly set to -EADDRINUSE on error by commit 32c231164b76 ("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()"). Using -EINVAL restores original behaviour. For l2tp_ip6, the lookup is now always done with the correct bound device. Before this patch, when binding to a link-local address, the lookup was done with the original sk->sk_bound_dev_if, which was later overwritten with addr->l2tp_scope_id. Lookup is now performed with the final sk->sk_bound_dev_if value. Finally, the (addr_len >= sizeof(struct sockaddr_in6)) check has been dropped: addr is a sockaddr_l2tpip6 not sockaddr_in6 and addr_len has already been checked at this point (this part of the code seems to have been copy-pasted from net/ipv6/raw.c). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a3c18422 |
|
29-Nov-2016 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv() Socket must be held while under the protection of the l2tp lock; there is no guarantee that sk remains valid after the read_unlock_bh() call. Same issue for l2tp_ip and l2tp_ip6. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0382a25a |
|
29-Nov-2016 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: lock socket before checking flags in connect() Socket flags aren't updated atomically, so the socket must be locked while reading the SOCK_ZAPPED flag. This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch also brings error handling for __ip6_datagram_connect() failures. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
32c23116 |
|
18-Nov-2016 |
Guillaume Nault <g.nault@alphalink.fr> |
l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind() Lock socket before checking the SOCK_ZAPPED flag in l2tp_ip6_bind(). Without lock, a concurrent call could modify the socket flags between the sock_flag(sk, SOCK_ZAPPED) test and the lock_sock() call. This way, a socket could be inserted twice in l2tp_ip6_bind_table. Releasing it would then leave a stale pointer there, generating use-after-free errors when walking through the list or modifying adjacent entries. BUG: KASAN: use-after-free in l2tp_ip6_close+0x22e/0x290 at addr ffff8800081b0ed8 Write of size 8 by task syz-executor/10987 CPU: 0 PID: 10987 Comm: syz-executor Not tainted 4.8.0+ #39 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 ffff880031d97838 ffffffff829f835b ffff88001b5a1640 ffff8800081b0ec0 ffff8800081b15a0 ffff8800081b6d20 ffff880031d97860 ffffffff8174d3cc ffff880031d978f0 ffff8800081b0e80 ffff88001b5a1640 ffff880031d978e0 Call Trace: [<ffffffff829f835b>] dump_stack+0xb3/0x118 lib/dump_stack.c:15 [<ffffffff8174d3cc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156 [< inline >] print_address_description mm/kasan/report.c:194 [<ffffffff8174d666>] kasan_report_error+0x1f6/0x4d0 mm/kasan/report.c:283 [< inline >] kasan_report mm/kasan/report.c:303 [<ffffffff8174db7e>] __asan_report_store8_noabort+0x3e/0x40 mm/kasan/report.c:329 [< inline >] __write_once_size ./include/linux/compiler.h:249 [< inline >] __hlist_del ./include/linux/list.h:622 [< inline >] hlist_del_init ./include/linux/list.h:637 [<ffffffff8579047e>] l2tp_ip6_close+0x22e/0x290 net/l2tp/l2tp_ip6.c:239 [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415 [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422 [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570 [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017 [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208 [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244 [<ffffffff813774f9>] task_work_run+0xf9/0x170 [<ffffffff81324aae>] do_exit+0x85e/0x2a00 [<ffffffff81326dc8>] do_group_exit+0x108/0x330 [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307 [<ffffffff811b49af>] do_signal+0x7f/0x18f0 [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156 [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259 [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6 Object at ffff8800081b0ec0, in cache L2TP/IPv6 size: 1448 Allocated: PID = 10987 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0 [ 1116.897025] [<ffffffff8174c9ad>] kasan_kmalloc+0xad/0xe0 [ 1116.897025] [<ffffffff8174cee2>] kasan_slab_alloc+0x12/0x20 [ 1116.897025] [< inline >] slab_post_alloc_hook mm/slab.h:417 [ 1116.897025] [< inline >] slab_alloc_node mm/slub.c:2708 [ 1116.897025] [< inline >] slab_alloc mm/slub.c:2716 [ 1116.897025] [<ffffffff817476a8>] kmem_cache_alloc+0xc8/0x2b0 mm/slub.c:2721 [ 1116.897025] [<ffffffff84c4f6a9>] sk_prot_alloc+0x69/0x2b0 net/core/sock.c:1326 [ 1116.897025] [<ffffffff84c58ac8>] sk_alloc+0x38/0xae0 net/core/sock.c:1388 [ 1116.897025] [<ffffffff851ddf67>] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:182 [ 1116.897025] [<ffffffff84c4af7b>] __sock_create+0x37b/0x640 net/socket.c:1153 [ 1116.897025] [< inline >] sock_create net/socket.c:1193 [ 1116.897025] [< inline >] SYSC_socket net/socket.c:1223 [ 1116.897025] [<ffffffff84c4b46f>] SyS_socket+0xef/0x1b0 net/socket.c:1203 [ 1116.897025] [<ffffffff85e4d685>] entry_SYSCALL_64_fastpath+0x23/0xc6 Freed: PID = 10987 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0 [ 1116.897025] [<ffffffff8174cf61>] kasan_slab_free+0x71/0xb0 [ 1116.897025] [< inline >] slab_free_hook mm/slub.c:1352 [ 1116.897025] [< inline >] slab_free_freelist_hook mm/slub.c:1374 [ 1116.897025] [< inline >] slab_free mm/slub.c:2951 [ 1116.897025] [<ffffffff81748b28>] kmem_cache_free+0xc8/0x330 mm/slub.c:2973 [ 1116.897025] [< inline >] sk_prot_free net/core/sock.c:1369 [ 1116.897025] [<ffffffff84c541eb>] __sk_destruct+0x32b/0x4f0 net/core/sock.c:1444 [ 1116.897025] [<ffffffff84c5aca4>] sk_destruct+0x44/0x80 net/core/sock.c:1452 [ 1116.897025] [<ffffffff84c5ad33>] __sk_free+0x53/0x220 net/core/sock.c:1460 [ 1116.897025] [<ffffffff84c5af23>] sk_free+0x23/0x30 net/core/sock.c:1471 [ 1116.897025] [<ffffffff84c5cb6c>] sk_common_release+0x28c/0x3e0 ./include/net/sock.h:1589 [ 1116.897025] [<ffffffff8579044e>] l2tp_ip6_close+0x1fe/0x290 net/l2tp/l2tp_ip6.c:243 [ 1116.897025] [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415 [ 1116.897025] [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422 [ 1116.897025] [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570 [ 1116.897025] [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017 [ 1116.897025] [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208 [ 1116.897025] [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244 [ 1116.897025] [<ffffffff813774f9>] task_work_run+0xf9/0x170 [ 1116.897025] [<ffffffff81324aae>] do_exit+0x85e/0x2a00 [ 1116.897025] [<ffffffff81326dc8>] do_group_exit+0x108/0x330 [ 1116.897025] [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307 [ 1116.897025] [<ffffffff811b49af>] do_signal+0x7f/0x18f0 [ 1116.897025] [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156 [ 1116.897025] [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [ 1116.897025] [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259 [ 1116.897025] [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6 Memory state around the buggy address: ffff8800081b0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8800081b0e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8800081b0e80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff8800081b0f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8800081b0f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The same issue exists with l2tp_ip_bind() and l2tp_ip_bind_table. Fixes: c51ce49735c1 ("l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case") Reported-by: Baozeng Ding <sploving1@gmail.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
286c72de |
|
20-Oct-2016 |
Eric Dumazet <edumazet@google.com> |
udp: must lock the socket in udp_disconnect() Baozeng Ding reported KASAN traces showing uses after free in udp_lib_get_port() and other related UDP functions. A CONFIG_DEBUG_PAGEALLOC=y kernel would eventually crash. I could write a reproducer with two threads doing : static int sock_fd; static void *thr1(void *arg) { for (;;) { connect(sock_fd, (const struct sockaddr *)arg, sizeof(struct sockaddr_in)); } } static void *thr2(void *arg) { struct sockaddr_in unspec; for (;;) { memset(&unspec, 0, sizeof(unspec)); connect(sock_fd, (const struct sockaddr *)&unspec, sizeof(unspec)); } } Problem is that udp_disconnect() could run without holding socket lock, and this was causing list corruptions. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5745b823 |
|
03-Apr-2016 |
Haishuang Yan <yanhaishuang@cmss.chinamobile.com> |
ipv4: l2tp: fix a potential issue in l2tp_ip_recv pskb_may_pull() can change skb->data, so we have to load ptr/optr at the right place. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
163c2e25 |
|
23-Sep-2015 |
stephen hemminger <stephen@networkplumber.org> |
l2tp: auto load IP modules When creating a IP encapsulated tunnel the necessary l2tp module should be loaded. It already works for UDP encapsulation, it just doesn't work for direct IP encap. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1b784140 |
|
02-Mar-2015 |
Ying Xue <ying.xue@windriver.com> |
net: Remove iocb argument from sendmsg and recvmsg After TIPC doesn't depend on iocb argument in its internal implementations of sendmsg() and recvmsg() hooks defined in proto structure, no any user is using iocb argument in them at all now. Then we can drop the redundant iocb argument completely from kinds of implementations of both sendmsg() and recvmsg() in the entire networking stack. Cc: Christoph Hellwig <hch@lst.de> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6ce8e9ce |
|
06-Apr-2014 |
Al Viro <viro@zeniv.linux.org.uk> |
new helper: memcpy_from_msg() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
51f3d02b |
|
05-Nov-2014 |
David S. Miller <davem@davemloft.net> |
net: Add and use skb_copy_datagram_msg() helper. This encapsulates all of the skb_copy_datagram_iovec() callers with call argument signature "skb, offset, msghdr->msg_iov, length". When we move to iov_iters in the networking, the iov_iter object will sit in the msghdr. Having a helper like this means there will be less places to touch during that transformation. Based upon descriptions and patch from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b26ba202 |
|
23-May-2014 |
Tom Herbert <therbert@google.com> |
net: Eliminate no_check from protosw It doesn't seem like an protocols are setting anything other than the default, and allowing to arbitrarily disable checksums for a whole protocol seems dangerous. This can be done on a per socket basis. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b0270e91 |
|
14-Apr-2014 |
Eric Dumazet <edumazet@google.com> |
ipv4: add a sock pointer to ip_queue_xmit() ip_queue_xmit() assumes the skb it has to transmit is attached to an inet socket. Commit 31c70d5956fc ("l2tp: keep original skb ownership") changed l2tp to not change skb ownership and thus broke this assumption. One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(), so that we do not assume skb->sk points to the socket used by l2tp tunnel. Fixes: 31c70d5956fc ("l2tp: keep original skb ownership") Reported-by: Zhan Jianyu <nasa4836@gmail.com> Tested-by: Zhan Jianyu <nasa4836@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
342dfc30 |
|
17-Jan-2014 |
Steffen Hurrle <steffen@hurrle.net> |
net: add build-time checks for msg->msg_name size This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg handler msg_name and msg_namelen logic"). DECLARE_SOCKADDR validates that the structure we use for writing the name information to is not larger than the buffer which is reserved for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR consistently in sendmsg code paths. Signed-off-by: Steffen Hurrle <steffen@hurrle.net> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bceaa902 |
|
17-Nov-2013 |
Hannes Frederic Sowa <hannes@stressinduktion.org> |
inet: prevent leakage of uninitialized memory to user in recv syscalls Only update *addr_len when we actually fill in sockaddr, otherwise we can return uninitialized memory from the stack to the caller in the recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL) checks because we only get called with a valid addr_len pointer either from sock_common_recvmsg or inet_recvmsg. If a blocking read waits on a socket which is concurrently shut down we now return zero and set msg_msgnamelen to 0. Reported-by: mpb <mpb.mail@gmail.com> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
93606317 |
|
19-Mar-2013 |
Tom Parkin <tparkin@katalix.com> |
l2tp: close sessions in ip socket destroy callback l2tp_core hooks UDP's .destroy handler to gain advance warning of a tunnel socket being closed from userspace. We need to do the same thing for IP-encapsulation sockets. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b67bfe0d |
|
27-Feb-2013 |
Sasha Levin <sasha.levin@oracle.com> |
hlist: drop the node parameter from iterators I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
9d6ddb19 |
|
05-Feb-2013 |
David S. Miller <davem@davemloft.net> |
l2tp: Make ipv4 protocol handler namespace aware. The infrastructure is already pretty much entirely there to allow this conversion. The tunnel and session lookups have per-namespace tables, and the ipv4 bind lookup includes the namespace in the lookup key. Set netns_ok in l2tp_ip_protocol. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4399a4df |
|
08-Jun-2012 |
Eric Dumazet <edumazet@google.com> |
l2tp: fix a race in l2tp_ip_sendmsg() Commit 081b1b1bb27f (l2tp: fix l2tp_ip_sendmsg() route handling) added a race, in case IP route cache is disabled. In this case, we should not do the dst_release(&rt->dst), since it'll free the dst immediately, instead of waiting a RCU grace period. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c51ce497 |
|
28-May-2012 |
James Chapman <jchapman@katalix.com> |
l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case An application may call connect() to disconnect a socket using an address with family AF_UNSPEC. The L2TP IP sockets were not handling this case when the socket is not bound and an attempt to connect() using AF_UNSPEC in such cases would result in an oops. This patch addresses the problem by protecting the sk_prot->disconnect() call against trying to unhash the socket before it is bound. The L2TP IPv4 and IPv6 sockets have the same problem. Both are fixed by this patch. The patch also adds more checks that the sockaddr supplied to bind() and connect() calls is valid. RIP: 0010:[<ffffffff82e133b0>] [<ffffffff82e133b0>] inet_unhash+0x50/0xd0 RSP: 0018:ffff88001989be28 EFLAGS: 00010293 Stack: ffff8800407a8000 0000000000000000 ffff88001989be78 ffffffff82e3a249 ffffffff82e3a050 ffff88001989bec8 ffff88001989be88 ffff8800407a8000 0000000000000010 ffff88001989bec8 ffff88001989bea8 ffffffff82e42639 Call Trace: [<ffffffff82e3a249>] udp_disconnect+0x1f9/0x290 [<ffffffff82e42639>] inet_dgram_connect+0x29/0x80 [<ffffffff82d012fc>] sys_connect+0x9c/0x100 Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a4ca44fa |
|
16-May-2012 |
Joe Perches <joe@perches.com> |
net: l2tp: Standardize logging styles Use more current logging styles. Add pr_fmt to prefix output appropriately. Convert printks to pr_<level>. Convert PRINTK macros to new l2tp_<level> macros. Neaten some <foo>_refcount debugging macros. Use print_hex_dump_bytes instead of hand-coded loops. Coalesce formats and align arguments. Some KERN_DEBUG output is not now emitted unless dynamic_debugging is enabled. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
84768edb |
|
01-May-2012 |
Sasha Levin <levinsasha928@gmail.com> |
net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg l2tp_ip_sendmsg could return without releasing socket lock, making it all the way to userspace, and generating the following warning: [ 130.891594] ================================================ [ 130.894569] [ BUG: lock held when returning to user space! ] [ 130.897257] 3.4.0-rc5-next-20120501-sasha #104 Tainted: G W [ 130.900336] ------------------------------------------------ [ 130.902996] trinity/8384 is leaving the kernel with locks still held! [ 130.906106] 1 lock held by trinity/8384: [ 130.907924] #0: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff82b9503f>] l2tp_ip_sendmsg+0x2f/0x550 Introduced by commit 2f16270 ("l2tp: Fix locking in l2tp_ip.c"). Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c8657fd5 |
|
29-Apr-2012 |
James Chapman <jchapman@katalix.com> |
l2tp: remove unused stats from l2tp_ip socket The l2tp_ip socket currently maintains packet/byte stats in its private socket structure. But these counters aren't exposed to userspace and so serve no purpose. The counters were also smp-unsafe. So this patch just gets rid of the stats. While here, change a couple of internal __u32 variables to u32. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
de3c7a18 |
|
29-Apr-2012 |
James Chapman <jchapman@katalix.com> |
l2tp: Use ip4_datagram_connect() in l2tp_ip_connect() Cleanup the l2tp_ip code to make use of an existing ipv4 support function. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c9be48dc |
|
09-Apr-2012 |
James Chapman <jchapman@katalix.com> |
l2tp: don't overwrite source address in l2tp_ip_bind() Applications using L2TP/IP sockets want to be able to bind() an L2TP/IP socket to set the local tunnel id while leaving the auto-assigned source address alone. So if no source address is supplied, don't overwrite the address already stored in the socket. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d1f224ae |
|
09-Apr-2012 |
James Chapman <jchapman@katalix.com> |
l2tp: fix refcount leak in l2tp_ip sockets The l2tp_ip socket close handler does not update the module refcount correctly which prevents module unload after the first bind() call on an L2TPv3 IP encapulation socket. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
68315801 |
|
24-Jan-2012 |
James Chapman <jchapman@katalix.com> |
l2tp: l2tp_ip - fix possible oops on packet receive When a packet is received on an L2TP IP socket (L2TPv3 IP link encapsulation), the l2tpip socket's backlog_rcv function calls xfrm4_policy_check(). This is not necessary, since it was called before the skb was added to the backlog. With CONFIG_NET_NS enabled, xfrm4_policy_check() will oops if skb->dev is null, so this trivial patch removes the call. This bug has always been present, but only when CONFIG_NET_NS is enabled does it cause problems. Most users are probably using UDP encapsulation for L2TP, hence the problem has only recently surfaced. EIP: 0060:[<c12bb62b>] EFLAGS: 00210246 CPU: 0 EIP is at l2tp_ip_recvmsg+0xd4/0x2a7 EAX: 00000001 EBX: d77b5180 ECX: 00000000 EDX: 00200246 ESI: 00000000 EDI: d63cbd30 EBP: d63cbd18 ESP: d63cbcf4 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Call Trace: [<c1218568>] sock_common_recvmsg+0x31/0x46 [<c1215c92>] __sock_recvmsg_nosec+0x45/0x4d [<c12163a1>] __sock_recvmsg+0x31/0x3b [<c1216828>] sock_recvmsg+0x96/0xab [<c10b2693>] ? might_fault+0x47/0x81 [<c10b2693>] ? might_fault+0x47/0x81 [<c1167fd0>] ? _copy_from_user+0x31/0x115 [<c121e8c8>] ? copy_from_user+0x8/0xa [<c121ebd6>] ? verify_iovec+0x3e/0x78 [<c1216604>] __sys_recvmsg+0x10a/0x1aa [<c1216792>] ? sock_recvmsg+0x0/0xab [<c105a99b>] ? __lock_acquire+0xbdf/0xbee [<c12d5a99>] ? do_page_fault+0x193/0x375 [<c10d1200>] ? fcheck_files+0x9b/0xca [<c10d1259>] ? fget_light+0x2a/0x9c [<c1216bbb>] sys_recvmsg+0x2b/0x43 [<c1218145>] sys_socketcall+0x16d/0x1a5 [<c11679f0>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c100305f>] sysenter_do_call+0x12/0x38 Code: c6 05 8c ea a8 c1 01 e8 0c d4 d9 ff 85 f6 74 07 3e ff 86 80 00 00 00 b9 17 b6 2b c1 ba 01 00 00 00 b8 78 ed 48 c1 e8 23 f6 d9 ff <ff> 76 0c 68 28 e3 30 c1 68 2d 44 41 c1 e8 89 57 01 00 83 c4 0c Signed-off-by: James Chapman <jchapman@katalix.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
081b1b1b |
|
11-Jun-2011 |
Eric Dumazet <eric.dumazet@gmail.com> |
l2tp: fix l2tp_ip_sendmsg() route handling l2tp_ip_sendmsg() in non connected mode incorrectly calls sk_setup_caps(). Subsequent send() calls send data to wrong destination. We can also avoid changing dst refcount in connected mode, using appropriate rcu locking. Once output route lookups can also be done under rcu, sendto() calls wont change dst refcounts too. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@conan.davemloft.net>
|
#
d9d8da80 |
|
06-May-2011 |
David S. Miller <davem@davemloft.net> |
inet: Pass flowi to ->queue_xmit(). This allows us to acquire the exact route keying information from the protocol, however that might be managed. It handles all of the possibilities, from the simplest case of storing the key in inet->cork.fl to the more complex setup SCTP has where individual transports determine the flow. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fdbb0f07 |
|
08-May-2011 |
David S. Miller <davem@davemloft.net> |
l2tp: Use cork flow in l2tp_ip_connect() and l2tp_ip_sendmsg() Now that the socket is consistently locked in these two routines, this transformation is legal. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2f16270f |
|
08-May-2011 |
David S. Miller <davem@davemloft.net> |
l2tp: Fix locking in l2tp_ip.c Both l2tp_ip_connect() and l2tp_ip_sendmsg() must take the socket lock. They both modify socket state non-atomically, and in particular l2tp_ip_sendmsg() increments socket private counters without using atomic operations. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
31e4543d |
|
03-May-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Make caller provide on-stack flow key to ip_route_output_ports(). Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ad227425 |
|
29-Apr-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Get route daddr from flow key in l2tp_ip_connect(). Now that output route lookups update the flow with destination address selection, we can fetch it from fl4->daddr instead of rt->rt_dst Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
44901666 |
|
29-Apr-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Fetch route saddr from flow key in l2tp_ip_connect(). Now that output route lookups update the flow with source address selection, we can fetch it from fl4->saddr instead of rt->rt_src Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
778865a5 |
|
28-Apr-2011 |
David S. Miller <davem@davemloft.net> |
l2tp: Fix inet_opt conversion. We don't actually hold the socket lock at this point, so the rcu_dereference_protected() isn't' correct. Thanks to Eric Dumazet for pointing this out. Thankfully, we're only interested in fetching the faddr value if srr is enabled, so we can simply make this an RCU sequence and use plain rcu_dereference(). Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f6d8bd05 |
|
21-Apr-2011 |
Eric Dumazet <eric.dumazet@gmail.com> |
inet: add RCU protection to inet->opt We lack proper synchronization to manipulate inet->opt ip_options Problem is ip_make_skb() calls ip_setup_cork() and ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options), without any protection against another thread manipulating inet->opt. Another thread can change inet->opt pointer and free old one under us. Use RCU to protect inet->opt (changed to inet->inet_opt). Instead of handling atomic refcounts, just copy ip_options when necessary, to avoid cache line dirtying. We cant insert an rcu_head in struct ip_options since its included in skb->cb[], so this patch is large because I had to introduce a new ip_options_rcu structure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2d7192d6 |
|
26-Apr-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Sanitize and simplify ip_route_{connect,newports}() These functions are used together as a unit for route resolution during connect(). They address the chicken-and-egg problem that exists when ports need to be allocated during connect() processing, yet such port allocations require addressing information from the routing code. It's currently more heavy handed than it needs to be, and in particular we allocate and initialize a flow object twice. Let the callers provide the on-stack flow object. That way we only need to initialize it once in the ip_route_connect() call. Later, if ip_route_newports() needs to do anything, it re-uses that flow object as-is except for the ports which it updates before the route re-lookup. Also, describe why this set of facilities are needed and how it works in a big comment. Signed-off-by: David S. Miller <davem@davemloft.net> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
|
#
e9c54999 |
|
27-Apr-2011 |
Lucas De Marchi <lucas.demarchi@profusion.mobi> |
Revert wrong fixes for common misspellings These changes were incorrectly fixed by codespell. They were now manually corrected. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
#
78fbfd8a |
|
11-Mar-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Create and use route lookup helpers. The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b23dd4fe |
|
02-Mar-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Make output route lookup return rtable directly. Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
273447b3 |
|
01-Mar-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Kill can_sleep arg to ip_route_output_flow() This boolean state is now available in the flow flags. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
420d44da |
|
01-Mar-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep" Since that is what the current vague "flags" argument means. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
abdf7e72 |
|
01-Mar-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Can final ip_route_connect() arg to boolean "can_sleep". Since that's what the current vague "flags" thing means. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e8d34a88 |
|
05-Dec-2010 |
Michal Marek <mmarek@suse.cz> |
l2tp: Fix modalias of l2tp_ip Using the SOCK_DGRAM enum results in "net-pf-2-proto-SOCK_DGRAM-type-115", so use the numeric value like it is done in net/dccp. Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
5811662b |
|
12-Nov-2010 |
Changli Gao <xiaosuo@gmail.com> |
net: use the macros defined for the members of flowi Use the macros defined for the members of flowi to clean the code up. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fc130840 |
|
21-Oct-2010 |
stephen hemminger <shemminger@vyatta.com> |
l2tp: make local function static Also moved the refcound inlines from l2tp_core.h to l2tp_core.c since only used in that one file. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e83726b0 |
|
21-Oct-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
l2tp: small cleanup Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
d8d1f30b |
|
11-Jun-2010 |
Changli Gao <xiaosuo@gmail.com> |
net-next: remove useless union keyword remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4e15ed4d |
|
15-Apr-2010 |
Shan Wei <shanwei@cn.fujitsu.com> |
net: replace ipfragok with skb->local_df As Herbert Xu said: we should be able to simply replace ipfragok with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function) has droped ipfragok and set local_df value properly. The patch kills the ipfragok parameter of .queue_xmit(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0d76751f |
|
02-Apr-2010 |
James Chapman <jchapman@katalix.com> |
l2tp: Add L2TPv3 IP encapsulation (no UDP) support This patch adds a new L2TPIP socket family and modifies the core to handle the case where there is no UDP header in the L2TP packet. L2TP/IP uses IP protocol 115. Since L2TP/UDP and L2TP/IP packets differ in layout, the datapath packet handling code needs changes too. Userspace uses an L2TPIP socket instead of a UDP socket when IP encapsulation is required. We can't use raw sockets for this because the semantics of raw sockets don't lend themselves to the socket-per-tunnel model - we need to Signed-off-by: David S. Miller <davem@davemloft.net>
|