Searched +hist:02 +hist:c30a84 (Results 1 - 25 of 41) sorted by path
/linux-master/include/linux/ | ||
H A D | fddidevice.h | diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | etherdevice.h | diff e80094a4 Mon Oct 18 15:10:02 MDT 2021 Jakub Kicinski <kuba@kernel.org> ethernet: add a helper for assigning port addresses We have 5 drivers which offset base MAC addr by port id. Create a helper for them. This helper takes care of overflows, which some drivers did not do, please complain if that's going to break anything! Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 48eab831 Thu Sep 02 12:10:37 MDT 2021 Jakub Kicinski <kuba@kernel.org> net: create netdev->dev_addr assignment helpers Recent work on converting address list to a tree made it obvious we need an abstraction around writing netdev->dev_addr. Without such abstraction updating the main device address is invisible to the core. Introduce a number of helpers which for now just wrap memcpy() but in the future can make necessary changes to the address tree. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 05428729 Thu Nov 02 03:36:48 MDT 2017 Egil Hjelmeland <privat@egil-hjelmeland.no> net: Define eth_stp_addr in linux/etherdevice.h The lan9303 driver defines eth_stp_addr as a synonym to eth_reserved_addr_base to get the STP ethernet address 01:80:c2:00:00:00. eth_reserved_addr_base is also used to define the start of Bridge Reserved ethernet address range, which happen to be the STP address. br_dev_setup refer to eth_reserved_addr_base as a definition of STP address. Clean up by: - Move the eth_stp_addr definition to linux/etherdevice.h - Use eth_stp_addr instead of eth_reserved_addr_base in br_dev_setup. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> diff 2bc80059 Thu Nov 01 03:12:02 MDT 2012 Ben Hutchings <bhutchings@solarflare.com> eth: Rename and properly align br_reserved_address array Since this array is no longer part of the bridge driver, it should have an 'eth' prefix not 'br'. We also assume that either it's 16-bit-aligned or the architecture has efficient unaligned access. Ensure the first of these is true by explicitly aligning it. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 1a0d6ae5 Thu Feb 09 02:48:54 MST 2012 Danny Kukawka <danny.kukawka@bisect.de> rename dev_hw_addr_random and remove redundant second Renamed dev_hw_addr_random to eth_hw_addr_random() to reflect that this function only assign a random ethernet address (MAC). Removed the second parameter (u8 *hwaddr), it's redundant since the also given net_device already contains net_device->dev_addr. Set it directly. Adapt igbvf and ixgbevf to the changed function. Small fix for ixgbevf_probe(): if ixgbevf_sw_init() fails (which means the device got no dev_addr) handle the error and jump to err_sw_init as already done by igbvf in similar case. Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: David S. Miller <davem@davemloft.net> diff 3b04ddde Tue Oct 09 02:40:57 MDT 2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | hippidevice.h | diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | if_arp.h | diff a4f39c9f Wed Aug 23 07:41:02 MDT 2023 Nicolas Dichtel <nicolas.dichtel@6wind.com> net: handle ARPHRD_PPP in dev_is_mac_header_xmit() The goal is to support a bpf_redirect() from an ethernet device (ingress) to a ppp device (egress). The l2 header is added automatically by the ppp driver, thus the ethernet header should be removed. CC: stable@vger.kernel.org Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Siwar Zitouni <siwar.zitouni@6wind.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 3b707c30 Thu Jan 24 04:07:02 MST 2019 Maciej Żenczykowski <maze@google.com> net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP __bpf_redirect() and act_mirred checks this boolean to determine whether to prefix an ethernet header. Signed-off-by: Maciej Żenczykowski <maze@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> diff 6752c8db Mon Mar 25 02:26:16 MDT 2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection. Inspection of upper layer protocol is considered harmful, especially if it is about ARP or other stateful upper layer protocol; driver cannot (and should not) have full state of them. IPv4 over Firewire module used to inspect ARP (both in sending path and in receiving path), and record peer's GUID, max packet size, max speed and fifo address. This patch removes such inspection by extending our "hardware address" definition to include other information as well: max packet size, max speed and fifo. By doing this, The neighbour module in networking subsystem can cache them. Note: As we have started ignoring sspd and max_rec in ARP/NDP, those information will not be used in the driver when sending. When a packet is being sent, the IP layer fills our pseudo header with the extended "hardware address", including GUID and fifo. The driver can look-up node-id (the real but rather volatile low-level address) by GUID, and then the module can send the packet to the wire using parameters provided in the extendedn hardware address. This approach is realistic because IP over IEEE1394 (RFC2734) and IPv6 over IEEE1394 (RFC3146) share same "hardware address" format in their address resolution protocols. Here, extended "hardware address" is defined as follows: union fwnet_hwaddr { u8 u[16]; struct { __be64 uniq_id; /* EUI-64 */ u8 max_rec; /* max packet size */ u8 sspd; /* max speed */ __be16 fifo_hi; /* hi 16bits of FIFO addr */ __be32 fifo_lo; /* lo 32bits of FIFO addr */ } __packed uc; }; Note that Hardware address is declared as union, so that we can map full IP address into this, when implementing MCAP (Multicast Cannel Allocation Protocol) for IPv6, but IP and ARP subsystem do not need to know this format in detail. One difference between original ARP (RFC826) and 1394 ARP (RFC2734) is that 1394 ARP Request/Reply do not contain the target hardware address field (aka ar$tha). This difference is handled in the ARP subsystem. CC: Stephan Gatzka <stephan.gatzka@gmail.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | net.h | diff c381b079 Fri Oct 02 02:27:28 MDT 2020 Coly Li <colyli@suse.de> net: introduce helper sendpage_ok() in include/linux/net.h The original problem was from nvme-over-tcp code, who mistakenly uses kernel_sendpage() to send pages allocated by __get_free_pages() without __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on tail pages, sending them by kernel_sendpage() may trigger a kernel panic from a corrupted kernel heap, because these pages are incorrectly freed in network stack as page_count 0 pages. This patch introduces a helper sendpage_ok(), it returns true if the checking page, - is not slab page: PageSlab(page) is false. - has page refcount: page_count(page) is not zero All drivers who want to send page to remote end by kernel_sendpage() may use this helper to check whether the page is OK. If the helper does not return true, the driver should try other non sendpage method (e.g. sock_no_sendpage()) to handle the page. Signed-off-by: Coly Li <colyli@suse.de> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Jan Kara <jack@suse.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Vlastimil Babka <vbabka@suse.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> diff c381b079 Fri Oct 02 02:27:28 MDT 2020 Coly Li <colyli@suse.de> net: introduce helper sendpage_ok() in include/linux/net.h The original problem was from nvme-over-tcp code, who mistakenly uses kernel_sendpage() to send pages allocated by __get_free_pages() without __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on tail pages, sending them by kernel_sendpage() may trigger a kernel panic from a corrupted kernel heap, because these pages are incorrectly freed in network stack as page_count 0 pages. This patch introduces a helper sendpage_ok(), it returns true if the checking page, - is not slab page: PageSlab(page) is false. - has page refcount: page_count(page) is not zero All drivers who want to send page to remote end by kernel_sendpage() may use this helper to check whether the page is OK. If the helper does not return true, the driver should try other non sendpage method (e.g. sock_no_sendpage()) to handle the page. Signed-off-by: Coly Li <colyli@suse.de> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Jan Kara <jack@suse.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Vlastimil Babka <vbabka@suse.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> diff a3f8683b Sun Jul 02 20:22:01 MDT 2017 Al Viro <viro@zeniv.linux.org.uk> ->poll() methods should return __poll_t The most common place to find POLL... bitmaps: return values of ->poll() and its subsystem counterparts. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff da9ba564 Wed Jun 07 18:05:02 MDT 2017 Jason A. Donenfeld <Jason@zx2c4.com> random: add get_random_{bytes,u32,u64,int,long,once}_wait family These functions are simple convenience wrappers that call wait_for_random_bytes before calling the respective get_random_* function. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> diff d8725c86 Wed Dec 10 22:02:50 MST 2014 Al Viro <viro@zeniv.linux.org.uk> get rid of the size argument of sock_sendmsg() it's equal to iov_iter_count(&msg->msg_iter) in all cases Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 1b784140 Mon Mar 02 00:37:48 MST 2015 Ying Xue <ying.xue@windriver.com> net: Remove iocb argument from sendmsg and recvmsg After TIPC doesn't depend on iocb argument in its internal implementations of sendmsg() and recvmsg() hooks defined in proto structure, no any user is using iocb argument in them at all now. Then we can drop the redundant iocb argument completely from kinds of implementations of both sendmsg() and recvmsg() in the entire networking stack. Cc: Christoph Hellwig <hch@lst.de> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff c68c7f5a Sat Oct 19 22:26:02 MDT 2013 Hannes Frederic Sowa <hannes@stressinduktion.org> net: fix build warnings because of net_get_random_once merge This patch fixes the following warning: In file included from include/linux/skbuff.h:27:0, from include/linux/netfilter.h:5, from include/net/netns/netfilter.h:5, from include/net/net_namespace.h:20, from include/linux/init_task.h:14, from init/init_task.c:1: include/linux/net.h:243:14: warning: 'struct static_key' declared inside parameter list [enabled by default] struct static_key *done_key); on x86_64 allnoconfig, um defconfig and ia64 allmodconfig and maybe others as well. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 228e548e Mon May 02 14:21:35 MDT 2011 Anton Blanchard <anton@samba.org> net: Add sendmmsg socket system call This patch adds a multiple message send syscall and is the send version of the existing recvmmsg syscall. This is heavily based on the patch by Arnaldo that added recvmmsg. I wrote a microbenchmark to test the performance gains of using this new syscall: http://ozlabs.org/~anton/junkcode/sendmmsg_test.c The test was run on a ppc64 box with a 10 Gbit network card. The benchmark can send both UDP and RAW ethernet packets. 64B UDP batch pkts/sec 1 804570 2 872800 (+ 8 %) 4 916556 (+14 %) 8 939712 (+17 %) 16 952688 (+18 %) 32 956448 (+19 %) 64 964800 (+20 %) 64B raw socket batch pkts/sec 1 1201449 2 1350028 (+12 %) 4 1461416 (+22 %) 8 1513080 (+26 %) 16 1541216 (+28 %) 32 1553440 (+29 %) 64 1557888 (+30 %) We see a 20% improvement in throughput on UDP send and 30% on raw socket send. [ Add sparc syscall entries. -DaveM ] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 1621e094 Mon Feb 01 02:44:19 MST 2010 Alexey Dobriyan <adobriyan@gmail.com> net: CONFIG_COMPAT redux Ifdef out struct proto_ops::compat_ioctl struct proto_ops::compat_setsockopt struct proto_ops::compat_getsockopt to make structures smaller on COMPAT=n kernels. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 717115e1 Fri Jul 25 02:45:58 MDT 2008 Dave Young <hidave.darkstar@gmail.com> printk ratelimiting rewrite All ratelimit user use same jiffies and burst params, so some messages (callbacks) will be lost. For example: a call printk_ratelimit(5 * HZ, 1) b call printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a, then b will will be supressed. - rewrite __ratelimit, and use a ratelimit_state as parameter. Thanks for hints from andrew. - Add WARN_ON_RATELIMIT, update rcupreempt.h - remove __printk_ratelimit - use __ratelimit in net_ratelimit Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
H A D | netdevice.h | diff 4d2bb0bf Mon Feb 12 02:50:55 MST 2024 Lorenzo Bianconi <lorenzo@kernel.org> xdp: rely on skb pointer reference in do_xdp_generic and netif_receive_generic_xdp Rely on skb pointer reference instead of the skb pointer in do_xdp_generic and netif_receive_generic_xdp routine signatures. This is a preliminary patch to add multi-buff support for xdp running in generic mode where we will need to reallocate the skb to avoid linearization and we will need to make it visible to do_xdp_generic() caller. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c09415b1f48c8620ef4d76deed35050a7bddf7c2.1707729884.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff d3d344a1 Tue Jan 02 09:22:20 MST 2024 Eric Dumazet <edumazet@google.com> net-device: move xdp_prog to net_device_read_rx xdp_prog is used in receive path, both from XDP enabled drivers and from netif_elide_gro(). This patch also removes two 4-bytes holes. Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240102162220.750823-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 0b068c71 Thu Sep 21 02:52:16 MDT 2023 Eric Dumazet <edumazet@google.com> net: add DEV_STATS_READ() helper Companion of DEV_STATS_INC() & DEV_STATS_ADD(). This is going to be used in the series. Use it in macsec_get_stats64(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 680ee045 Wed Aug 02 19:02:30 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: invert the netdevice.h vs xdp.h dependency xdp.h is far more specific and is included in only 67 other files vs netdevice.h's 1538 include sites. Make xdp.h include netdevice.h, instead of the other way around. This decreases the incremental allmodconfig builds size when xdp.h is touched from 5947 to 662 objects. Move bpf_prog_run_xdp() to xdp.h, seems appropriate and filter.h is a mega-header in its own right so it's nice to avoid xdp.h getting included there as well. The only unfortunate part is that the typedef for xdp_features_t has to move to netdevice.h, since its embedded in struct netdevice. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230803010230.1755386-4-kuba@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> diff 680ee045 Wed Aug 02 19:02:30 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: invert the netdevice.h vs xdp.h dependency xdp.h is far more specific and is included in only 67 other files vs netdevice.h's 1538 include sites. Make xdp.h include netdevice.h, instead of the other way around. This decreases the incremental allmodconfig builds size when xdp.h is touched from 5947 to 662 objects. Move bpf_prog_run_xdp() to xdp.h, seems appropriate and filter.h is a mega-header in its own right so it's nice to avoid xdp.h getting included there as well. The only unfortunate part is that the typedef for xdp_features_t has to move to netdevice.h, since its embedded in struct netdevice. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230803010230.1755386-4-kuba@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> diff 49e47a5b Wed Aug 02 19:02:29 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: move struct netdev_rx_queue out of netdevice.h struct netdev_rx_queue is touched in only a few places and having it defined in netdevice.h brings in the dependency on xdp.h, because struct xdp_rxq_info gets embedded in struct netdev_rx_queue. In prep for removal of xdp.h from netdevice.h move all the netdev_rx_queue stuff to a new header. We could technically break the new header up to avoid the sysfs.h include but it's so rarely included it doesn't seem to be worth it at this point. Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> diff 49e47a5b Wed Aug 02 19:02:29 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: move struct netdev_rx_queue out of netdevice.h struct netdev_rx_queue is touched in only a few places and having it defined in netdevice.h brings in the dependency on xdp.h, because struct xdp_rxq_info gets embedded in struct netdev_rx_queue. In prep for removal of xdp.h from netdevice.h move all the netdev_rx_queue stuff to a new header. We could technically break the new header up to avoid the sysfs.h include but it's so rarely included it doesn't seem to be worth it at this point. Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> diff 88c0a6b5 Sun Apr 02 06:37:55 MDT 2023 Vladimir Oltean <vladimir.oltean@nxp.com> net: create a netdev notifier for DSA to reject PTP on DSA master The fact that PTP 2-step TX timestamping is broken on DSA switches if the master also timestamps the same packets is documented by commit f685e609a301 ("net: dsa: Deny PTP on master if switch supports it"). We attempt to help the users avoid shooting themselves in the foot by making DSA reject the timestamping ioctls on an interface that is a DSA master, and the switch tree beneath it contains switches which are aware of PTP. The only problem is that there isn't an established way of intercepting ndo_eth_ioctl calls, so DSA creates avoidable burden upon the network stack by creating a struct dsa_netdevice_ops with overlaid function pointers that are manually checked from the relevant call sites. There used to be 2 such dsa_netdevice_ops, but now, ndo_eth_ioctl is the only one left. There is an ongoing effort to migrate driver-visible hardware timestamping control from the ndo_eth_ioctl() based API to a new ndo_hwtstamp_set() model, but DSA actively prevents that migration, since dsa_master_ioctl() is currently coded to manually call the master's legacy ndo_eth_ioctl(), and so, whenever a network device driver would be converted to the new API, DSA's restrictions would be circumvented, because any device could be used as a DSA master. The established way for unrelated modules to react on a net device event is via netdevice notifiers. So we create a new notifier which gets called whenever there is an attempt to change hardware timestamping settings on a device. Finally, there is another reason why a netdev notifier will be a good idea, besides strictly DSA, and this has to do with PHY timestamping. With ndo_eth_ioctl(), all MAC drivers must manually call phy_has_hwtstamp() before deciding whether to act upon SIOCSHWTSTAMP, otherwise they must pass this ioctl to the PHY driver via phy_mii_ioctl(). With the new ndo_hwtstamp_set() API, it will be desirable to simply not make any calls into the MAC device driver when timestamping should be performed at the PHY level. But there exist drivers, such as the lan966x switch, which need to install packet traps for PTP regardless of whether they are the layer that provides the hardware timestamps, or the PHY is. That would be impossible to support with the new API. The proposal there, too, is to introduce a netdev notifier which acts as a better cue for switching drivers to add or remove PTP packet traps, than ndo_hwtstamp_set(). The one introduced here "almost" works there as well, except for the fact that packet traps should only be installed if the PHY driver succeeded to enable hardware timestamping, whereas here, we need to deny hardware timestamping on the DSA master before it actually gets enabled. This is why this notifier is called "PRE_", and the notifier that would get used for PHY timestamping and packet traps would be called NETDEV_CHANGE_HWTSTAMP. This isn't a new concept, for example NETDEV_CHANGEUPPER and NETDEV_PRECHANGEUPPER do the same thing. In expectation of future netlink UAPI, we also pass a non-NULL extack pointer to the netdev notifier, and we make DSA populate it with an informative reason for the rejection. To avoid making it go to waste, we make the ioctl-based dev_set_hwtstamp() create a fake extack and print the message to the kernel log. Link: https://lore.kernel.org/netdev/20230401191215.tvveoi3lkawgg6g4@skbuf/ Link: https://lore.kernel.org/netdev/20230310164451.ls7bbs6pdzs4m6pw@skbuf/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 4b397c06 Fri Mar 10 12:11:09 MST 2023 Eric Dumazet <edumazet@google.com> net: tunnels: annotate lockless accesses to dev->needed_headroom IP tunnels can apparently update dev->needed_headroom in their xmit path. This patch takes care of three tunnels xmit, and also the core LL_RESERVED_SPACE() and LL_RESERVED_SPACE_EXTRA() helpers. More changes might be needed for completeness. BUG: KCSAN: data-race in ip_tunnel_xmit / ip_tunnel_xmit read to 0xffff88815b9da0ec of 2 bytes by task 888 on cpu 1: ip_tunnel_xmit+0x1270/0x1730 net/ipv4/ip_tunnel.c:803 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 write to 0xffff88815b9da0ec of 2 bytes by task 2379 on cpu 0: ip_tunnel_xmit+0x1294/0x1730 net/ipv4/ip_tunnel.c:804 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip6_finish_output2+0x9bc/0xc50 net/ipv6/ip6_output.c:134 __ip6_finish_output net/ipv6/ip6_output.c:195 [inline] ip6_finish_output+0x39a/0x4e0 net/ipv6/ip6_output.c:206 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip6_output+0xeb/0x220 net/ipv6/ip6_output.c:227 dst_output include/net/dst.h:444 [inline] NF_HOOK include/linux/netfilter.h:302 [inline] mld_sendpack+0x438/0x6a0 net/ipv6/mcast.c:1820 mld_send_cr net/ipv6/mcast.c:2121 [inline] mld_ifc_work+0x519/0x7b0 net/ipv6/mcast.c:2653 process_one_work+0x3e6/0x750 kernel/workqueue.c:2390 worker_thread+0x5f2/0xa10 kernel/workqueue.c:2537 kthread+0x1ac/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 value changed: 0x0dd4 -> 0x0e14 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 2379 Comm: kworker/0:0 Not tainted 6.3.0-rc1-syzkaller-00002-g8ca09d5fa354-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023 Workqueue: mld mld_ifc_work Fixes: 8eb30be0352d ("ipv6: Create ip6_tnl_xmit") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230310191109.2384387-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff f3da86dc Fri Dec 02 11:41:33 MST 2022 Leon Romanovsky <leon@kernel.org> xfrm: add support to HW update soft and hard limits Both in RX and TX, the traffic that performs IPsec packet offload transformation is accounted by HW. It is needed to properly handle hard limits that require to drop the packet. It means that XFRM core needs to update internal counters with the one that accounted by the HW, so new callbacks are introduced in this patch. In case of soft or hard limit is occurred, the driver should call to xfrm_state_check_expire() that will perform key rekeying exactly as done by XFRM core. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> |
/linux-master/ | ||
H A D | CREDITS | diff 3cf5abf2 Tue Mar 26 02:51:30 MDT 2024 Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> MAINTAINERS: Drop Gustavo Pimentel as PCI DWC Maintainer Gustavo Pimentel seems to have left Synopsys, so his email is bouncing. And there is no indication from him expressing willingless to continue contributing to the driver. Drop him from the MAINTAINERS entry and add a CREDITS entry. Link: https://lore.kernel.org/r/20240326085130.12487-1-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> [bhelgaas: add CREDITS entry] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> diff 44042fb0 Wed Jan 31 02:34:34 MST 2024 Sekhar Nori <nsekhar@ti.com> MAINTAINERS: drop Sekhar Nori My TI e-mail address will become inactive soon. Drop it. Add an entry to CREDITS file for work done on TI DaVinci family SoCs. Signed-off-by: Sekhar Nori <nsekhar@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Link: https://lore.kernel.org/r/20240131093434.55652-1-nsekhar@ti.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> diff da14d1fe Tue Jan 09 09:45:11 MST 2024 Jakub Kicinski <kuba@kernel.org> MAINTAINERS: eth: mtk: move John to CREDITS John is still active in other bits of the kernel but not much on the MediaTek ethernet switch side. Our scripts report: Subsystem MEDIATEK ETHERNET DRIVER Changes 81 / 384 (21%) Last activity: 2023-12-21 Felix Fietkau <nbd@nbd.name>: Author c6d96df9fa2c 2023-05-02 00:00:00 42 Tags c6d96df9fa2c 2023-05-02 00:00:00 48 John Crispin <john@phrozen.org>: Sean Wang <sean.wang@mediatek.com>: Author 880c2d4b2fdf 2019-06-03 00:00:00 5 Tags a5d75538295b 2020-04-07 00:00:00 7 Mark Lee <Mark-MC.Lee@mediatek.com>: Author 8d66a8183d0c 2019-11-14 00:00:00 4 Tags 8d66a8183d0c 2019-11-14 00:00:00 4 Lorenzo Bianconi <lorenzo@kernel.org>: Author 7cb8cd4daacf 2023-12-21 00:00:00 98 Tags 7cb8cd4daacf 2023-12-21 00:00:00 112 Top reviewers: [18]: horms@kernel.org [15]: leonro@nvidia.com [8]: rmk+kernel@armlinux.org.uk INACTIVE MAINTAINER John Crispin <john@phrozen.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20240109164517.3063131-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff da14d1fe Tue Jan 09 09:45:11 MST 2024 Jakub Kicinski <kuba@kernel.org> MAINTAINERS: eth: mtk: move John to CREDITS John is still active in other bits of the kernel but not much on the MediaTek ethernet switch side. Our scripts report: Subsystem MEDIATEK ETHERNET DRIVER Changes 81 / 384 (21%) Last activity: 2023-12-21 Felix Fietkau <nbd@nbd.name>: Author c6d96df9fa2c 2023-05-02 00:00:00 42 Tags c6d96df9fa2c 2023-05-02 00:00:00 48 John Crispin <john@phrozen.org>: Sean Wang <sean.wang@mediatek.com>: Author 880c2d4b2fdf 2019-06-03 00:00:00 5 Tags a5d75538295b 2020-04-07 00:00:00 7 Mark Lee <Mark-MC.Lee@mediatek.com>: Author 8d66a8183d0c 2019-11-14 00:00:00 4 Tags 8d66a8183d0c 2019-11-14 00:00:00 4 Lorenzo Bianconi <lorenzo@kernel.org>: Author 7cb8cd4daacf 2023-12-21 00:00:00 98 Tags 7cb8cd4daacf 2023-12-21 00:00:00 112 Top reviewers: [18]: horms@kernel.org [15]: leonro@nvidia.com [8]: rmk+kernel@armlinux.org.uk INACTIVE MAINTAINER John Crispin <john@phrozen.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20240109164517.3063131-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 16a1d968 Mon Oct 02 12:43:43 MDT 2023 Vlastimil Babka <vbabka@suse.cz> mm/slab: remove mm/slab.c and slab_def.h Remove the SLAB implementation. Update CREDITS. Also update and properly sort the SLOB entry there. RIP SLAB allocator (1996 - 2024) Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> diff bc220fe7 Thu Nov 30 01:38:48 MST 2023 Bagas Sanjaya <bagasdotme@gmail.com> MAINTAINERS: drop Antti Palosaari He is currently inactive (last message from him is two years ago [1]). His media tree [2] is also dormant (latest activity is 6 years ago), yet his site is still online [3]. Drop him from MAINTAINERS and add CREDITS entry for him. We thank him for maintaining various DVB drivers. [1]: https://lore.kernel.org/all/660772b3-0597-02db-ed94-c6a9be04e8e8@iki.fi/ [2]: https://git.linuxtv.org/anttip/media_tree.git/ [3]: https://palosaari.fi/linux/ Link: https://lkml.kernel.org/r/20231130083848.5396-1-bagasdotme@gmail.com Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: Antti Palosaari <crope@iki.fi> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> diff 33fcc0e3 Mon Feb 20 10:02:10 MST 2023 Lukas Bulwahn <lukas.bulwahn@gmail.com> qnx4: credit contributors in CREDITS Replace the content of the qnx4 README file with the canonical place for such information. Add the credits of the qnx4 contribution to CREDITS. As there is already a QNX4 FILESYSTEM section in MAINTAINERS, it is clear who to contact and send patches to. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-By: Anders Larsen <al@alarsen.net> Link: https://lore.kernel.org/r/20230220170210.15677-3-lukas.bulwahn@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net> diff 3fe899e4 Mon Feb 20 10:02:09 MST 2023 Lukas Bulwahn <lukas.bulwahn@gmail.com> qnx6: credit contributor and mark filesystem orphan Replace the content of the qnx6 README file with the canonical places for such information. Add the credits of the qnx6 contribution to CREDITS, and add an section in MAINTAINERS to mark this filesystem as Orphan, as the domain ontika.net and email address does not resolve to an IP address anymore. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20230220170210.15677-2-lukas.bulwahn@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net> diff 57b24f8c Wed Feb 01 11:20:11 MST 2023 Jakub Kicinski <kuba@kernel.org> MAINTAINERS: bonding: move Veaceslav Falico to CREDITS Veaceslav has stepped away from netdev: Subsystem BONDING DRIVER Changes 96 / 319 (30%) Last activity: 2022-12-01 Jay Vosburgh <j.vosburgh@gmail.com>: Author 4f5d33f4f798 2022-08-11 00:00:00 3 Tags e5214f363dab 2022-12-01 00:00:00 48 Veaceslav Falico <vfalico@gmail.com>: Andy Gospodarek <andy@greyhouse.net>: Tags 47f706262f1d 2019-02-24 00:00:00 4 Top reviewers: [42]: jay.vosburgh@canonical.com [18]: jiri@nvidia.com [10]: jtoppins@redhat.com INACTIVE MAINTAINER Veaceslav Falico <vfalico@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 62c46d55 Thu Dec 02 10:11:25 MST 2021 Mathieu Poirier <mathieu.poirier@linaro.org> MAINTAINERS: Removing Ohad from remoteproc/rpmsg maintenance Ohad has not reviewed patches in the remoteproc and rpmsg subsystems for several years now: $ git log --no-merges --format=email drivers/remoteproc/ drivers/rpmsg/ | \ grep -Pi "^Subject:|^Date:|^[\w\-]+-by:.*ohad*" | grep -B2 ohad Date: Wed, 16 Sep 2015 07:32:54 -0500 Subject: [PATCH] remoteproc/wkup_m3: Use MODULE_DEVICE_TABLE to export alias Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Date: Fri, 28 Aug 2015 18:08:19 -0700 Subject: [PATCH] remoteproc: report error if resource table doesn't exist Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> -- Date: Wed, 16 Sep 2015 19:29:18 -0500 Subject: [PATCH] remoteproc: fix memory leak of remoteproc ida cache layers Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Date: Fri, 20 Nov 2015 18:26:07 +0100 Subject: [PATCH] remoteproc: avoid stack overflow in debugfs file Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Date: Thu, 18 Jun 2015 11:44:41 +0300 Subject: [PATCH] remoteproc: fix !CONFIG_OF build breakage Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Date: Fri, 22 May 2015 15:45:30 -0500 Subject: [PATCH] remoteproc/wkup_m3: add a remoteproc driver for TI Wakeup M3 Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> As such move his names to the CREDITS file. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20211202171125.903608-1-mathieu.poirier@linaro.org Acked-by: Ohad Ben Cohen <ohad@wizery.com> |
/linux-master/drivers/clocksource/ | ||
H A D | dw_apb_timer.c | diff cc2550b4 Thu Feb 27 03:59:02 MST 2020 afzal mohammed <afzal.mohd.ma@gmail.com> clocksource: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). The early boot setup_irq() invocations happen either via 'init_IRQ()' or 'time_init()', while memory allocators are ready by 'mm_init()'. Per tglx[1], setup_irq() existed in olden days when allocators were not ready by the time early interrupts were initialized. Hence replace setup_irq() by request_irq(). Seldom remove_irq() usage has been observed coupled with setup_irq(), wherever that has been found, it too has been replaced by free_irq(). A build error that was reported by kbuild test robot <lkp@intel.com> in the previous version of the patch also has been fixed. [1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/91961c77c1cf93d41523f5e1ac52043f32f97077.1582799709.git.afzal.mohd.ma@gmail.com diff d2912cb1 Tue Jun 04 02:11:33 MDT 2019 Thomas Gleixner <tglx@linutronix.de> treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff ac9ce6d1 Thu Mar 09 02:47:10 MST 2017 Rafał Miłecki <rafal@milecki.pl> clocksource: Add missing line break to error messages Printing with pr_* functions requires adding line break manually. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 45735326 Fri Feb 26 02:45:57 MST 2016 Jisheng Zhang <jszhang@marvell.com> clockevents/drivers/dw_apb_timer: Implement ->set_state_oneshot_stopped() The dw_apb_timer only "supports PERIODIC mode and their drivers emulate ONESHOT over that" as described in commit 8fff52fd5093 ("clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state"). Inspired by Viresh, I think the dw_apb_timer also needs to implement the set_state_oneshot_stopped() which is called by the clkevt core, when the next event is required at an expiry time of 'KTIME_MAX'. This normally happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes. This patch makes the clockevent device to stop on such an event, to avoid spurious interrupts, as explained by the above commit. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> |
H A D | nomadik-mtu.c | diff cc2550b4 Thu Feb 27 03:59:02 MST 2020 afzal mohammed <afzal.mohd.ma@gmail.com> clocksource: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). The early boot setup_irq() invocations happen either via 'init_IRQ()' or 'time_init()', while memory allocators are ready by 'mm_init()'. Per tglx[1], setup_irq() existed in olden days when allocators were not ready by the time early interrupts were initialized. Hence replace setup_irq() by request_irq(). Seldom remove_irq() usage has been observed coupled with setup_irq(), wherever that has been found, it too has been replaced by free_irq(). A build error that was reported by kbuild test robot <lkp@intel.com> in the previous version of the patch also has been fixed. [1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/91961c77c1cf93d41523f5e1ac52043f32f97077.1582799709.git.afzal.mohd.ma@gmail.com diff d2912cb1 Tue Jun 04 02:11:33 MDT 2019 Thomas Gleixner <tglx@linutronix.de> treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff ac9ce6d1 Thu Mar 09 02:47:10 MST 2017 Rafał Miłecki <rafal@milecki.pl> clocksource: Add missing line break to error messages Printing with pr_* functions requires adding line break manually. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 74adcbff Sat Mar 02 03:10:12 MST 2013 Daniel Lezcano <daniel.lezcano@linaro.org> ARM: nomadik: add dynamic irq flag to the timer Add the dynamic irq affinity feature to the timer clock device. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Rickard Andersson <rickard.andersson@stericsson.com> diff 38ff87f7 Sun Jun 02 00:39:40 MDT 2013 Stephen Boyd <sboyd@codeaurora.org> sched_clock: Make ARM's sched_clock generic for all architectures Nothing about the sched_clock implementation in the ARM port is specific to the architecture. Generalize the code so that other architectures can use it by selecting GENERIC_SCHED_CLOCK. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> [jstultz: Merge minor collisions with other patches in my tree] Signed-off-by: John Stultz <john.stultz@linaro.org> |
H A D | samsung_pwm_timer.c | diff cc2550b4 Thu Feb 27 03:59:02 MST 2020 afzal mohammed <afzal.mohd.ma@gmail.com> clocksource: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). The early boot setup_irq() invocations happen either via 'init_IRQ()' or 'time_init()', while memory allocators are ready by 'mm_init()'. Per tglx[1], setup_irq() existed in olden days when allocators were not ready by the time early interrupts were initialized. Hence replace setup_irq() by request_irq(). Seldom remove_irq() usage has been observed coupled with setup_irq(), wherever that has been found, it too has been replaced by free_irq(). A build error that was reported by kbuild test robot <lkp@intel.com> in the previous version of the patch also has been fixed. [1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/91961c77c1cf93d41523f5e1ac52043f32f97077.1582799709.git.afzal.mohd.ma@gmail.com diff d2912cb1 Tue Jun 04 02:11:33 MDT 2019 Thomas Gleixner <tglx@linutronix.de> treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff ac9ce6d1 Thu Mar 09 02:47:10 MST 2017 Rafał Miłecki <rafal@milecki.pl> clocksource: Add missing line break to error messages Printing with pr_* functions requires adding line break manually. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff b8725dab Tue Oct 20 02:02:35 MDT 2015 Jisheng Zhang <jszhang@marvell.com> clocksource/drivers/samsung_pwm_timer: Prevent ftrace recursion Currently samsung_pwm_timer can be used as a scheduler clock. We properly marked samsung_read_sched_clock() as notrace but we then call another function samsung_clocksource_read() that _wasn't_ notrace. Having a traceable function in the sched_clock() path leads to a recursion within ftrace and a kernel crash. Fix this by adding notrace attribute to the samsung_clocksource_read() function. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff b8725dab Tue Oct 20 02:02:35 MDT 2015 Jisheng Zhang <jszhang@marvell.com> clocksource/drivers/samsung_pwm_timer: Prevent ftrace recursion Currently samsung_pwm_timer can be used as a scheduler clock. We properly marked samsung_read_sched_clock() as notrace but we then call another function samsung_clocksource_read() that _wasn't_ notrace. Having a traceable function in the sched_clock() path leads to a recursion within ftrace and a kernel crash. Fix this by adding notrace attribute to the samsung_clocksource_read() function. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38ff87f7 Sun Jun 02 00:39:40 MDT 2013 Stephen Boyd <sboyd@codeaurora.org> sched_clock: Make ARM's sched_clock generic for all architectures Nothing about the sched_clock implementation in the ARM port is specific to the architecture. Generalize the code so that other architectures can use it by selecting GENERIC_SCHED_CLOCK. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> [jstultz: Merge minor collisions with other patches in my tree] Signed-off-by: John Stultz <john.stultz@linaro.org> |
H A D | sh_cmt.c | diff c3daa475 Mon Jan 23 15:02:21 MST 2023 Uwe Kleine-König <u.kleine-koenig@pengutronix.de> clocksource/drivers/sh_cmt: Mark driver as non-removable The comment in the remove callback suggests that the driver is not supposed to be unbound. However returning an error code in the remove callback doesn't accomplish that. Instead set the suppress_bind_attrs property (which makes it impossible to unbind the driver via sysfs). The only remaining way to unbind a sh_cmt device would be module unloading, but that doesn't apply here, as the driver cannot be built as a module. Also drop the useless remove callback. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230123220221.48164-1-u.kleine-koenig@pengutronix.de Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 68c70aae Tue Mar 09 02:44:48 MST 2021 Wolfram Sang <wsa+renesas@sang-engineering.com> clocksource/drivers/sh_cmt: Don't use CMTOUT_IE with R-Car Gen2/3 CMTOUT_IE is only supported for older SoCs. Newer SoCs shall not set this bit. So, add a version check. Reported-by: Phong Hoang <phong.hoang.wz@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210309094448.31823-1-wsa+renesas@sang-engineering.com diff ad7794d4 Thu Jun 18 02:02:12 MDT 2020 Geert Uytterhoeven <geert+renesas@glider.be> clocksource/drivers/sh_cmt: Use "kHz" for kilohertz "K" stands for "kelvin". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200618080212.16560-1-geert+renesas@glider.be diff ad7794d4 Thu Jun 18 02:02:12 MDT 2020 Geert Uytterhoeven <geert+renesas@glider.be> clocksource/drivers/sh_cmt: Use "kHz" for kilohertz "K" stands for "kelvin". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200618080212.16560-1-geert+renesas@glider.be diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38409d72 Mon Aug 02 03:24:05 MDT 2010 Magnus Damm <damm@opensource.se> clocksource: sh_cmt: Rate calculation fix Fix the rate calculation in the CMT driver. Without this fix the clocksource runs way too fast and we get a divide-by-zero error. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org> diff f4d7c356 Wed Jun 02 02:10:44 MDT 2010 Paul Mundt <lethal@linux-sh.org> clocksource: sh_cmt: compute mult and shift before registration Based on the sh_tmu change in 66f49121ffa41a19c59965b31b046d8368fec3c7 ("clocksource: sh_tmu: compute mult and shift before registration"). The same issues impact the sh_cmt driver, so we take the same approach here. Cc: stable@kernel.org Signed-off-by: Paul Mundt <lethal@linux-sh.org> diff f4d7c356 Wed Jun 02 02:10:44 MDT 2010 Paul Mundt <lethal@linux-sh.org> clocksource: sh_cmt: compute mult and shift before registration Based on the sh_tmu change in 66f49121ffa41a19c59965b31b046d8368fec3c7 ("clocksource: sh_tmu: compute mult and shift before registration"). The same issues impact the sh_cmt driver, so we take the same approach here. Cc: stable@kernel.org Signed-off-by: Paul Mundt <lethal@linux-sh.org> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> |
H A D | sh_mtu2.c | diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 87d4bb9f Mon Dec 02 23:51:06 MST 2013 Jingoo Han <jg1.han@samsung.com> clocksource: sh_mtu2: Remove unnecessary platform_set_drvdata() The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 46a12f74 Sun May 03 02:57:17 MDT 2009 Paul Mundt <lethal@linux-sh.org> sh: Consolidate MTU2/CMT/TMU timer platform data. All of the SH timers use a roughly identical structure for platform data, which presently is broken out for each block. Consolidate all of these definitions, as there is no reason for them to be broken out in the first place. Signed-off-by: Paul Mundt <lethal@linux-sh.org> d5ed4c2e Thu Apr 30 01:02:49 MDT 2009 Magnus Damm <damm@igel.co.jp> clocksource: SuperH MTU2 Timer driver This patch adds a MTU2 driver for the SuperH architecture. The MTU2 driver is a platform driver with early platform support to allow using a MTU2 channel as only clockevent during system bootup. Clocksource on sh2a is currently unsupported due to code generation issues with 64-bit math, so at this point only periodic clockevent support is in place. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org> |
H A D | sh_tmu.c | diff f2a54738 Tue Dec 16 02:48:54 MST 2014 Magnus Damm <damm+renesas@opensource.se> clocksource: sh_tmu: Set cpu_possible_mask to fix SMP broadcast Update the TMU driver to use cpu_possible_mask as cpumask to make r8a7779 SMP work as expected with or without the ARM TWD timer. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 38c30a84 Mon Dec 09 02:12:10 MST 2013 Michael Opdenacker <michael.opdenacker@free-electrons.com> clocksource: misc drivers: Remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag It's a NOOP since 2.6.35 and it will be removed one day. [dlezcano] : slightly changed the changelog Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 5707f18c Mon Dec 02 23:50:09 MST 2013 Jingoo Han <jg1.han@samsung.com> clocksource: sh_tmu: Remove unnecessary platform_set_drvdata() The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> diff 3977407e Mon Jun 11 02:10:16 MDT 2012 Paul Mundt <lethal@linux-sh.org> clocksource: sh_tmu: Use clockevents_config_and_register(). This switches over to the now exported clockevents_config() and clockevents_config_and_register() helpers. This knocks off a long-standing TMU TODO item. Signed-off-by: Paul Mundt <lethal@linux-sh.org> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 46a12f74 Sun May 03 02:57:17 MDT 2009 Paul Mundt <lethal@linux-sh.org> sh: Consolidate MTU2/CMT/TMU timer platform data. All of the SH timers use a roughly identical structure for platform data, which presently is broken out for each block. Consolidate all of these definitions, as there is no reason for them to be broken out in the first place. Signed-off-by: Paul Mundt <lethal@linux-sh.org> |
/linux-master/drivers/net/ | ||
H A D | Space.c | diff 60903f2c Tue Jan 02 01:35:48 MST 2007 Adrian Bunk <bunk@stusta.de> [NET] drivers/net/loopback.c: convert to module_init() This patch converts drivers/net/loopback.c to using module_init(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net> diff 90833aa4 Mon Nov 13 17:02:22 MST 2006 Adrian Bunk <bunk@stusta.de> [NET]: The scheduled removal of the frame diverter. This patch contains the scheduled removal of the frame diverter. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net> diff 7525d4bf Mon Oct 02 19:08:22 MDT 2006 Jeff Garzik <jeff@garzik.org> [PATCH] hp100: fix conditional compilation mess The previous hp100 changeset attempted to kill warnings, but was only tested on !CONFIG_ISA platforms. The correct conditional compilation setup involves tested CONFIG_ISA rather than just MODULE. Fixes link on CONFIG_ISA platforms (i386) in current -git. Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 504ff16c Wed Jul 27 02:14:50 MDT 2005 Jochen Friedrich <jochen@scram.de> [PATCH] tms380tr: move to DMA API This patch makes tms380tr use the new DMA API. Now that on Alpha, this API also supports bus master DMA for ISA (platform) devices, i changed the driver to use this new API. This also works around a bug in the firmware loader: The example provided in Documentation/firmware_class no longer works, as the firmware loader now calls get_kobj_path_length() and the kernel promptly oopses, as the home-grown device doesn't have a parent. Of course, this doesn't happen with a "real" device which has its bus (or pseudo bus in the case of platform) as parent. Converted tms380tr to use new DMA API: - proteon.c, skisa.c: use platform pseudo bus to create a struct device - Space.c: delete init hooks - abyss.c, tmspci.c: pass struct device to tms380tr.c - tms380tr.c, tms380tr.h: new DMA API, use real device fo firmware loader Signed-off-by: Jochen Friedrich <jochen@scram.de> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jgarzik@pobox.com> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | loopback.c | diff de799101 Wed Mar 02 12:55:31 MST 2022 Martin KaFai Lau <kafai@fb.com> net: Add skb_clear_tstamp() to keep the mono delivery_time Right now, skb->tstamp is reset to 0 whenever the skb is forwarded. If skb->tstamp has the mono delivery_time, clearing it can hurt the performance when it finally transmits out to fq@phy-dev. The earlier patch added a skb->mono_delivery_time bit to flag the skb->tstamp carrying the mono delivery_time. This patch adds skb_clear_tstamp() helper which keeps the mono delivery_time and clears everything else. The delivery_time clearing will be postponed until the stack knows the skb will be delivered locally. It will be done in a latter patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 52bb6677 Fri Sep 14 02:00:51 MDT 2018 Li RongQing <lirongqing@baidu.com> net: move definition of pcpu_lstats to header file pcpu_lstats is defined in several files, so unify them as one and move to header file Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 2f635cee Tue Mar 27 09:02:13 MDT 2018 Kirill Tkhai <ktkhai@virtuozzo.com> net: Drop pernet_operations::async Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff f6c382fc Thu Jun 02 12:05:38 MDT 2016 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> loopback: make use of NETIF_F_GSO_SOFTWARE NETIF_F_GSO_SOFTWARE was defined to list all GSO software types, so lets make use of it in loopback code. Note that veth/vxlan/others already uses it. Within this patch series, this patch causes lo to pick up SCTP GSO feature automatically (as it's added to NETIF_F_GSO_SOFTWARE) and thus avoiding segmentation if possible. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff e65db2b7 Tue Aug 18 02:30:32 MDT 2015 Phil Sutter <phil@nwl.cc> net: loopback: convert to using IFF_NO_QUEUE Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net> diff ae33bc40 Wed Nov 05 17:00:02 MST 2008 Eric W. Biederman <ebiederm@xmission.com> net: Guaranetee the proper ordering of the loopback device. I was recently hunting a bug that occurred in network namespace cleanup. In looking at the code it became apparrent that we have and will continue to have cases where if we have anything going on in a network namespace there will be assumptions that the loopback device is present. Things like sending igmp unsubscribe messages when we bring down network devices invokes the routing code which assumes that at least the loopback driver is present. Therefore to avoid magic initcall ordering hackery that is hard to follow and hard to get right insert a call to register the loopback device directly from net_dev_init(). This guarantes that the loopback device is the first device registered and the last network device to go away. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 3b04ddde Tue Oct 09 02:40:57 MDT 2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 60903f2c Tue Jan 02 01:35:48 MST 2007 Adrian Bunk <bunk@stusta.de> [NET] drivers/net/loopback.c: convert to module_init() This patch converts drivers/net/loopback.c to using module_init(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net> diff 0fed4846 Tue Mar 28 02:56:37 MST 2006 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [PATCH] for_each_possible_cpu: loopback device. This patch replaces for_each_cpu with for_each_possible_cpu. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 0e920bfb Sat Jul 02 19:28:23 MDT 2005 Chuck Ebbert <76306.1226@compuserve.com> [PATCH] loopback: whitespace cleanup Whitespace cleanup for loopback driver. Hopefully it fixes the last few annoyances. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com> |
/linux-master/include/net/ | ||
H A D | icmp.h | diff e4422b2d Thu Mar 29 02:49:01 MDT 2012 Rami Rosen <ramirose@gmail.com> net: remove unused icmp_ioctl() definition. The patch removes unused icmp_ioctl() method definition in include/net/icmp.h. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff b60538a0 Fri Jul 18 05:04:02 MDT 2008 Pavel Emelyanov <xemul@openvz.org> mib: put icmp statistics on struct net Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 75c939bb Tue Jul 15 00:02:35 MDT 2008 Pavel Emelyanov <xemul@openvz.org> mib: add struct net to ICMP_INC_STATS Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 43589aa9 Tue Jul 15 00:02:09 MDT 2008 Pavel Emelyanov <xemul@openvz.org> icmp: drop unused MIB accounting wrappers There are ICMP_XXX_STATS that are not used in the kernel, so I remove them, not to "just patch" them later. But if there's some sense in keeping them, kick me - I will remake this set keeping them. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff a24022e1 Wed Mar 26 02:55:37 MDT 2008 Pavel Emelyanov <xemul@openvz.org> [NETNS][ICMP]: Move ICMP sysctls on struct net. Initialization is moved to icmp_sk_init, all the places, that refer to them use init_net for now. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 20380731 Mon Aug 15 23:18:02 MDT 2005 Arnaldo Carvalho de Melo <acme@mandriva.com> [NET]: Fix sparse warnings Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | ip.h | diff 878d951c Wed Oct 18 03:00:13 MDT 2023 Eric Dumazet <edumazet@google.com> inet: lock the socket in ip_sock_set_tos() Christoph Paasch reported a panic in TCP stack [1] Indeed, we should not call sk_dst_reset() without holding the socket lock, as __sk_dst_get() callers do not all rely on bare RCU. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 12bad6067 P4D 12bad6067 PUD 12bad5067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 2750 Comm: syz-executor.5 Not tainted 6.6.0-rc4-g7a5720a344e7 #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:tcp_get_metrics+0x118/0x8f0 net/ipv4/tcp_metrics.c:321 Code: c7 44 24 70 02 00 8b 03 89 44 24 48 c7 44 24 4c 00 00 00 00 66 c7 44 24 58 02 00 66 ba 02 00 b1 01 89 4c 24 04 4c 89 7c 24 10 <49> 8b 0f 48 8b 89 50 05 00 00 48 89 4c 24 30 33 81 00 02 00 00 69 RSP: 0018:ffffc90000af79b8 EFLAGS: 00010293 RAX: 000000000100007f RBX: ffff88812ae8f500 RCX: ffff88812b5f8f01 RDX: 0000000000000002 RSI: ffffffff8300f080 RDI: 0000000000000002 RBP: 0000000000000002 R08: 0000000000000003 R09: ffffffff8205eca0 R10: 0000000000000002 R11: ffff88812b5f8f00 R12: ffff88812a9e0580 R13: 0000000000000000 R14: ffff88812ae8fbd2 R15: 0000000000000000 FS: 00007f70a006b640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000012bad7003 CR4: 0000000000170ee0 Call Trace: <TASK> tcp_fastopen_cache_get+0x32/0x140 net/ipv4/tcp_metrics.c:567 tcp_fastopen_cookie_check+0x28/0x180 net/ipv4/tcp_fastopen.c:419 tcp_connect+0x9c8/0x12a0 net/ipv4/tcp_output.c:3839 tcp_v4_connect+0x645/0x6e0 net/ipv4/tcp_ipv4.c:323 __inet_stream_connect+0x120/0x590 net/ipv4/af_inet.c:676 tcp_sendmsg_fastopen+0x2d6/0x3a0 net/ipv4/tcp.c:1021 tcp_sendmsg_locked+0x1957/0x1b00 net/ipv4/tcp.c:1073 tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1336 __sock_sendmsg+0x83/0xd0 net/socket.c:730 __sys_sendto+0x20a/0x2a0 net/socket.c:2194 __do_sys_sendto net/socket.c:2206 [inline] Fixes: e08d0b3d1723 ("inet: implement lockless IP_TOS") Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231018090014.345158-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> diff 878d951c Wed Oct 18 03:00:13 MDT 2023 Eric Dumazet <edumazet@google.com> inet: lock the socket in ip_sock_set_tos() Christoph Paasch reported a panic in TCP stack [1] Indeed, we should not call sk_dst_reset() without holding the socket lock, as __sk_dst_get() callers do not all rely on bare RCU. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 12bad6067 P4D 12bad6067 PUD 12bad5067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 2750 Comm: syz-executor.5 Not tainted 6.6.0-rc4-g7a5720a344e7 #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:tcp_get_metrics+0x118/0x8f0 net/ipv4/tcp_metrics.c:321 Code: c7 44 24 70 02 00 8b 03 89 44 24 48 c7 44 24 4c 00 00 00 00 66 c7 44 24 58 02 00 66 ba 02 00 b1 01 89 4c 24 04 4c 89 7c 24 10 <49> 8b 0f 48 8b 89 50 05 00 00 48 89 4c 24 30 33 81 00 02 00 00 69 RSP: 0018:ffffc90000af79b8 EFLAGS: 00010293 RAX: 000000000100007f RBX: ffff88812ae8f500 RCX: ffff88812b5f8f01 RDX: 0000000000000002 RSI: ffffffff8300f080 RDI: 0000000000000002 RBP: 0000000000000002 R08: 0000000000000003 R09: ffffffff8205eca0 R10: 0000000000000002 R11: ffff88812b5f8f00 R12: ffff88812a9e0580 R13: 0000000000000000 R14: ffff88812ae8fbd2 R15: 0000000000000000 FS: 00007f70a006b640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000012bad7003 CR4: 0000000000170ee0 Call Trace: <TASK> tcp_fastopen_cache_get+0x32/0x140 net/ipv4/tcp_metrics.c:567 tcp_fastopen_cookie_check+0x28/0x180 net/ipv4/tcp_fastopen.c:419 tcp_connect+0x9c8/0x12a0 net/ipv4/tcp_output.c:3839 tcp_v4_connect+0x645/0x6e0 net/ipv4/tcp_ipv4.c:323 __inet_stream_connect+0x120/0x590 net/ipv4/af_inet.c:676 tcp_sendmsg_fastopen+0x2d6/0x3a0 net/ipv4/tcp.c:1021 tcp_sendmsg_locked+0x1957/0x1b00 net/ipv4/tcp.c:1073 tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1336 __sock_sendmsg+0x83/0xd0 net/socket.c:730 __sys_sendto+0x20a/0x2a0 net/socket.c:2194 __do_sys_sendto net/socket.c:2206 [inline] Fixes: e08d0b3d1723 ("inet: implement lockless IP_TOS") Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231018090014.345158-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> diff 878d951c Wed Oct 18 03:00:13 MDT 2023 Eric Dumazet <edumazet@google.com> inet: lock the socket in ip_sock_set_tos() Christoph Paasch reported a panic in TCP stack [1] Indeed, we should not call sk_dst_reset() without holding the socket lock, as __sk_dst_get() callers do not all rely on bare RCU. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 12bad6067 P4D 12bad6067 PUD 12bad5067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 2750 Comm: syz-executor.5 Not tainted 6.6.0-rc4-g7a5720a344e7 #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:tcp_get_metrics+0x118/0x8f0 net/ipv4/tcp_metrics.c:321 Code: c7 44 24 70 02 00 8b 03 89 44 24 48 c7 44 24 4c 00 00 00 00 66 c7 44 24 58 02 00 66 ba 02 00 b1 01 89 4c 24 04 4c 89 7c 24 10 <49> 8b 0f 48 8b 89 50 05 00 00 48 89 4c 24 30 33 81 00 02 00 00 69 RSP: 0018:ffffc90000af79b8 EFLAGS: 00010293 RAX: 000000000100007f RBX: ffff88812ae8f500 RCX: ffff88812b5f8f01 RDX: 0000000000000002 RSI: ffffffff8300f080 RDI: 0000000000000002 RBP: 0000000000000002 R08: 0000000000000003 R09: ffffffff8205eca0 R10: 0000000000000002 R11: ffff88812b5f8f00 R12: ffff88812a9e0580 R13: 0000000000000000 R14: ffff88812ae8fbd2 R15: 0000000000000000 FS: 00007f70a006b640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000012bad7003 CR4: 0000000000170ee0 Call Trace: <TASK> tcp_fastopen_cache_get+0x32/0x140 net/ipv4/tcp_metrics.c:567 tcp_fastopen_cookie_check+0x28/0x180 net/ipv4/tcp_fastopen.c:419 tcp_connect+0x9c8/0x12a0 net/ipv4/tcp_output.c:3839 tcp_v4_connect+0x645/0x6e0 net/ipv4/tcp_ipv4.c:323 __inet_stream_connect+0x120/0x590 net/ipv4/af_inet.c:676 tcp_sendmsg_fastopen+0x2d6/0x3a0 net/ipv4/tcp.c:1021 tcp_sendmsg_locked+0x1957/0x1b00 net/ipv4/tcp.c:1073 tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1336 __sock_sendmsg+0x83/0xd0 net/socket.c:730 __sys_sendto+0x20a/0x2a0 net/socket.c:2194 __do_sys_sendto net/socket.c:2206 [inline] Fixes: e08d0b3d1723 ("inet: implement lockless IP_TOS") Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231018090014.345158-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> diff 878d951c Wed Oct 18 03:00:13 MDT 2023 Eric Dumazet <edumazet@google.com> inet: lock the socket in ip_sock_set_tos() Christoph Paasch reported a panic in TCP stack [1] Indeed, we should not call sk_dst_reset() without holding the socket lock, as __sk_dst_get() callers do not all rely on bare RCU. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 12bad6067 P4D 12bad6067 PUD 12bad5067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 2750 Comm: syz-executor.5 Not tainted 6.6.0-rc4-g7a5720a344e7 #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:tcp_get_metrics+0x118/0x8f0 net/ipv4/tcp_metrics.c:321 Code: c7 44 24 70 02 00 8b 03 89 44 24 48 c7 44 24 4c 00 00 00 00 66 c7 44 24 58 02 00 66 ba 02 00 b1 01 89 4c 24 04 4c 89 7c 24 10 <49> 8b 0f 48 8b 89 50 05 00 00 48 89 4c 24 30 33 81 00 02 00 00 69 RSP: 0018:ffffc90000af79b8 EFLAGS: 00010293 RAX: 000000000100007f RBX: ffff88812ae8f500 RCX: ffff88812b5f8f01 RDX: 0000000000000002 RSI: ffffffff8300f080 RDI: 0000000000000002 RBP: 0000000000000002 R08: 0000000000000003 R09: ffffffff8205eca0 R10: 0000000000000002 R11: ffff88812b5f8f00 R12: ffff88812a9e0580 R13: 0000000000000000 R14: ffff88812ae8fbd2 R15: 0000000000000000 FS: 00007f70a006b640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000012bad7003 CR4: 0000000000170ee0 Call Trace: <TASK> tcp_fastopen_cache_get+0x32/0x140 net/ipv4/tcp_metrics.c:567 tcp_fastopen_cookie_check+0x28/0x180 net/ipv4/tcp_fastopen.c:419 tcp_connect+0x9c8/0x12a0 net/ipv4/tcp_output.c:3839 tcp_v4_connect+0x645/0x6e0 net/ipv4/tcp_ipv4.c:323 __inet_stream_connect+0x120/0x590 net/ipv4/af_inet.c:676 tcp_sendmsg_fastopen+0x2d6/0x3a0 net/ipv4/tcp.c:1021 tcp_sendmsg_locked+0x1957/0x1b00 net/ipv4/tcp.c:1073 tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1336 __sock_sendmsg+0x83/0xd0 net/socket.c:730 __sys_sendto+0x20a/0x2a0 net/socket.c:2194 __do_sys_sendto net/socket.c:2206 [inline] Fixes: e08d0b3d1723 ("inet: implement lockless IP_TOS") Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231018090014.345158-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> diff 6ac66cb0 Thu Aug 31 02:03:30 MDT 2023 Sriram Yagnaraman <sriram.yagnaraman@est.tech> ipv4: ignore dst hint for multipath routes Route hints when the nexthop is part of a multipath group causes packets in the same receive batch to be sent to the same nexthop irrespective of the multipath hash of the packet. So, do not extract route hint for packets whose destination is part of a multipath group. A new SKB flag IPSKB_MULTIPATH is introduced for this purpose, set the flag when route is looked up in ip_mkroute_input() and use it in ip_extract_route_hint() to check for the existence of the flag. Fixes: 02b24941619f ("ipv4: use dst hint for ipv4 list receive") Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 6ac66cb0 Thu Aug 31 02:03:30 MDT 2023 Sriram Yagnaraman <sriram.yagnaraman@est.tech> ipv4: ignore dst hint for multipath routes Route hints when the nexthop is part of a multipath group causes packets in the same receive batch to be sent to the same nexthop irrespective of the multipath hash of the packet. So, do not extract route hint for packets whose destination is part of a multipath group. A new SKB flag IPSKB_MULTIPATH is introduced for this purpose, set the flag when route is looked up in ip_mkroute_input() and use it in ip_extract_route_hint() to check for the existence of the flag. Fixes: 02b24941619f ("ipv4: use dst hint for ipv4 list receive") Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff c85be08f Mon May 22 08:38:02 MDT 2023 Guillaume Nault <gnault@redhat.com> raw: Stop using RTO_ONLINK. Use ip_sendmsg_scope() to properly initialise the scope in flowi4_init_output(), instead of overriding tos with the RTO_ONLINK flag. The objective is to eventually remove RTO_ONLINK, which will allow converting .flowi4_tos to dscp_t. The MSG_DONTROUTE and SOCK_LOCALROUTE cases were already handled by raw_sendmsg() (SOCK_LOCALROUTE was handled by the RT_CONN_FLAGS*() macros called by get_rtconn_flags()). However, opt.is_strictroute wasn't taken into account. Therefore, a side effect of this patch is to now honour opt.is_strictroute, and thus align raw_sendmsg() with ping_v4_sendmsg() and udp_sendmsg(). Since raw_sendmsg() was the only user of get_rtconn_flags(), we can now remove this function. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 91d0b78c Tue Jan 24 06:36:43 MST 2023 Jakub Sitnicki <jakub@cloudflare.com> inet: Add IP_LOCAL_PORT_RANGE socket option Users who want to share a single public IP address for outgoing connections between several hosts traditionally reach for SNAT. However, SNAT requires state keeping on the node(s) performing the NAT. A stateless alternative exists, where a single IP address used for egress can be shared between several hosts by partitioning the available ephemeral port range. In such a setup: 1. Each host gets assigned a disjoint range of ephemeral ports. 2. Applications open connections from the host-assigned port range. 3. Return traffic gets routed to the host based on both, the destination IP and the destination port. An application which wants to open an outgoing connection (connect) from a given port range today can choose between two solutions: 1. Manually pick the source port by bind()'ing to it before connect()'ing the socket. This approach has a couple of downsides: a) Search for a free port has to be implemented in the user-space. If the chosen 4-tuple happens to be busy, the application needs to retry from a different local port number. Detecting if 4-tuple is busy can be either easy (TCP) or hard (UDP). In TCP case, the application simply has to check if connect() returned an error (EADDRNOTAVAIL). That is assuming that the local port sharing was enabled (REUSEADDR) by all the sockets. # Assume desired local port range is 60_000-60_511 s = socket(AF_INET, SOCK_STREAM) s.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1) s.bind(("192.0.2.1", 60_000)) s.connect(("1.1.1.1", 53)) # Fails only if 192.0.2.1:60000 -> 1.1.1.1:53 is busy # Application must retry with another local port In case of UDP, the network stack allows binding more than one socket to the same 4-tuple, when local port sharing is enabled (REUSEADDR). Hence detecting the conflict is much harder and involves querying sock_diag and toggling the REUSEADDR flag [1]. b) For TCP, bind()-ing to a port within the ephemeral port range means that no connecting sockets, that is those which leave it to the network stack to find a free local port at connect() time, can use the this port. IOW, the bind hash bucket tb->fastreuse will be 0 or 1, and the port will be skipped during the free port search at connect() time. 2. Isolate the app in a dedicated netns and use the use the per-netns ip_local_port_range sysctl to adjust the ephemeral port range bounds. The per-netns setting affects all sockets, so this approach can be used only if: - there is just one egress IP address, or - the desired egress port range is the same for all egress IP addresses used by the application. For TCP, this approach avoids the downsides of (1). Free port search and 4-tuple conflict detection is done by the network stack: system("sysctl -w net.ipv4.ip_local_port_range='60000 60511'") s = socket(AF_INET, SOCK_STREAM) s.setsockopt(SOL_IP, IP_BIND_ADDRESS_NO_PORT, 1) s.bind(("192.0.2.1", 0)) s.connect(("1.1.1.1", 53)) # Fails if all 4-tuples 192.0.2.1:60000-60511 -> 1.1.1.1:53 are busy For UDP this approach has limited applicability. Setting the IP_BIND_ADDRESS_NO_PORT socket option does not result in local source port being shared with other connected UDP sockets. Hence relying on the network stack to find a free source port, limits the number of outgoing UDP flows from a single IP address down to the number of available ephemeral ports. To put it another way, partitioning the ephemeral port range between hosts using the existing Linux networking API is cumbersome. To address this use case, add a new socket option at the SOL_IP level, named IP_LOCAL_PORT_RANGE. The new option can be used to clamp down the ephemeral port range for each socket individually. The option can be used only to narrow down the per-netns local port range. If the per-socket range lies outside of the per-netns range, the latter takes precedence. UAPI-wise, the low and high range bounds are passed to the kernel as a pair of u16 values in host byte order packed into a u32. This avoids pointer passing. PORT_LO = 40_000 PORT_HI = 40_511 s = socket(AF_INET, SOCK_STREAM) v = struct.pack("I", PORT_HI << 16 | PORT_LO) s.setsockopt(SOL_IP, IP_LOCAL_PORT_RANGE, v) s.bind(("127.0.0.1", 0)) s.getsockname() # Local address between ("127.0.0.1", 40_000) and ("127.0.0.1", 40_511), # if there is a free port. EADDRINUSE otherwise. [1] https://github.com/cloudflare/cloudflare-blog/blob/232b432c1d57/2022-02-connectx/connectx.py#L116 Reviewed-by: Marek Majkowski <marek@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff e6175a2e Fri May 13 14:34:02 MDT 2022 Eyal Birger <eyal.birger@gmail.com> xfrm: fix "disable_policy" flag use when arriving from different devices In IPv4 setting the "disable_policy" flag on a device means no policy should be enforced for traffic originating from the device. This was implemented by seting the DST_NOPOLICY flag in the dst based on the originating device. However, dsts are cached in nexthops regardless of the originating devices, in which case, the DST_NOPOLICY flag value may be incorrect. Consider the following setup: +------------------------------+ | ROUTER | +-------------+ | +-----------------+ | | ipsec src |----|-|ipsec0 | | +-------------+ | |disable_policy=0 | +----+ | | +-----------------+ |eth1|-|----- +-------------+ | +-----------------+ +----+ | | noipsec src |----|-|eth0 | | +-------------+ | |disable_policy=1 | | | +-----------------+ | +------------------------------+ Where ROUTER has a default route towards eth1. dst entries for traffic arriving from eth0 would have DST_NOPOLICY and would be cached and therefore can be reused by traffic originating from ipsec0, skipping policy check. Fix by setting a IPSKB_NOPOLICY flag in IPCB and observing it instead of the DST in IN/FWD IPv4 policy checks. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> diff 02a1b175 Wed Sep 23 14:18:15 MDT 2020 Maciej Żenczykowski <maze@google.com> net/ipv4: always honour route mtu during forwarding Documentation/networking/ip-sysctl.txt:46 says: ip_forward_use_pmtu - BOOLEAN By default we don't trust protocol path MTUs while forwarding because they could be easily forged and can lead to unwanted fragmentation by the router. You only need to enable this if you have user-space software which tries to discover path mtus by itself and depends on the kernel honoring this information. This is normally not the case. Default: 0 (disabled) Possible values: 0 - disabled 1 - enabled Which makes it pretty clear that setting it to 1 is a potential security/safety/DoS issue, and yet it is entirely reasonable to want forwarded traffic to honour explicitly administrator configured route mtus (instead of defaulting to device mtu). Indeed, I can't think of a single reason why you wouldn't want to. Since you configured a route mtu you probably know better... It is pretty common to have a higher device mtu to allow receiving large (jumbo) frames, while having some routes via that interface (potentially including the default route to the internet) specify a lower mtu. Note that ipv6 forwarding uses device mtu unless the route is locked (in which case it will use the route mtu). This approach is not usable for IPv4 where an 'mtu lock' on a route also has the side effect of disabling TCP path mtu discovery via disabling the IPv4 DF (don't frag) bit on all outgoing frames. I'm not aware of a way to lock a route from an IPv6 RA, so that also potentially seems wrong. Signed-off-by: Maciej Żenczykowski <maze@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: Sunmeet Gill (Sunny) <sgill@quicinc.com> Cc: Vinay Paradkar <vparadka@qti.qualcomm.com> Cc: Tyler Wear <twear@quicinc.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
H A D | route.h | diff 4bd0623f Wed Aug 16 02:15:41 MDT 2023 Eric Dumazet <edumazet@google.com> inet: move inet->transparent to inet->inet_flags IP_TRANSPARENT socket option can now be set/read without locking the socket. v2: removed unused issk variable in mptcp_setsockopt_sol_ip_set_transparent() v4: rebased after commit 3f326a821b99 ("mptcp: change the mpc check helper to return a sk") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net> diff 02b24941 Wed Nov 20 05:47:37 MST 2019 Paolo Abeni <pabeni@redhat.com> ipv4: use dst hint for ipv4 list receive This is alike the previous change, with some additional ipv4 specific quirk. Even when using the route hint we still have to do perform additional per packet checks about source address validity: a new helper is added to wrap them. Hints are explicitly disabled if the destination is a local broadcast, that keeps the code simple and local broadcast are a slower path anyway. UDP flood performances vs recvmmsg() receiver: vanilla patched delta Kpps Kpps % 1683 1871 +11 In the worst case scenario - each packet has a different destination address - the performance delta is within noise range. v3 -> v4: - re-enable hints for forward v2 -> v3: - really fix build (sic) and hint usage check - use fib4_has_custom_rules() helpers (David A.) - add ip_extract_route_hint() helper (Edward C.) - use prev skb as hint instead of copying data (Willem) v1 -> v2: - fix build issue with !CONFIG_IP_MULTIPLE_TABLES Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 510c321b Wed Feb 14 04:06:02 MST 2018 Xin Long <lucien.xin@gmail.com> xfrm: reuse uncached_list to track xdsts In early time, when freeing a xdst, it would be inserted into dst_garbage.list first. Then if it's refcnt was still held somewhere, later it would be put into dst_busy_list in dst_gc_task(). When one dev was being unregistered, the dev of these dsts in dst_busy_list would be set with loopback_dev and put this dev. So that this dev's removal wouldn't get blocked, and avoid the kmsg warning: kernel:unregister_netdevice: waiting for veth0 to become \ free. Usage count = 2 However after Commit 52df157f17e5 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle"), the xdst will not be freed with dst gc, and this warning happens. To fix it, we need to find these xdsts that are still held by others when removing the dev, and free xdst's dev and set it with loopback_dev. But unfortunately after flow_cache for xfrm was deleted, no list tracks them anymore. So we need to save these xdsts somewhere to release the xdst's dev later. To make this easier, this patch is to reuse uncached_list to track xdsts, so that the dev refcnt can be released in the event NETDEV_UNREGISTER process of fib_netdev_notifier. Thanks to Florian, we could move forward this fix quickly. Fixes: 52df157f17e5 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle") Reported-by: Jianlin Shi <jishi@redhat.com> Reported-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> diff 79a13159 Wed Sep 30 02:12:22 MDT 2015 Peter Nørlund <pch@ordbogen.com> ipv4: ICMP packet inspection for multipath ICMP packets are inspected to let them route together with the flow they belong to, minimizing the chance that a problematic path will affect flows on other paths, and so that anycast environments can work with ECMP. Signed-off-by: Peter Nørlund <pch@ordbogen.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff b7503e0c Wed Sep 02 14:58:35 MDT 2015 David Ahern <dsa@cumulusnetworks.com> net: Add FIB table id to rtable Add the FIB table id to rtable to make the information available for IPv4 as it is for IPv6. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 613d09b3 Thu Aug 13 14:59:02 MDT 2015 David Ahern <dsa@cumulusnetworks.com> net: Use VRF device index for lookups on TX As with ingress use the index of VRF master device for route lookups on egress. However, the oif should only be used to direct the lookups to a specific table. Routes in the table are not based on the VRF device but rather interfaces that are part of the VRF so do not consider the oif for lookups within the table. The FLOWI_FLAG_VRFSRC is used to control this latter part. Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 571e7226 Tue Jul 21 02:43:47 MDT 2015 Roopa Prabhu <roopa@cumulusnetworks.com> ipv4: support for fib route lwtunnel encap attributes This patch adds support in ipv4 fib functions to parse user provided encap attributes and attach encap state data to fib_nh and rtable. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff f87c10a8 Thu Jan 09 02:01:15 MST 2014 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing While forwarding we should not use the protocol path mtu to calculate the mtu for a forwarded packet but instead use the interface mtu. We mark forwarded skbs in ip_forward with IPSKB_FORWARDED, which was introduced for multicast forwarding. But as it does not conflict with our usage in unicast code path it is perfect for reuse. I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu along with the new ip_dst_mtu_maybe_forward to net/ip.h to fix circular dependencies because of IPSKB_FORWARDED. Because someone might have written a software which does probe destinations manually and expects the kernel to honour those path mtus I introduced a new per-namespace "ip_forward_use_pmtu" knob so someone can disable this new behaviour. We also still use mtus which are locked on a route for forwarding. The reason for this change is, that path mtus information can be injected into the kernel via e.g. icmp_err protocol handler without verification of local sockets. As such, this could cause the IPv4 forwarding path to wrongfully emit fragmentation needed notifications or start to fragment packets along a path. Tunnel and ipsec output paths clear IPCB again, thus IPSKB_FORWARDED won't be set and further fragmentation logic will use the path mtu to determine the fragmentation size. They also recheck packet size with help of path mtu discovery and report appropriate errors. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: John Heffner <johnwheffner@gmail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff d6c0a4f6 Sat Jun 30 20:02:59 MDT 2012 David Miller <davem@davemloft.net> ipv4: Kill 'rt_src' from 'struct rtable' Signed-off-by: David S. Miller <davem@davemloft.net> diff 1a00fee4 Sat Jun 30 20:02:56 MDT 2012 David Miller <davem@davemloft.net> ipv4: Remove rt_key_{src,dst,tos} from struct rtable. They are always used in contexts where they can be reconstituted, or where the finally resolved rt->rt_{src,dst} is semantically equivalent. Signed-off-by: David S. Miller <davem@davemloft.net> |
H A D | sock.h | diff d986f521 Tue Sep 12 10:02:00 MDT 2023 Eric Dumazet <edumazet@google.com> ipv6: lockless IPV6_MULTICAST_LOOP implementation Add inet6_{test|set|clear|assign}_bit() helpers. Note that I am using bits from inet->inet_flags, this might change in the future if we need more flags. While solving data-races accessing np->mc_loop, this patch also allows to implement lockless accesses to np->mcast_hops in the following patch. Also constify sk_mc_loop() argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 4faeee0c Fri May 26 10:34:58 MDT 2023 Eric Dumazet <edumazet@google.com> tcp: deny tcp_disconnect() when threads are waiting Historically connect(AF_UNSPEC) has been abused by syzkaller and other fuzzers to trigger various bugs. A recent one triggers a divide-by-zero [1], and Paolo Abeni was able to diagnose the issue. tcp_recvmsg_locked() has tests about sk_state being not TCP_LISTEN and TCP REPAIR mode being not used. Then later if socket lock is released in sk_wait_data(), another thread can call connect(AF_UNSPEC), then make this socket a TCP listener. When recvmsg() is resumed, it can eventually call tcp_cleanup_rbuf() and attempt a divide by 0 in tcp_rcv_space_adjust() [1] This patch adds a new socket field, counting number of threads blocked in sk_wait_event() and inet_wait_for_connect(). If this counter is not zero, tcp_disconnect() returns an error. This patch adds code in blocking socket system calls, thus should not hurt performance of non blocking ones. Note that we probably could revert commit 499350a5a6e7 ("tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0") to restore original tcpi_rcv_mss meaning (was 0 if no payload was ever received on a socket) [1] divide error: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 13832 Comm: syz-executor.5 Not tainted 6.3.0-rc4-syzkaller-00224-g00c7b5f4ddc5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023 RIP: 0010:tcp_rcv_space_adjust+0x36e/0x9d0 net/ipv4/tcp_input.c:740 Code: 00 00 00 00 fc ff df 4c 89 64 24 48 8b 44 24 04 44 89 f9 41 81 c7 80 03 00 00 c1 e1 04 44 29 f0 48 63 c9 48 01 e9 48 0f af c1 <49> f7 f6 48 8d 04 41 48 89 44 24 40 48 8b 44 24 30 48 c1 e8 03 48 RSP: 0018:ffffc900033af660 EFLAGS: 00010206 RAX: 4a66b76cbade2c48 RBX: ffff888076640cc0 RCX: 00000000c334e4ac RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000001 RBP: 00000000c324e86c R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880766417f8 R13: ffff888028fbb980 R14: 0000000000000000 R15: 0000000000010344 FS: 00007f5bffbfe700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32f25000 CR3: 000000007ced0000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_recvmsg_locked+0x100e/0x22e0 net/ipv4/tcp.c:2616 tcp_recvmsg+0x117/0x620 net/ipv4/tcp.c:2681 inet6_recvmsg+0x114/0x640 net/ipv6/af_inet6.c:670 sock_recvmsg_nosec net/socket.c:1017 [inline] sock_recvmsg+0xe2/0x160 net/socket.c:1038 ____sys_recvmsg+0x210/0x5a0 net/socket.c:2720 ___sys_recvmsg+0xf2/0x180 net/socket.c:2762 do_recvmmsg+0x25e/0x6e0 net/socket.c:2856 __sys_recvmmsg net/socket.c:2935 [inline] __do_sys_recvmmsg net/socket.c:2958 [inline] __se_sys_recvmmsg net/socket.c:2951 [inline] __x64_sys_recvmmsg+0x20f/0x260 net/socket.c:2951 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f5c0108c0f9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f5bffbfe168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00007f5c011ac050 RCX: 00007f5c0108c0f9 RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000003 RBP: 00007f5c010e7b39 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f5c012cfb1f R14: 00007f5bffbfe300 R15: 0000000000022000 </TASK> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Paolo Abeni <pabeni@redhat.com> Diagnosed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20230526163458.2880232-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff d6755f37 Thu Sep 22 02:38:56 MDT 2022 Gaosheng Cui <cuigaosheng1@huawei.com> net: Remove unused inline function sk_nulls_node_init() All uses of sk_nulls_node_init() have been removed since commit dbca1596bbb0 ("ping: convert to RCU lookups, get rid of rwlock"), so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 6fd1d51c Wed Apr 27 14:02:37 MDT 2022 Erin MacNeil <lnx.erin@gmail.com> net: SO_RCVMARK socket option for SO_MARK with recvmsg() Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK should be included in the ancillary data returned by recvmsg(). Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs(). Signed-off-by: Erin MacNeil <lnx.erin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff a1cdec57 Thu Feb 17 10:05:02 MST 2022 Eric Dumazet <edumazet@google.com> net-timestamp: convert sk->sk_tskey to atomic_t UDP sendmsg() can be lockless, this is causing all kinds of data races. This patch converts sk->sk_tskey to remove one of these races. BUG: KCSAN: data-race in __ip_append_data / __ip_append_data read to 0xffff8881035d4b6c of 4 bytes by task 8877 on cpu 1: __ip_append_data+0x1c1/0x1de0 net/ipv4/ip_output.c:994 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae write to 0xffff8881035d4b6c of 4 bytes by task 8880 on cpu 0: __ip_append_data+0x1d8/0x1de0 net/ipv4/ip_output.c:994 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000054d -> 0x0000054e Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 8880 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00167-gdcb85f85fa6f-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 09c2d251b707 ("net-timestamp: add key to disambiguate concurrent datagrams") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 43f51df4 Mon Nov 15 12:02:49 MST 2021 Eric Dumazet <edumazet@google.com> net: move early demux fields close to sk_refcnt sk_rx_dst/sk_rx_dst_ifindex/sk_rx_dst_cookie are read in early demux, and currently spans two cache lines. Moving them close to sk_refcnt makes more sense, as only one cache line is needed. New layout for this hot cache line is : struct sock { struct sock_common __sk_common; /* 0 0x88 */ /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */ struct dst_entry * sk_rx_dst; /* 0x88 0x8 */ int sk_rx_dst_ifindex; /* 0x90 0x4 */ u32 sk_rx_dst_cookie; /* 0x94 0x4 */ socket_lock_t sk_lock; /* 0x98 0x20 */ atomic_t sk_drops; /* 0xb8 0x4 */ int sk_rcvlowat; /* 0xbc 0x4 */ /* --- cacheline 3 boundary (192 bytes) --- */ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff f35f8219 Mon Nov 15 12:02:46 MST 2021 Eric Dumazet <edumazet@google.com> tcp: defer skb freeing after socket lock is released tcp recvmsg() (or rx zerocopy) spends a fair amount of time freeing skbs after their payload has been consumed. A typical ~64KB GRO packet has to release ~45 page references, eventually going to page allocator for each of them. Currently, this freeing is performed while socket lock is held, meaning that there is a high chance that BH handler has to queue incoming packets to tcp socket backlog. This can cause additional latencies, because the user thread has to process the backlog at release_sock() time, and while doing so, additional frames can be added by BH handler. This patch adds logic to defer these frees after socket lock is released, or directly from BH handler if possible. Being able to free these skbs from BH handler helps a lot, because this avoids the usual alloc/free assymetry, when BH handler and user thread do not run on same cpu or NUMA node. One cpu can now be fully utilized for the kernel->user copy, and another cpu is handling BH processing and skb/page allocs/frees (assuming RFS is not forcing use of a single CPU) Tested: 100Gbit NIC Max throughput for one TCP_STREAM flow, over 10 runs MTU : 1500 Before: 55 Gbit After: 66 Gbit MTU : 4096+(headers) Before: 82 Gbit After: 95 Gbit Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff d2489c7b Mon Nov 15 12:02:41 MST 2021 Eric Dumazet <edumazet@google.com> tcp: add RETPOLINE mitigation to sk_backlog_rcv Use INDIRECT_CALL_INET() to avoid an indirect call when/if CONFIG_RETPOLINE=y Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 6c302e79 Mon Nov 15 12:02:38 MST 2021 Eric Dumazet <edumazet@google.com> net: forward_alloc_get depends on CONFIG_MPTCP (struct proto)->sk_forward_alloc is currently only used by MPTCP. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 1ace2b4d Mon Nov 15 12:02:37 MST 2021 Eric Dumazet <edumazet@google.com> net: shrink struct sock by 8 bytes Move sk_bind_phc next to sk_peer_lock to fill a hole. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
H A D | tcp.h | diff 06b22ef2 Mon Oct 23 13:22:02 MDT 2023 Dmitry Safonov <0x7f454c46@gmail.com> net/tcp: Wire TCP-AO to request sockets Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for checking if TCP-AO option was used to create the request socket. tcp_ao_copy_all_matching() will copy all keys that match the peer on the request socket, as well as preparing them for the usage (creating traffic keys). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 350f6bbc Mon Jul 24 12:54:02 MDT 2023 Matthew Wilcox (Oracle) <willy@infradead.org> mm: allow per-VMA locks on file-backed VMAs Remove the TCP layering violation by allowing per-VMA locks on all VMAs. The fault path will immediately fail in handle_mm_fault(). There may be a small performance reduction from this patch as a little unnecessary work will be done on each page fault. See later patches for the improvement. Link: https://lkml.kernel.org/r/20230724185410.1124082-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Arjun Roy <arjunroy@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> diff 4bd0623f Wed Aug 16 02:15:41 MDT 2023 Eric Dumazet <edumazet@google.com> inet: move inet->transparent to inet->inet_flags IP_TRANSPARENT socket option can now be set/read without locking the socket. v2: removed unused issk variable in mptcp_setsockopt_sol_ip_set_transparent() v4: rebased after commit 3f326a821b99 ("mptcp: change the mpc check helper to return a sk") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net> diff 30c6f0bf Wed May 31 02:01:50 MDT 2023 fuyuanli <fuyuanli@didiglobal.com> tcp: fix mishandling when the sack compression is deferred. In this patch, we mainly try to handle sending a compressed ack correctly if it's deferred. Here are more details in the old logic: When sack compression is triggered in the tcp_compressed_ack_kick(), if the sock is owned by user, it will set TCP_DELACK_TIMER_DEFERRED and then defer to the release cb phrase. Later once user releases the sock, tcp_delack_timer_handler() should send a ack as expected, which, however, cannot happen due to lack of ICSK_ACK_TIMER flag. Therefore, the receiver would not sent an ack until the sender's retransmission timeout. It definitely increases unnecessary latency. Fixes: 5d9f4262b7ea ("tcp: add SACK compression") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: fuyuanli <fuyuanli@didiglobal.com> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/netdev/20230529113804.GA20300@didi-ThinkCentre-M920t-N000/ Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230531080150.GA20424@didi-ThinkCentre-M920t-N000 Signed-off-by: Paolo Abeni <pabeni@redhat.com> diff ba5a4fdd Sun Apr 24 14:35:09 MDT 2022 Eric Dumazet <edumazet@google.com> tcp: make sure treq->af_specific is initialized syzbot complained about a recent change in TCP stack, hitting a NULL pointer [1] tcp request sockets have an af_specific pointer, which was used before the blamed change only for SYNACK generation in non SYNCOOKIE mode. tcp requests sockets momentarily created when third packet coming from client in SYNCOOKIE mode were not using treq->af_specific. Make sure this field is populated, in the same way normal TCP requests sockets do in tcp_conn_request(). [1] TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies. Check SNMP counters. general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534 Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48 RSP: 0018:ffffc90000de0588 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100 RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008 RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0 R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline] tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473 dst_input include/net/dst.h:461 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413 napi_poll net/core/dev.c:6480 [inline] net_rx_action+0x8ec/0xc60 net/core/dev.c:6567 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097 Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff ba5a4fdd Sun Apr 24 14:35:09 MDT 2022 Eric Dumazet <edumazet@google.com> tcp: make sure treq->af_specific is initialized syzbot complained about a recent change in TCP stack, hitting a NULL pointer [1] tcp request sockets have an af_specific pointer, which was used before the blamed change only for SYNACK generation in non SYNCOOKIE mode. tcp requests sockets momentarily created when third packet coming from client in SYNCOOKIE mode were not using treq->af_specific. Make sure this field is populated, in the same way normal TCP requests sockets do in tcp_conn_request(). [1] TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies. Check SNMP counters. general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534 Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48 RSP: 0018:ffffc90000de0588 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100 RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008 RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0 R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline] tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473 dst_input include/net/dst.h:461 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413 napi_poll net/core/dev.c:6480 [inline] net_rx_action+0x8ec/0xc60 net/core/dev.c:6567 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097 Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff f35f8219 Mon Nov 15 12:02:46 MST 2021 Eric Dumazet <edumazet@google.com> tcp: defer skb freeing after socket lock is released tcp recvmsg() (or rx zerocopy) spends a fair amount of time freeing skbs after their payload has been consumed. A typical ~64KB GRO packet has to release ~45 page references, eventually going to page allocator for each of them. Currently, this freeing is performed while socket lock is held, meaning that there is a high chance that BH handler has to queue incoming packets to tcp socket backlog. This can cause additional latencies, because the user thread has to process the backlog at release_sock() time, and while doing so, additional frames can be added by BH handler. This patch adds logic to defer these frees after socket lock is released, or directly from BH handler if possible. Being able to free these skbs from BH handler helps a lot, because this avoids the usual alloc/free assymetry, when BH handler and user thread do not run on same cpu or NUMA node. One cpu can now be fully utilized for the kernel->user copy, and another cpu is handling BH processing and skb/page allocs/frees (assuming RFS is not forcing use of a single CPU) Tested: 100Gbit NIC Max throughput for one TCP_STREAM flow, over 10 runs MTU : 1500 Before: 55 Gbit After: 66 Gbit MTU : 4096+(headers) Before: 82 Gbit After: 95 Gbit Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 0307a0b7 Mon Nov 15 12:02:42 MST 2021 Eric Dumazet <edumazet@google.com> tcp: annotate data-races on tp->segs_in and tp->data_segs_in tcp_segs_in() can be called from BH, while socket spinlock is held but socket owned by user, eventually reading these fields from tcp_get_info() Found by code inspection, no need to backport this patch to older kernels. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 9b65b17d Tue Nov 02 20:58:44 MDT 2021 Talal Ahmad <talalahmad@google.com> net: avoid double accounting for pure zerocopy skbs Track skbs containing only zerocopy data and avoid charging them to kernel memory to correctly account the memory utilization for msg_zerocopy. All of the data in such skbs is held in user pages which are already accounted to user. Before this change, they are charged again in kernel in __zerocopy_sg_from_iter. The charging in kernel is excessive because data is not being copied into skb frags. This excessive charging can lead to kernel going into memory pressure state which impacts all sockets in the system adversely. Mark pure zerocopy skbs with a SKBFL_PURE_ZEROCOPY flag and remove charge/uncharge for data in such skbs. Initially, an skb is marked pure zerocopy when it is empty and in zerocopy path. skb can then change from a pure zerocopy skb to mixed data skb (zerocopy and copy data) if it is at tail of write queue and there is room available in it and non-zerocopy data is being sent in the next sendmsg call. At this time sk_mem_charge is done for the pure zerocopied data and the pure zerocopy flag is unmarked. We found that this happens very rarely on workloads that pass MSG_ZEROCOPY. A pure zerocopy skb can later be coalesced into normal skb if they are next to each other in queue but this patch prevents coalescing from happening. This avoids complexity of charging when skb downgrades from pure zerocopy to mixed. This is also rare. In sk_wmem_free_skb, if it is a pure zerocopy skb, an sk_mem_uncharge for SKB_TRUESIZE(skb_end_offset(skb)) is done for sk_mem_charge in tcp_skb_entail for an skb without data. Testing with the msg_zerocopy.c benchmark between two hosts(100G nics) with zerocopy showed that before this patch the 'sock' variable in memory.stat for cgroup2 that tracks sum of sk_forward_alloc, sk_rmem_alloc and sk_wmem_queued is around 1822720 and with this change it is 0. This is due to no charge to sk_forward_alloc for zerocopy data and shows memory utilization for kernel is lowered. With this commit we don't see the warning we saw in previous commit which resulted in commit 84882cf72cd774cf16fd338bdbf00f69ac9f9194. Signed-off-by: Talal Ahmad <talalahmad@google.com> Acked-by: Arjun Roy <arjunroy@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 82506665 Fri Apr 02 12:10:37 MDT 2021 Eric Dumazet <edumazet@google.com> tcp: reorder tcp_congestion_ops for better cache locality Group all the often used fields in the first cache line, to reduce cache line misses. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
H A D | udp.h | diff 9e63a99c Tue Aug 01 07:39:02 MDT 2023 Yue Haibing <yuehaibing@huawei.com> udp: Remove unused function declaration udp_bpf_get_proto() commit 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()") left behind this. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230801133902.3660-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff fd69c399 Mon Apr 08 02:15:59 MDT 2019 Paolo Abeni <pabeni@redhat.com> datagram: remove rendundant 'peeked' argument After commit a297569fe00a ("net/udp: do not touch skb->peeked unless really needed") the 'peeked' argument of __skb_try_recv_datagram() and friends is always equal to !!'flags & MSG_PEEK'. Since such argument is really a boolean info, and the callers have already 'flags & MSG_PEEK' handy, we can remove it and clean-up the code a bit. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff ade994f4 Sun Jul 02 22:01:49 MDT 2017 Al Viro <viro@zeniv.linux.org.uk> net: annotate ->poll() instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 02c22347 Wed Apr 27 17:44:30 MDT 2016 Eric Dumazet <edumazet@google.com> net: udp: rename UDP_INC_STATS_BH() Rename UDP_INC_STATS_BH() to __UDP_INC_STATS(), and UDP6_INC_STATS_BH() to __UDP6_INC_STATS() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 1b784140 Mon Mar 02 00:37:48 MST 2015 Ying Xue <ying.xue@windriver.com> net: Remove iocb argument from sendmsg and recvmsg After TIPC doesn't depend on iocb argument in its internal implementations of sendmsg() and recvmsg() hooks defined in proto structure, no any user is using iocb argument in them at all now. Then we can drop the redundant iocb argument completely from kinds of implementations of both sendmsg() and recvmsg() in the entire networking stack. Cc: Christoph Hellwig <hch@lst.de> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff d7f3f621 Fri Apr 27 02:24:08 MDT 2012 Benjamin LaHaise <bcrl@kvack.org> net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6 Now that the sematics of udpv6_queue_rcv_skb() match IPv4's udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff dfd56b8b Sat Dec 10 02:48:31 MST 2011 Eric Dumazet <eric.dumazet@gmail.com> net: use IS_ENABLED(CONFIG_IPV6) Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff d7ca4cc0 Thu Jul 09 02:09:47 MDT 2009 Sridhar Samudrala <sri@us.ibm.com> udpv4: Handle large incoming UDP/IPv4 packets and support software UFO. - validate and forward GSO UDP/IPv4 packets from untrusted sources. - do software UFO if the outgoing device doesn't support UFO. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> diff 645ca708 Wed Oct 29 02:41:45 MDT 2008 Eric Dumazet <dada1@cosmosbay.com> udp: introduce struct udp_table and multiple spinlocks UDP sockets are hashed in a 128 slots hash table. This hash table is protected by *one* rwlock. This rwlock is readlocked each time an incoming UDP message is handled. This rwlock is writelocked each time a socket must be inserted in hash table (bind time), or deleted from this table (close time) This is not scalable on SMP machines : 1) Even in read mode, lock() and unlock() are atomic operations and must dirty a contended cache line, shared by all cpus. 2) A writer might be starved if many readers are 'in flight'. This can happen on a machine with some NIC receiving many UDP messages. User process can be delayed a long time at socket creation/dismantle time. This patch prepares RCU migration, by introducing 'struct udp_table and struct udp_hslot', and using one spinlock per chain, to reduce contention on central rwlock. Introducing one spinlock per chain reduces latencies, for port randomization on heavily loaded UDP servers. This also speedup bindings to specific ports. udp_lib_unhash() was uninlined, becoming to big. Some cleanups were done to ease review of following patch (RCUification of UDP Unicast lookups) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 20380731 Mon Aug 15 23:18:02 MDT 2005 Arnaldo Carvalho de Melo <acme@mandriva.com> [NET]: Fix sparse warnings Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
/linux-master/net/802/ | ||
H A D | fddi.c | diff e3804cbe Mon May 25 02:53:53 MDT 2009 Alexander Beregalov <a.beregalov@gmail.com> net: remove COMPAT_NET_DEV_OPS All drivers are already converted to new net_device_ops API and nobody uses old API anymore. Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 3b04ddde Tue Oct 09 02:40:57 MDT 2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | hippi.c | diff e3804cbe Mon May 25 02:53:53 MDT 2009 Alexander Beregalov <a.beregalov@gmail.com> net: remove COMPAT_NET_DEV_OPS All drivers are already converted to new net_device_ops API and nobody uses old API anymore. Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 3b04ddde Tue Oct 09 02:40:57 MDT 2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 02c30a84 Thu May 05 17:16:16 MDT 2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/net/core/ | ||
H A D | dev.c | diff 6ebfad33 Thu Mar 14 08:18:16 MDT 2024 Eric Dumazet <edumazet@google.com> packet: annotate data-races around ignore_outgoing ignore_outgoing is read locklessly from dev_queue_xmit_nit() and packet_getsockopt() Add appropriate READ_ONCE()/WRITE_ONCE() annotations. syzbot reported: BUG: KCSAN: data-race in dev_queue_xmit_nit / packet_setsockopt write to 0xffff888107804542 of 1 bytes by task 22618 on cpu 0: packet_setsockopt+0xd83/0xfd0 net/packet/af_packet.c:4003 do_sock_setsockopt net/socket.c:2311 [inline] __sys_setsockopt+0x1d8/0x250 net/socket.c:2334 __do_sys_setsockopt net/socket.c:2343 [inline] __se_sys_setsockopt net/socket.c:2340 [inline] __x64_sys_setsockopt+0x66/0x80 net/socket.c:2340 do_syscall_64+0xd3/0x1d0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 read to 0xffff888107804542 of 1 bytes by task 27 on cpu 1: dev_queue_xmit_nit+0x82/0x620 net/core/dev.c:2248 xmit_one net/core/dev.c:3527 [inline] dev_hard_start_xmit+0xcc/0x3f0 net/core/dev.c:3547 __dev_queue_xmit+0xf24/0x1dd0 net/core/dev.c:4335 dev_queue_xmit include/linux/netdevice.h:3091 [inline] batadv_send_skb_packet+0x264/0x300 net/batman-adv/send.c:108 batadv_send_broadcast_skb+0x24/0x30 net/batman-adv/send.c:127 batadv_iv_ogm_send_to_if net/batman-adv/bat_iv_ogm.c:392 [inline] batadv_iv_ogm_emit net/batman-adv/bat_iv_ogm.c:420 [inline] batadv_iv_send_outstanding_bat_ogm_packet+0x3f0/0x4b0 net/batman-adv/bat_iv_ogm.c:1700 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0x465/0x990 kernel/workqueue.c:3335 worker_thread+0x526/0x730 kernel/workqueue.c:3416 kthread+0x1d1/0x210 kernel/kthread.c:388 ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 value changed: 0x00 -> 0x01 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 27 Comm: kworker/u8:1 Tainted: G W 6.8.0-syzkaller-08073-g480e035fc4c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet Fixes: fa788d986a3a ("packet: add sockopt to ignore outgoing packets") Reported-by: syzbot+c669c1136495a2e7c31f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/CANn89i+Z7MfbkBLOv=p7KZ7=K1rKHO4P1OL5LYDCtBiyqsa9oQ@mail.gmail.com/T/#t Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff f853fa5c Fri Feb 16 02:25:43 MST 2024 Lorenzo Bianconi <lorenzo@kernel.org> net: page_pool: fix recycle stats for system page_pool allocator Use global percpu page_pool_recycle_stats counter for system page_pool allocator instead of allocating a separate percpu variable for each (also percpu) page pool instance. Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/87f572425e98faea3da45f76c3c68815c01a20ee.1708075412.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff e6d5dbdd Mon Feb 12 02:50:56 MST 2024 Lorenzo Bianconi <lorenzo@kernel.org> xdp: add multi-buff support for xdp running in generic mode Similar to native xdp, do not always linearize the skb in netif_receive_generic_xdp routine but create a non-linear xdp_buff to be processed by the eBPF program. This allow to add multi-buffer support for xdp running in generic mode. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/1044d6412b1c3e95b40d34993fd5f37cd2f319fd.1707729884.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 4d2bb0bf Mon Feb 12 02:50:55 MST 2024 Lorenzo Bianconi <lorenzo@kernel.org> xdp: rely on skb pointer reference in do_xdp_generic and netif_receive_generic_xdp Rely on skb pointer reference instead of the skb pointer in do_xdp_generic and netif_receive_generic_xdp routine signatures. This is a preliminary patch to add multi-buff support for xdp running in generic mode where we will need to reallocate the skb to avoid linearization and we will need to make it visible to do_xdp_generic() caller. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c09415b1f48c8620ef4d76deed35050a7bddf7c2.1707729884.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 2b0cfa6e Mon Feb 12 02:50:54 MST 2024 Lorenzo Bianconi <lorenzo@kernel.org> net: add generic percpu page_pool allocator Introduce generic percpu page_pools allocator. Moreover add page_pool_create_percpu() and cpuid filed in page_pool struct in order to recycle the page in the page_pool "hot" cache if napi_pp_put_page() is running on the same cpu. This is a preliminary patch to add xdp multi-buff support for xdp running in generic mode. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/80bc4285228b6f4220cd03de1999d86e46e3fcbd.1707729884.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff ffabe98c Fri Feb 02 03:11:06 MST 2024 Eric Dumazet <edumazet@google.com> net: make dev_unreg_count global We can use a global dev_unreg_count counter instead of a per netns one. As a bonus we can factorize the changes done on it for bulk device removals. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff d3d344a1 Tue Jan 02 09:22:20 MST 2024 Eric Dumazet <edumazet@google.com> net-device: move xdp_prog to net_device_read_rx xdp_prog is used in receive path, both from XDP enabled drivers and from netif_elide_gro(). This patch also removes two 4-bytes holes. Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240102162220.750823-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 8e15aee6 Tue Oct 17 19:38:16 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: move altnames together with the netdevice The altname nodes are currently not moved to the new netns when netdevice itself moves: [ ~]# ip netns add test [ ~]# ip -netns test link add name eth0 type dummy [ ~]# ip -netns test link property add dev eth0 altname some-name [ ~]# ip -netns test link show dev some-name 2: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 1e:67:ed:19:3d:24 brd ff:ff:ff:ff:ff:ff altname some-name [ ~]# ip -netns test link set dev eth0 netns 1 [ ~]# ip link ... 3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff altname some-name [ ~]# ip li show dev some-name Device "some-name" does not exist. Remove them from the hash table when device is unlisted and add back when listed again. Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames") Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> diff 7663d522 Tue Oct 17 19:38:14 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: check for altname conflicts when changing netdev's netns It's currently possible to create an altname conflicting with an altname or real name of another device by creating it in another netns and moving it over: [ ~]$ ip link add dev eth0 type dummy [ ~]$ ip netns add test [ ~]$ ip -netns test link add dev ethX netns test type dummy [ ~]$ ip -netns test link property add dev ethX altname eth0 [ ~]$ ip -netns test link set dev ethX netns 1 [ ~]$ ip link ... 3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff ... 5: ethX: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 26:b7:28:78:38:0f brd ff:ff:ff:ff:ff:ff altname eth0 Create a macro for walking the altnames, this hopefully makes it clearer that the list we walk contains only altnames. Which is otherwise not entirely intuitive. Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames") Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> diff 49e47a5b Wed Aug 02 19:02:29 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: move struct netdev_rx_queue out of netdevice.h struct netdev_rx_queue is touched in only a few places and having it defined in netdevice.h brings in the dependency on xdp.h, because struct xdp_rxq_info gets embedded in struct netdev_rx_queue. In prep for removal of xdp.h from netdevice.h move all the netdev_rx_queue stuff to a new header. We could technically break the new header up to avoid the sysfs.h include but it's so rarely included it doesn't seem to be worth it at this point. Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> diff 49e47a5b Wed Aug 02 19:02:29 MDT 2023 Jakub Kicinski <kuba@kernel.org> net: move struct netdev_rx_queue out of netdevice.h struct netdev_rx_queue is touched in only a few places and having it defined in netdevice.h brings in the dependency on xdp.h, because struct xdp_rxq_info gets embedded in struct netdev_rx_queue. In prep for removal of xdp.h from netdevice.h move all the netdev_rx_queue stuff to a new header. We could technically break the new header up to avoid the sysfs.h include but it's so rarely included it doesn't seem to be worth it at this point. Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> |
H A D | sock.c | diff c2deb2e9 Thu Mar 21 02:44:10 MDT 2024 linke li <lilinke99@qq.com> net: mark racy access on sk->sk_rcvbuf sk->sk_rcvbuf in __sock_queue_rcv_skb() and __sk_receive_skb() can be changed by other threads. Mark this as benign using READ_ONCE(). This patch is aimed at reducing the number of benign races reported by KCSAN in order to focus future debugging effort on harmful races. Signed-off-by: linke li <lilinke99@qq.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff d986f521 Tue Sep 12 10:02:00 MDT 2023 Eric Dumazet <edumazet@google.com> ipv6: lockless IPV6_MULTICAST_LOOP implementation Add inet6_{test|set|clear|assign}_bit() helpers. Note that I am using bits from inet->inet_flags, this might change in the future if we need more flags. While solving data-races accessing np->mc_loop, this patch also allows to implement lockless accesses to np->mcast_hops in the following patch. Also constify sk_mc_loop() argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff b09bde5c Wed Aug 16 02:15:39 MDT 2023 Eric Dumazet <edumazet@google.com> inet: move inet->mc_loop to inet->inet_frags IP_MULTICAST_LOOP socket option can now be set/read without locking the socket. v3: fix build bot error reported in ipvs set_mcast_loop() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> diff b6f79e82 Mon Aug 07 02:12:25 MDT 2023 David Rheinsberg <david@readahead.eu> net/unix: use consistent error code in SO_PEERPIDFD Change the new (unreleased) SO_PEERPIDFD sockopt to return ENODATA rather than ESRCH if a socket type does not support remote peer-PID queries. Currently, SO_PEERPIDFD returns ESRCH when the socket in question is not an AF_UNIX socket. This is quite unexpected, given that one would assume ESRCH means the peer process already exited and thus cannot be found. However, in that case the sockopt actually returns EINVAL (via pidfd_prepare()). This is rather inconsistent with other syscalls, which usually return ESRCH if a given PID refers to a non-existant process. This changes SO_PEERPIDFD to return ENODATA instead. This is also what SO_PEERGROUPS returns, and thus keeps a consistent behavior across sockopts. Note that this code is returned in 2 cases: First, if the socket type is not AF_UNIX, and secondly if the socket was not yet connected. In both cases ENODATA seems suitable. Signed-off-by: David Rheinsberg <david@readahead.eu> Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Luca Boccassi <bluca@debian.org> Fixes: 7b26952a91cf ("net: core: add getsockopt SO_PEERPIDFD") Link: https://lore.kernel.org/r/20230807081225.816199-1-david@readahead.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff 4ff09db1 Thu Sep 01 18:28:02 MDT 2022 Martin KaFai Lau <martin.lau@kernel.org> bpf: net: Change sk_getsockopt() to take the sockptr_t argument This patch changes sk_getsockopt() to take the sockptr_t argument such that it can be used by bpf_getsockopt(SOL_SOCKET) in a latter patch. security_socket_getpeersec_stream() is not changed. It stays with the __user ptr (optval.user and optlen.user) to avoid changes to other security hooks. bpf_getsockopt(SOL_SOCKET) also does not support SO_PEERSEC. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002802.2888419-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> diff 6fd1d51c Wed Apr 27 14:02:37 MDT 2022 Erin MacNeil <lnx.erin@gmail.com> net: SO_RCVMARK socket option for SO_MARK with recvmsg() Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK should be included in the ancillary data returned by recvmsg(). Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs(). Signed-off-by: Erin MacNeil <lnx.erin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff a1cdec57 Thu Feb 17 10:05:02 MST 2022 Eric Dumazet <edumazet@google.com> net-timestamp: convert sk->sk_tskey to atomic_t UDP sendmsg() can be lockless, this is causing all kinds of data races. This patch converts sk->sk_tskey to remove one of these races. BUG: KCSAN: data-race in __ip_append_data / __ip_append_data read to 0xffff8881035d4b6c of 4 bytes by task 8877 on cpu 1: __ip_append_data+0x1c1/0x1de0 net/ipv4/ip_output.c:994 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae write to 0xffff8881035d4b6c of 4 bytes by task 8880 on cpu 0: __ip_append_data+0x1d8/0x1de0 net/ipv4/ip_output.c:994 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000054d -> 0x0000054e Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 8880 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00167-gdcb85f85fa6f-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 09c2d251b707 ("net-timestamp: add key to disambiguate concurrent datagrams") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff 79074a72 Mon Jan 17 02:27:33 MST 2022 Gal Pressman <gal@nvidia.com> net: Flush deferred skb free on socket destroy The cited Fixes patch moved to a deferred skb approach where the skbs are not freed immediately under the socket lock. Add a WARN_ON_ONCE() to verify the deferred list is empty on socket destroy, and empty it to prevent potential memory leaks. Fixes: f35f821935d8 ("tcp: defer skb freeing after socket lock is released") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff a1b519b7 Tue Nov 23 01:37:02 MST 2021 Maciej Żenczykowski <maze@google.com> net: allow CAP_NET_RAW to setsockopt SO_PRIORITY CAP_NET_ADMIN is and should continue to be about configuring the system as a whole, not about configuring per-socket or per-packet parameters. Sending and receiving raw packets is what CAP_NET_RAW is all about. It can already send packets with any VLAN tag, and any IPv4 TOS mark, and any IPv6 TCLASS mark, simply by virtue of building such a raw packet. Not to mention using any protocol and source/ /destination ip address/port tuple. These are the fields that networking gear uses to prioritize packets. Hence, a CAP_NET_RAW process is already capable of affecting traffic prioritization after it hits the wire. This change makes it capable of affecting traffic prioritization even in the host at the nic and before that in the queueing disciplines (provided skb->priority is actually being used for prioritization, and not the TOS/TCLASS field) Hence it makes sense to allow a CAP_NET_RAW process to set the priority of sockets and thus packets it sends. Signed-off-by: Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20211123203702.193221-1-zenczykowski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> diff d2489c7b Mon Nov 15 12:02:41 MST 2021 Eric Dumazet <edumazet@google.com> tcp: add RETPOLINE mitigation to sk_backlog_rcv Use INDIRECT_CALL_INET() to avoid an indirect call when/if CONFIG_RETPOLINE=y Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
Completed in 2320 milliseconds