#
65acf6e0 |
|
08-Apr-2024 |
Eric Dumazet <edumazet@google.com> |
netfilter: complete validation of user input In my recent commit, I missed that do_replace() handlers use copy_from_sockptr() (which I fixed), followed by unsafe copy_from_sockptr_offset() calls. In all functions, we can perform the @optlen validation before even calling xt_alloc_table_info() with the following check: if ((u64)optlen < (u64)tmp.size + sizeof(tmp)) return -EINVAL; Fixes: 0c83842df40f ("netfilter: validate user input for expected length") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://lore.kernel.org/r/20240409120741.3538135-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
0c83842d |
|
03-Apr-2024 |
Eric Dumazet <edumazet@google.com> |
netfilter: validate user input for expected length I got multiple syzbot reports showing old bugs exposed by BPF after commit 20f2505fb436 ("bpf: Try to avoid kzalloc in cgroup/{s,g}etsockopt") setsockopt() @optlen argument should be taken into account before copying data. BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline] BUG: KASAN: slab-out-of-bounds in do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline] BUG: KASAN: slab-out-of-bounds in do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627 Read of size 96 at addr ffff88802cd73da0 by task syz-executor.4/7238 CPU: 1 PID: 7238 Comm: syz-executor.4 Not tainted 6.9.0-rc2-next-20240403-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189 __asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105 copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] copy_from_sockptr include/linux/sockptr.h:55 [inline] do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline] do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627 nf_setsockopt+0x295/0x2c0 net/netfilter/nf_sockopt.c:101 do_sock_setsockopt+0x3af/0x720 net/socket.c:2311 __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 __do_sys_setsockopt net/socket.c:2343 [inline] __se_sys_setsockopt net/socket.c:2340 [inline] __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x72/0x7a RIP: 0033:0x7fd22067dde9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fd21f9ff0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00007fd2207abf80 RCX: 00007fd22067dde9 RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007fd2206ca47a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000020000880 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fd2207abf80 R15: 00007ffd2d0170d8 </TASK> Allocated by task 7238: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:370 [inline] __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387 kasan_kmalloc include/linux/kasan.h:211 [inline] __do_kmalloc_node mm/slub.c:4069 [inline] __kmalloc_noprof+0x200/0x410 mm/slub.c:4082 kmalloc_noprof include/linux/slab.h:664 [inline] __cgroup_bpf_run_filter_setsockopt+0xd47/0x1050 kernel/bpf/cgroup.c:1869 do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293 __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 __do_sys_setsockopt net/socket.c:2343 [inline] __se_sys_setsockopt net/socket.c:2340 [inline] __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x72/0x7a The buggy address belongs to the object at ffff88802cd73da0 which belongs to the cache kmalloc-8 of size 8 The buggy address is located 0 bytes inside of allocated 1-byte region [ffff88802cd73da0, ffff88802cd73da1) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802cd73020 pfn:0x2cd73 flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff) page_type: 0xffffefff(slab) raw: 00fff80000000000 ffff888015041280 dead000000000100 dead000000000122 raw: ffff88802cd73020 000000008080007f 00000001ffffefff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5103, tgid 2119833701 (syz-executor.4), ts 5103, free_ts 70804600828 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1490 prep_new_page mm/page_alloc.c:1498 [inline] get_page_from_freelist+0x2e7e/0x2f40 mm/page_alloc.c:3454 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712 __alloc_pages_node_noprof include/linux/gfp.h:244 [inline] alloc_pages_node_noprof include/linux/gfp.h:271 [inline] alloc_slab_page+0x5f/0x120 mm/slub.c:2249 allocate_slab+0x5a/0x2e0 mm/slub.c:2412 new_slab mm/slub.c:2465 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3615 __slab_alloc+0x58/0xa0 mm/slub.c:3705 __slab_alloc_node mm/slub.c:3758 [inline] slab_alloc_node mm/slub.c:3936 [inline] __do_kmalloc_node mm/slub.c:4068 [inline] kmalloc_node_track_caller_noprof+0x286/0x450 mm/slub.c:4089 kstrdup+0x3a/0x80 mm/util.c:62 device_rename+0xb5/0x1b0 drivers/base/core.c:4558 dev_change_name+0x275/0x860 net/core/dev.c:1232 do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2864 __rtnl_newlink net/core/rtnetlink.c:3680 [inline] rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727 rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6594 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361 page last free pid 5146 tgid 5146 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1110 [inline] free_unref_page+0xd3c/0xec0 mm/page_alloc.c:2617 discard_slab mm/slub.c:2511 [inline] __put_partials+0xeb/0x130 mm/slub.c:2980 put_cpu_partial+0x17c/0x250 mm/slub.c:3055 __slab_free+0x2ea/0x3d0 mm/slub.c:4254 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3888 [inline] slab_alloc_node mm/slub.c:3948 [inline] __do_kmalloc_node mm/slub.c:4068 [inline] __kmalloc_node_noprof+0x1d7/0x450 mm/slub.c:4076 kmalloc_node_noprof include/linux/slab.h:681 [inline] kvmalloc_node_noprof+0x72/0x190 mm/util.c:634 bucket_table_alloc lib/rhashtable.c:186 [inline] rhashtable_rehash_alloc+0x9e/0x290 lib/rhashtable.c:367 rht_deferred_worker+0x4e1/0x2440 lib/rhashtable.c:427 process_one_work kernel/workqueue.c:3218 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299 worker_thread+0x86d/0xd70 kernel/workqueue.c:3380 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 Memory state around the buggy address: ffff88802cd73c80: 07 fc fc fc 05 fc fc fc 05 fc fc fc fa fc fc fc ffff88802cd73d00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc >ffff88802cd73d80: fa fc fc fc 01 fc fc fc fa fc fc fc fa fc fc fc ^ ffff88802cd73e00: fa fc fc fc fa fc fc fc 05 fc fc fc 07 fc fc fc ffff88802cd73e80: 07 fc fc fc 07 fc fc fc 07 fc fc fc 07 fc fc fc Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://lore.kernel.org/r/20240404122051.2303764-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
36ce9982 |
|
16-Mar-2023 |
Florian Westphal <fw@strlen.de> |
xtables: move icmp/icmpv6 logic to xt_tcpudp icmp/icmp6 matches are baked into ip(6)_tables.ko. This means that even if iptables-nft is used, a rule like "-p icmp --icmp-type 1" will load the ip(6)tables modules. Move them to xt_tcpdudp.ko instead to avoid this. This will also allow to eventually add kconfig knobs to build kernels that support iptables-nft but not iptables-legacy (old set/getsockopt interface). Signed-off-by: Florian Westphal <fw@strlen.de>
|
#
0af8c09c |
|
12-Feb-2023 |
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> |
netfilter: x_tables: fix percpu counter block leak on error path when creating new netns Here is the stack where we allocate percpu counter block: +-< __alloc_percpu +-< xt_percpu_counter_alloc +-< find_check_entry # {arp,ip,ip6}_tables.c +-< translate_table And it can be leaked on this code path: +-> ip6t_register_table +-> translate_table # allocates percpu counter block +-> xt_register_table # fails there is no freeing of the counter block on xt_register_table fail. Note: xt_percpu_counter_free should be called to free it like we do in do_replace through cleanup_entry helper (or in __ip6t_unregister_table). Probability of hitting this error path is low AFAICS (xt_register_table can only return ENOMEM here, as it is not replacing anything, as we are creating new netns, and it is hard to imagine that all previous allocations succeeded and after that one in xt_register_table failed). But it's worth fixing even the rare leak. Fixes: 71ae0dff02d7 ("netfilter: xtables: use percpu rule counters") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
e58a171d |
|
17-Feb-2023 |
Florian Westphal <fw@strlen.de> |
netfilter: ebtables: fix table blob use-after-free We are not allowed to return an error at this point. Looking at the code it looks like ret is always 0 at this point, but its not. t = find_table_lock(net, repl->name, &ret, &ebt_mutex); ... this can return a valid table, with ret != 0. This bug causes update of table->private with the new blob, but then frees the blob right away in the caller. Syzbot report: BUG: KASAN: vmalloc-out-of-bounds in __ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168 Read of size 4 at addr ffffc90005425000 by task kworker/u4:4/74 Workqueue: netns cleanup_net Call Trace: kasan_report+0xbf/0x1f0 mm/kasan/report.c:517 __ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168 ebt_unregister_table+0x35/0x40 net/bridge/netfilter/ebtables.c:1372 ops_exit_list+0xb0/0x170 net/core/net_namespace.c:169 cleanup_net+0x4ee/0xb10 net/core/net_namespace.c:613 ... ip(6)tables appears to be ok (ret should be 0 at this point) but make this more obvious. Fixes: c58dd2dd443c ("netfilter: Can't fail and free after table replacement") Reported-by: syzbot+f61594de72d6705aea03@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
8844e010 |
|
11-Oct-2021 |
Florian Westphal <fw@strlen.de> |
netfilter: iptables: allow use of ipt_do_table as hookfn This is possible now that the xt_table structure is passed in via *priv. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
47a6959f |
|
25-Apr-2021 |
Florian Westphal <fw@strlen.de> |
netfilter: allow to turn off xtables compat layer The compat layer needs to parse untrusted input (the ruleset) to translate it to a 64bit compatible format. We had a number of bugs in this department in the past, so allow users to turn this feature off. Add CONFIG_NETFILTER_XTABLES_COMPAT kconfig knob and make it default to y to keep existing behaviour. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
ae689334 |
|
21-Apr-2021 |
Florian Westphal <fw@strlen.de> |
netfilter: ip_tables: pass table pointer via nf_hook_ops iptable_x modules rely on 'struct net' to contain a pointer to the table that should be evaluated. In order to remove these pointers from struct net, pass them via the 'priv' pointer in a similar fashion as nf_tables passes the rule data. To do that, duplicate the nf_hook_info array passed in from the iptable_x modules, update the ops->priv pointers of the copy to refer to the table and then change the hookfn implementations to just pass the 'priv' argument to the traverser. After this patch, the xt_table pointers can already be removed from struct net. However, changes to struct net result in re-compile of the entire network stack, so do the removal after arptables and ip6tables have been converted as well. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
20a9df33 |
|
21-Apr-2021 |
Florian Westphal <fw@strlen.de> |
netfilter: iptables: unregister the tables by name xtables stores the xt_table structs in the struct net. This isn't needed anymore, the structures could be passed via the netfilter hook 'private' pointer to the hook functions, which would allow us to remove those pointers from struct net. As a first step, reduce the number of accesses to the net->ipv4.ip6table_{raw,filter,...} pointers. This allows the tables to get unregistered by name instead of having to pass the raw address. The xt_table structure cane looked up by name+address family instead. This patch is useless as-is (the backends still have the raw pointer address), but it lowers the bar to remove those. It also allows to put the 'was table registered in the first place' check into ip_tables.c rather than have it in each table sub module. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
7716bf09 |
|
21-Apr-2021 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: remove ipt_unregister_table Its the same function as ipt_unregister_table_exit. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
b29c457a |
|
07-Apr-2021 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: fix compat match/target pad out-of-bound write xt_compat_match/target_from_user doesn't check that zeroing the area to start of next rule won't write past end of allocated ruleset blob. Remove this code and zero the entire blob beforehand. Reported-by: syzbot+cfc0247ac173f597aaaa@syzkaller.appspotmail.com Reported-by: Andy Nguyen <theflow@google.com> Fixes: 9fa492cdc160c ("[NETFILTER]: x_tables: simplify compat API") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
d3d40f23 |
|
07-Mar-2021 |
Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> |
Revert "netfilter: x_tables: Switch synchronization to RCU" This reverts commit cc00bcaa589914096edef7fb87ca5cee4a166b5c. This (and the preceding) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude. Prior to using RCU a script calling "iptables" approx. 200 times was taking 1.16s. With RCU this increased to 11.59s. Revert these patches and fix the issue in a different way. Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
abe7034b |
|
07-Mar-2021 |
Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> |
Revert "netfilter: x_tables: Update remaining dereference to RCU" This reverts commit 443d6e86f821a165fae3fc3fc13086d27ac140b1. This (and the following) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude. Revert these patches and fix the issue in a different way. Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
443d6e86 |
|
16-Dec-2020 |
Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> |
netfilter: x_tables: Update remaining dereference to RCU This fixes the dereference to fetch the RCU pointer when holding the appropriate xtables lock. Reported-by: kernel test robot <lkp@intel.com> Fixes: cc00bcaa5899 ("netfilter: x_tables: Switch synchronization to RCU") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
cc00bcaa |
|
25-Nov-2020 |
Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> |
netfilter: x_tables: Switch synchronization to RCU When running concurrent iptables rules replacement with data, the per CPU sequence count is checked after the assignment of the new information. The sequence count is used to synchronize with the packet path without the use of any explicit locking. If there are any packets in the packet path using the table information, the sequence count is incremented to an odd value and is incremented to an even after the packet process completion. The new table value assignment is followed by a write memory barrier so every CPU should see the latest value. If the packet path has started with the old table information, the sequence counter will be odd and the iptables replacement will wait till the sequence count is even prior to freeing the old table info. However, this assumes that the new table information assignment and the memory barrier is actually executed prior to the counter check in the replacement thread. If CPU decides to execute the assignment later as there is no user of the table information prior to the sequence check, the packet path in another CPU may use the old table information. The replacement thread would then free the table information under it leading to a use after free in the packet processing context- Unable to handle kernel NULL pointer dereference at virtual address 000000000000008e pc : ip6t_do_table+0x5d0/0x89c lr : ip6t_do_table+0x5b8/0x89c ip6t_do_table+0x5d0/0x89c ip6table_filter_hook+0x24/0x30 nf_hook_slow+0x84/0x120 ip6_input+0x74/0xe0 ip6_rcv_finish+0x7c/0x128 ipv6_rcv+0xac/0xe4 __netif_receive_skb+0x84/0x17c process_backlog+0x15c/0x1b8 napi_poll+0x88/0x284 net_rx_action+0xbc/0x23c __do_softirq+0x20c/0x48c This could be fixed by forcing instruction order after the new table information assignment or by switching to RCU for the synchronization. Fixes: 80055dab5de0 ("netfilter: x_tables: make xt_replace_table wait until old rules are not used anymore") Reported-by: Sean Tranchetti <stranche@codeaurora.org> Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
d3c48151 |
|
28-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
net: remove sockptr_advance sockptr_advance never properly worked. Replace it with _offset variants of copy_from_sockptr and copy_to_sockptr. Fixes: ba423fdaa589 ("net: add a new sockptr_t type") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c2f12630 |
|
23-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
netfilter: switch nf_setsockopt to sockptr_t Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ab214d1b |
|
23-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
netfilter: switch xt_copy_counters to sockptr_t Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c34bc10d |
|
17-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
netfilter: remove the compat argument to xt_copy_counters_from_user Lift the in_compat_syscall() from the callers instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
89c53c14 |
|
17-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
netfilter/ip_tables: clean up compat {get,set}sockopt handling Merge the native and compat {get,set}sockopt handlers using in_compat_syscall(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1cbf9098 |
|
22-Jun-2020 |
David Wilder <dwilder@us.ibm.com> |
netfilter: iptables: Split ipt_unregister_table() into pre_exit and exit helpers. The pre_exit will un-register the underlying hook and .exit will do the table freeing. The netns core does an unconditional synchronize_rcu after the pre_exit hooks insuring no packets are in flight that have picked up the pointer before completing the un-register. Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default") Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
6daf1414 |
|
20-Feb-2020 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
netfilter: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] Lastly, fix checkpatch.pl warning WARNING: __aligned(size) is preferred over __attribute__((aligned(size))) in net/bridge/netfilter/ebtables.c This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
d2912cb1 |
|
04-Jun-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d376bef9 |
|
04-Jul-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: set module owner for icmp(6) matches nft_compat relies on xt_request_find_match to increment refcount of the module that provides the match/target. The (builtin) icmp matches did't set the module owner so it was possible to rmmod ip(6)tables while icmp extensions were still in use. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
c568503e |
|
07-Jun-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: initialise match/target check parameter struct syzbot reports following splat: BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162 ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162 xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506 ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline] ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline] The uninitialised access is xt_mtchk_param->nft_compat ... which should be set to 0. Fix it by zeroing the struct beforehand, same for tgchk. ip(6)tables targetinfo uses c99-style initialiser, so no change needed there. Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com Fixes: 55917a21d0cc0 ("netfilter: x_tables: add context to know if extension runs from nft_compat") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
ba7d284a |
|
14-May-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: allow table definitions not backed by hook_ops The ip(6)tables nat table is currently receiving skbs from the netfilter core, after a followup patch skbs will be coming from the netfilter nat core instead, so the table is no longer backed by normal hook_ops. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
a44f6d82 |
|
05-May-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: add module alias for icmp matches The icmp matches are implemented in ip_tables and ip6_tables, respectively, so for normal iptables they are always available: those modules are loaded once iptables calls getsockopt() to fetch available module revisions. In iptables-over-nftables case probing occurs via nfnetlink, so these modules might not be loaded. Add aliases so modprobe can load these when icmp/icmp6 is requested. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
dc3c09d3 |
|
13-Apr-2018 |
Taehee Yoo <ap420073@gmail.com> |
netfilter: xtables: use ipt_get_target_c instead of ipt_get_target ipt_get_target is used to get struct xt_entry_target and ipt_get_target_c is used to get const struct xt_entry_target. However in the ipt_do_table, ipt_get_target is used to get const struct xt_entry_target. it should be replaced by ipt_get_target_c. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
e3b5e1ec |
|
30-Mar-2018 |
Florian Westphal <fw@strlen.de> |
Revert "netfilter: x_tables: ensure last rule in base chain matches underflow/policy" This reverts commit 0d7df906a0e78079a02108b06d32c3ef2238ad25. Valdis Kletnieks reported that xtables is broken in linux-next since 0d7df906a0e78 ("netfilter: x_tables: ensure last rule in base chain matches underflow/policy"), as kernel rejects the (well-formed) ruleset: [ 64.402790] ip6_tables: last base chain position 1136 doesn't match underflow 1344 (hook 1) mark_source_chains is not the correct place for such a check, as it terminates evaluation of a chain once it sees an unconditional verdict (following rules are known to be unreachable). It seems preferrable to fix libiptc instead, so remove this check again. Fixes: 0d7df906a0e78 ("netfilter: x_tables: ensure last rule in base chain matches underflow/policy") Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
2f635cee |
|
27-Mar-2018 |
Kirill Tkhai <ktkhai@virtuozzo.com> |
net: Drop pernet_operations::async Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0d7df906 |
|
27-Feb-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: ensure last rule in base chain matches underflow/policy Harmless from kernel point of view, but again iptables assumes that this is true when decoding ruleset coming from kernel. If a (syzkaller generated) ruleset doesn't have the underflow/policy stored as the last rule in the base chain, then iptables will abort() because it doesn't find the chain policy. libiptc assumes that the policy is the last rule in the basechain, which is only true for iptables-generated rulesets. Unfortunately this needs code duplication -- the functions need the struct layout of the rule head, but that is different for ip/ip6/arptables. NB: pr_warn could be pr_debug but in case this break rulesets somehow its useful to know why blob was rejected. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
9782a11e |
|
27-Feb-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: compat: prepare xt_compat_init_offsets to return errors should have no impact, function still always returns 0. This patch is only to ease review. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
c84ca954 |
|
27-Feb-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: add counters allocation wrapper allows to have size checks in a single spot. This is supposed to reduce oom situations when fuzz-testing xtables. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
1b293e30 |
|
27-Feb-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: move hook entry checks into core Allow followup patch to change on location instead of three. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
07a9da51 |
|
27-Feb-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: check standard verdicts in core Userspace must provide a valid verdict to the standard target. The verdict can be either a jump (signed int > 0), or a return code. Allowed return codes are either RETURN (pop from stack), NF_ACCEPT, DROP and QUEUE (latter is allowed for legacy reasons). Jump offsets (verdict > 0) are checked in more detail later on when loop-detection is performed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f31e5f1a |
|
16-Feb-2018 |
Xin Long <lucien.xin@gmail.com> |
netfilter: unlock xt_table earlier in __do_replace Now it's doing cleanup_entry for oldinfo under the xt_table lock, but it's not really necessary. After the replacement job is done in xt_replace_table, oldinfo is not used elsewhere any more, and it can be freed without xt_table lock safely. The important thing is that rtnl_lock is called in some xt_target destroy, which means rtnl_lock, a big lock is used in xt_table lock, a smaller one. It usually could be the reason why a dead lock may happen. Besides, all xt_target/match checkentry is called out of xt_table lock. It's better also to move all cleanup_entry calling out of xt_table lock, just as do_replace_finish does for ebtables. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
4d6b8076 |
|
19-Feb-2018 |
Kirill Tkhai <ktkhai@virtuozzo.com> |
net: Convert ip_tables_net_ops, udplite6_net_ops and xt_net_ops ip_tables_net_ops and udplite6_net_ops create and destroy /proc entries. xt_net_ops does nothing. So, we are able to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
57ebd808 |
|
07-Feb-2018 |
Florian Westphal <fw@strlen.de> |
netfilter: add back stackpointer size checks The rationale for removing the check is only correct for rulesets generated by ip(6)tables. In iptables, a jump can only occur to a user-defined chain, i.e. because we size the stack based on number of user-defined chains we cannot exceed stack size. However, the underlying binary format has no such restriction, and the validation step only ensures that the jump target is a valid rule start point. IOW, its possible to build a rule blob that has no user-defined chains but does contain a jump. If this happens, no jump stack gets allocated and crash occurs because no jumpstack was allocated. Fixes: 7814b6ec6d0d6 ("netfilter: xtables: don't save/restore jumpstack offset") Reported-by: syzbot+e783f671527912cd9403@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
e5531166 |
|
19-Jan-2018 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: remove messages print and boot/module load time Several reasons for this: * Several modules maintain internal version numbers, that they print at boot/module load time, that are not exposed to userspace, as a primitive mechanism to make revision number control from the earlier days of Netfilter. * IPset shows the protocol version at boot/module load time, instead display this via module description, as Jozsef suggested. * Remove copyright notice at boot/module load time in two spots, the Netfilter codebase is a collective development effort, if we would have to display copyrights for each contributor at boot/module load time for each extensions we have, we would probably fill up logs with lots of useless information - from a technical standpoint. So let's be consistent and remove them all. Acked-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
03d13b68 |
|
08-Dec-2017 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: add and use xt_request_find_table_lock currently we always return -ENOENT to userspace if we can't find a particular table, or if the table initialization fails. Followup patch will make nat table init fail in case nftables already registered a nat hook so this change makes xt_find_table_lock return an ERR_PTR to return the errno value reported from the table init function. Add xt_request_find_table_lock as try_then_request_module replacement and use it where needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
4be2b04e |
|
09-Oct-2017 |
Paul E. McKenney <paulmck@kernel.org> |
netfilter: Remove now-redundant smp_read_barrier_depends() READ_ONCE() now implies smp_read_barrier_depends(), which means that the instances in arpt_do_table(), ipt_do_table(), and ip6t_do_table() are now redundant. This commit removes them and adjusts the comments. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: <netfilter-devel@vger.kernel.org> Cc: <coreteam@netfilter.org> Cc: <netdev@vger.kernel.org>
|
#
07dc8bc9 |
|
07-Nov-2017 |
Colin Ian King <colin.king@canonical.com> |
netfilter: remove redundant assignment to e The assignment to variable e is redundant since the same assignment occurs just a few lines later, hence it can be removed. Cleans up clang warning for arp_tables, ip_tables and ip6_tables: warning: Value stored to 'e' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
d13e7b2e |
|
11-Oct-2017 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: don't use seqlock when fetching old counters after previous commit xt_replace_table will wait until all cpus had even seqcount (i.e., no cpu is accessing old ruleset). Add a 'old' counter retrival version that doesn't synchronize counters. Its not needed, the old counters are not in use anymore at this point. This speeds up table replacement on busy systems with large tables (and many cores). Cc: Dan Williams <dcbw@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
a5d7a714 |
|
01-Sep-2017 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: add scheduling opportunity in get_counters There are reports about spurious softlockups during iptables-restore, a backtrace i saw points at get_counters -- it uses a sequence lock and also has unbounded restart loop. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
9efdb14f |
|
30-Aug-2017 |
Varsha Rao <rvarsha016@gmail.com> |
net: Remove CONFIG_NETFILTER_DEBUG and _ASSERT() macros. This patch removes CONFIG_NETFILTER_DEBUG and _ASSERT() macros as they are no longer required. Replace _ASSERT() macros with WARN_ON(). Signed-off-by: Varsha Rao <rvarsha016@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
549d2d41 |
|
31-Jul-2017 |
Julia Lawall <julia.lawall@lip6.fr> |
netfilter: constify nf_loginfo structures The nf_loginfo structures are only passed as the seventh argument to nf_log_trace, which is declared as const or stored in a local const variable. Thus the nf_loginfo structures themselves can be const. Done with the help of Coccinelle. // <smpl> @r disable optional_qualifier@ identifier i; position p; @@ static struct nf_loginfo i@p = { ... }; @ok1@ identifier r.i; expression list[6] es; position p; @@ nf_log_trace(es,&i@p,...) @ok2@ identifier r.i; const struct nf_loginfo *e; position p; @@ e = &i@p @bad@ position p != {r.p,ok1.p,ok2.p}; identifier r.i; struct nf_loginfo e; @@ e@i@p @depends on !bad disable optional_qualifier@ identifier r.i; @@ static +const struct nf_loginfo i = { ... }; // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
5b9ccdcb |
|
29-Jul-2017 |
Taehee Yoo <ap420073@gmail.com> |
netfilter: xtables: Remove unused variable in compat_copy_entry_from_user() The target variable is not used in the compat_copy_entry_from_user(). So It can be removed. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
9beceb54 |
|
26-Jul-2017 |
Taehee Yoo <ap420073@gmail.com> |
netfilter: x_tables: Fix use-after-free in ipt_do_table. If verdict is NF_STOLEN in the SYNPROXY target, the skb is consumed. However, ipt_do_table() always tries to get ip header from the skb. So that, KASAN triggers the use-after-free message. We can reproduce this message using below command. # iptables -I INPUT -p tcp -j SYNPROXY --mss 1460 [ 193.542265] BUG: KASAN: use-after-free in ipt_do_table+0x1405/0x1c10 [ ... ] [ 193.578603] Call Trace: [ 193.581590] <IRQ> [ 193.584107] dump_stack+0x68/0xa0 [ 193.588168] print_address_description+0x78/0x290 [ 193.593828] ? ipt_do_table+0x1405/0x1c10 [ 193.598690] kasan_report+0x230/0x340 [ 193.603194] __asan_report_load2_noabort+0x19/0x20 [ 193.608950] ipt_do_table+0x1405/0x1c10 [ 193.613591] ? rcu_read_lock_held+0xae/0xd0 [ 193.618631] ? ip_route_input_rcu+0x27d7/0x4270 [ 193.624348] ? ipt_do_table+0xb68/0x1c10 [ 193.629124] ? do_add_counters+0x620/0x620 [ 193.634234] ? iptable_filter_net_init+0x60/0x60 [ ... ] After this patch, only when verdict is XT_CONTINUE, ipt_do_table() tries to get ip header. Also arpt_do_table() is modified because it has same bug. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
68ad546a |
|
28-Mar-2017 |
simran singhal <singhalsimran0@gmail.com> |
netfilter: Remove unnecessary cast on void pointer The following Coccinelle script was used to detect this: @r@ expression x; void* e; type T; identifier f; @@ ( *((T *)e) | ((T *)x)[...] | ((T*)x)->f | - (T*) e ) Unnecessary parantheses are also remove. Signed-off-by: simran singhal <singhalsimran0@gmail.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f77bc5b2 |
|
02-Jan-2017 |
Willem de Bruijn <willemb@google.com> |
iptables: use match, target and data copy_to_user helpers Convert iptables to copying entries, matches and targets one by one, using the xt_match_to_user and xt_target_to_user helper functions. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
7c0f6ba6 |
|
24-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
ae0ac0ed |
|
22-Nov-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: pack percpu counter allocations instead of allocating each xt_counter individually, allocate 4k chunks and then use these for counter allocation requests. This should speed up rule evaluation by increasing data locality, also speeds up ruleset loading because we reduce calls to the percpu allocator. As Eric points out we can't use PAGE_SIZE, page_allocator would fail on arches with 64k page size. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f28e15ba |
|
22-Nov-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: pass xt_counters struct to counter allocator Keeps some noise away from a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
4d31eef5 |
|
22-Nov-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: pass xt_counters struct instead of packet counter On SMP we overload the packet counter (unsigned long) to contain percpu offset. Hide this from callers and pass xt_counters address instead. Preparation patch to allocate the percpu counters in page-sized batch chunks. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
eb1a6bdc |
|
11-Nov-2016 |
Julia Lawall <julia.lawall@lip6.fr> |
netfilter: x_tables: simplify IS_ERR_OR_NULL to NULL test Since commit 7926dbfa4bc1 ("netfilter: don't use mutex_lock_interruptible()"), the function xt_find_table_lock can only return NULL on an error. Simplify the call sites and update the comment before the function. The semantic patch that change the code is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression t,e; @@ t = \(xt_find_table_lock(...)\| try_then_request_module(xt_find_table_lock(...),...)\) ... when != t=e - ! IS_ERR_OR_NULL(t) + t @@ expression t,e; @@ t = \(xt_find_table_lock(...)\| try_then_request_module(xt_find_table_lock(...),...)\) ... when != t=e - IS_ERR_OR_NULL(t) + !t @@ expression t,e,e1; @@ t = \(xt_find_table_lock(...)\| try_then_request_module(xt_find_table_lock(...),...)\) ... when != t=e ?- t ? PTR_ERR(t) : e1 + e1 ... when any // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
613dbd95 |
|
03-Nov-2016 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: x_tables: move hook state into xt_action_param structure Place pointer to hook state in xt_action_param structure instead of copying the fields that we need. After this change xt_action_param fits into one cacheline. This patch also adds a set of new wrapper functions to fetch relevant hook state structure fields. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
ff107d27 |
|
25-Sep-2016 |
Liping Zhang <liping.zhang@spreadtrum.com> |
netfilter: nft_log: complete NFTA_LOG_FLAGS attr support NFTA_LOG_FLAGS attribute is already supported, but the related NF_LOG_XXX flags are not exposed to the userspace. So we cannot explicitly enable log flags to log uid, tcp sequence, ip options and so on, i.e. such rule "nft add rule filter output log uid" is not supported yet. So move NF_LOG_XXX macro definitions to the uapi/../nf_log.h. In order to keep consistent with other modules, change NF_LOG_MASK to refer to all supported log flags. On the other hand, add a new NF_LOG_DEFAULT_MASK to refer to the original default log flags. Finally, if user specify the unsupported log flags or NFTA_LOG_GROUP and NFTA_LOG_FLAGS are set at the same time, report EINVAL to the userspace. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f4dc7771 |
|
14-Jul-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: speed up jump target validation The dummy ruleset I used to test the original validation change was broken, most rules were unreachable and were not tested by mark_source_chains(). In some cases rulesets that used to load in a few seconds now require several minutes. sample ruleset that shows the behaviour: echo "*filter" for i in $(seq 0 100000);do printf ":chain_%06x - [0:0]\n" $i done for i in $(seq 0 100000);do printf -- "-A INPUT -j chain_%06x\n" $i printf -- "-A INPUT -j chain_%06x\n" $i printf -- "-A INPUT -j chain_%06x\n" $i done echo COMMIT [ pipe result into iptables-restore ] This ruleset will be about 74mbyte in size, with ~500k searches though all 500k[1] rule entries. iptables-restore will take forever (gave up after 10 minutes) Instead of always searching the entire blob for a match, fill an array with the start offsets of every single ipt_entry struct, then do a binary search to check if the jump target is present or not. After this change ruleset restore times get again close to what one gets when reverting 36472341017529e (~3 seconds on my workstation). [1] every user-defined rule gets an implicit RETURN, so we get 300k jumps + 100k userchains + 100k returns -> 500k rule entries Fixes: 36472341017529e ("netfilter: x_tables: validate targets of jumps") Reported-by: Jeff Wu <wujiafu@gmail.com> Tested-by: Jeff Wu <wujiafu@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
c37a2dfa |
|
24-Jun-2016 |
Joe Perches <joe@perches.com> |
netfilter: Convert FWINV<[foo]> macros and uses to NF_INVF netfilter uses multiple FWINV #defines with identical form that hide a specific structure variable and dereference it with a invflags member. $ git grep "#define FWINV" include/linux/netfilter_bridge/ebtables.h:#define FWINV(bool,invflg) ((bool) ^ !!(info->invflags & invflg)) net/bridge/netfilter/ebtables.c:#define FWINV2(bool, invflg) ((bool) ^ !!(e->invflags & invflg)) net/ipv4/netfilter/arp_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg))) net/ipv4/netfilter/ip_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ipinfo->invflags & (invflg))) net/ipv6/netfilter/ip6_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ip6info->invflags & (invflg))) net/netfilter/xt_tcpudp.c:#define FWINVTCP(bool, invflg) ((bool) ^ !!(tcpinfo->invflags & (invflg))) Consolidate these macros into a single NF_INVF macro. Miscellanea: o Neaten the alignment around these uses o A few lines are > 80 columns for intelligibility Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
d7cdf816 |
|
03-May-2016 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: x_tables: get rid of old and inconsistent debugging The dprintf() and duprintf() functions are enabled at compile time, these days we have better runtime debugging through pr_debug() and static keys. On top of this, this debugging is so old that I don't expect anyone using this anymore, so let's get rid of this. IP_NF_ASSERT() is still left in place, although this needs that NETFILTER_DEBUG is enabled, I think these assertions provide useful context information when reading the code. Note that ARP_NF_ASSERT() has been removed as there is no user of this. Kill also DEBUG_ALLOW_ALL and a couple of pr_error() and pr_debug() spots that are inconsistently placed in the code. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
92b4423e |
|
29-Apr-2016 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: fix IS_ERR_VALUE usage This is a forward-port of the original patch from Andrzej Hajda, he said: "IS_ERR_VALUE should be used only with unsigned long type. Otherwise it can work incorrectly. To achieve this function xt_percpu_counter_alloc is modified to return unsigned long, and its result is assigned to temporary variable to perform error checking, before assigning to .pcnt field. The patch follows conclusion from discussion on LKML [1][2]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2120927 [2]: http://permalink.gmane.org/gmane.linux.kernel/2150581" Original patch from Andrzej is here: http://patchwork.ozlabs.org/patch/582970/ This patch has clashed with input validation fixes for x_tables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
d7591f0c |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: introduce and use xt_copy_counters_from_user The three variants use same copy&pasted code, condense this into a helper and use that. Make sure info.name is 0-terminated. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
aded9f3e |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: remove obsolete check Since 'netfilter: x_tables: validate targets of jumps' change we validate that the target aligns exactly with beginning of a rule, so offset test is now redundant. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
95609155 |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: remove obsolete overflow check for compat case too commit 9e67d5a739327c44885adebb4f3a538050be73e4 ("[NETFILTER]: x_tables: remove obsolete overflow check") left the compat parts alone, but we can kill it there as well. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
09d96860 |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: do compat validation via translate_table This looks like refactoring, but its also a bug fix. Problem is that the compat path (32bit iptables, 64bit kernel) lacks a few sanity tests that are done in the normal path. For example, we do not check for underflows and the base chain policies. While its possible to also add such checks to the compat path, its more copy&pastry, for instance we cannot reuse check_underflow() helper as e->target_offset differs in the compat case. Other problem is that it makes auditing for validation errors harder; two places need to be checked and kept in sync. At a high level 32 bit compat works like this: 1- initial pass over blob: validate match/entry offsets, bounds checking lookup all matches and targets do bookkeeping wrt. size delta of 32/64bit structures assign match/target.u.kernel pointer (points at kernel implementation, needed to access ->compatsize etc.) 2- allocate memory according to the total bookkeeping size to contain the translated ruleset 3- second pass over original blob: for each entry, copy the 32bit representation to the newly allocated memory. This also does any special match translations (e.g. adjust 32bit to 64bit longs, etc). 4- check if ruleset is free of loops (chase all jumps) 5-first pass over translated blob: call the checkentry function of all matches and targets. The alternative implemented by this patch is to drop steps 3&4 from the compat process, the translation is changed into an intermediate step rather than a full 1:1 translate_table replacement. In the 2nd pass (step #3), change the 64bit ruleset back to a kernel representation, i.e. put() the kernel pointer and restore ->u.user.name . This gets us a 64bit ruleset that is in the format generated by a 64bit iptables userspace -- we can then use translate_table() to get the 'native' sanity checks. This has two drawbacks: 1. we re-validate all the match and target entry structure sizes even though compat translation is supposed to never generate bogus offsets. 2. we put and then re-lookup each match and target. THe upside is that we get all sanity tests and ruleset validations provided by the normal path and can remove some duplicated compat code. iptables-restore time of autogenerated ruleset with 300k chains of form -A CHAIN0001 -m limit --limit 1/s -j CHAIN0002 -A CHAIN0002 -m limit --limit 1/s -j CHAIN0003 shows no noticeable differences in restore times: old: 0m30.796s new: 0m31.521s 64bit: 0m25.674s Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
0188346f |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: xt_compat_match_from_user doesn't need a retval Always returned 0. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
7d3f843e |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: ip_tables: simplify translate_compat_table args Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
ce683e5f |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: check for bogus target offset We're currently asserting that targetoff + targetsize <= nextoff. Extend it to also check that targetoff is >= sizeof(xt_entry). Since this is generic code, add an argument pointing to the start of the match/target, we can then derive the base structure size from the delta. We also need the e->elems pointer in a followup change to validate matches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
fc1221b3 |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: add compat version of xt_check_entry_offsets 32bit rulesets have different layout and alignment requirements, so once more integrity checks get added to xt_check_entry_offsets it will reject well-formed 32bit rulesets. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
aa412ba2 |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: kill check_entry helper Once we add more sanity testing to xt_check_entry_offsets it becomes relvant if we're expecting a 32bit 'config_compat' blob or a normal one. Since we already have a lot of similar-named functions (check_entry, compat_check_entry, find_and_check_entry, etc.) and the current incarnation is short just fold its contents into the callers. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
7d35812c |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: add and use xt_check_entry_offsets Currently arp/ip and ip6tables each implement a short helper to check that the target offset is large enough to hold one xt_entry_target struct and that t->u.target_size fits within the current rule. Unfortunately these checks are not sufficient. To avoid adding new tests to all of ip/ip6/arptables move the current checks into a helper, then extend this helper in followup patches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
36472341 |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: validate targets of jumps When we see a jump also check that the offset gets us to beginning of a rule (an ipt_entry). The extra overhead is negible, even with absurd cases. 300k custom rules, 300k jumps to 'next' user chain: [ plus one jump from INPUT to first userchain ]: Before: real 0m24.874s user 0m7.532s sys 0m16.076s After: real 0m27.464s user 0m7.436s sys 0m18.840s Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f24e230d |
|
01-Apr-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: don't move to non-existent next rule Ben Hawkes says: In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it is possible for a user-supplied ipt_entry structure to have a large next_offset field. This field is not bounds checked prior to writing a counter value at the supplied offset. Base chains enforce absolute verdict. User defined chains are supposed to end with an unconditional return, xtables userspace adds them automatically. But if such return is missing we will move to non-existent next rule. Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
b301f253 |
|
24-Mar-2016 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: x_tables: enforce nul-terminated table name from getsockopt GET_ENTRIES Make sure the table names via getsockopt GET_ENTRIES is nul-terminated in ebtables and all the x_tables variants and their respective compat code. Uncovered by KASAN. Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
54d83fc7 |
|
22-Mar-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: fix unconditional helper Ben Hawkes says: In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it is possible for a user-supplied ipt_entry structure to have a large next_offset field. This field is not bounds checked prior to writing a counter value at the supplied offset. Problem is that mark_source_chains should not have been called -- the rule doesn't have a next entry, so its supposed to return an absolute verdict of either ACCEPT or DROP. However, the function conditional() doesn't work as the name implies. It only checks that the rule is using wildcard address matching. However, an unconditional rule must also not be using any matches (no -m args). The underflow validator only checked the addresses, therefore passing the 'unconditional absolute verdict' test, while mark_source_chains also tested for presence of matches, and thus proceeeded to the next (not-existent) rule. Unify this so that all the callers have same idea of 'unconditional rule'. Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
6e94e0cf |
|
22-Mar-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: make sure e->next_offset covers remaining blob size Otherwise this function may read data beyond the ruleset blob. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
bdf533de |
|
22-Mar-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: x_tables: validate e->target_offset early We should check that e->target_offset is sane before mark_source_chains gets called since it will fetch the target entry for loop detection. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
b9e69e12 |
|
25-Feb-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: don't hook tables by default delay hook registration until the table is being requested inside a namespace. Historically, a particular table (iptables mangle, ip6tables filter, etc) was registered on module load. When netns support was added to iptables only the ip/ip6tables ruleset was made namespace aware, not the actual hook points. This means f.e. that when ipt_filter table/module is loaded on a system, then each namespace on that system has an (empty) iptables filter ruleset. In other words, if a namespace sends a packet, such skb is 'caught' by netfilter machinery and fed to hooking points for that table (i.e. INPUT, FORWARD, etc). Thanks to Eric Biederman, hooks are no longer global, but per namespace. This means that we can avoid allocation of empty ruleset in a namespace and defer hook registration until we need the functionality. We register a tables hook entry points ONLY in the initial namespace. When an iptables get/setockopt is issued inside a given namespace, we check if the table is found in the per-namespace list. If not, we attempt to find it in the initial namespace, and, if found, create an empty default table in the requesting namespace and register the needed hooks. Hook points are destroyed only once namespace is deleted, there is no 'usage count' (it makes no sense since there is no 'remove table' operation in xtables api). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
a67dd266 |
|
25-Feb-2016 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: prepare for on-demand hook register This change prepares for upcoming on-demand xtables hook registration. We change the protoypes of the register/unregister functions. A followup patch will then add nf_hook_register/unregister calls to the iptables one. Once a hook is registered packets will be picked up, so all assignments of the form net->ipv4.iptable_$table = new_table have to be moved to ip(6)t_register_table, else we can see NULL net->ipv4.iptable_$table later. This patch doesn't change functionality; without this the actual change simply gets too big. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
24cebe3f |
|
14-Oct-2015 |
Ian Morris <ipm@chirality.org.uk> |
netfilter: ipv4: code indentation Use tabs instead of spaces to indent code. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
6c28255b |
|
14-Oct-2015 |
Ian Morris <ipm@chirality.org.uk> |
netfilter: ipv4: function definition layout Use tabs instead of spaces to indent second line of parameters in function definitions. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
27951a01 |
|
14-Oct-2015 |
Ian Morris <ipm@chirality.org.uk> |
netfilter: ipv4: ternary operator layout Correct whitespace layout of ternary operators in the netfilter-ipv4 code. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
19f0a602 |
|
14-Oct-2015 |
Ian Morris <ipm@chirality.org.uk> |
netfilter: ipv4: label placement Whitespace cleansing: Labels should not be indented. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
156c196f |
|
18-Sep-2015 |
Eric W. Biederman <ebiederm@xmission.com> |
netfilter: x_tables: Pass struct net in xt_action_param As xt_action_param lives on the stack this does not bloat any persistent data structures. This is a first step in making netfilter code that needs to know which network namespace it is executing in simpler. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
6cb8ff3f1 |
|
18-Sep-2015 |
Eric W. Biederman <ebiederm@xmission.com> |
inet netfilter: Remove hook from ip6t_do_table, arp_do_table, ipt_do_table The values of ops->hooknum and state->hook are guaraneted to be equal making the hook argument to ip6t_do_table, arp_do_table, and ipt_do_table is unnecessary. Remove the unnecessary hook argument. In the callers use state->hook instead of ops->hooknum for clarity and to reduce the number of cachelines the callers touch. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
9dff2c96 |
|
15-Sep-2015 |
Eric W. Biederman <ebiederm@xmission.com> |
netfilter: Use nf_hook_state.net Instead of saying "net = dev_net(state->in?state->in:state->out)" just say "state->net". As that information is now availabe, much less confusing and much less error prone. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
98dbbfc3 |
|
26-Aug-2015 |
Florian Westphal <fw@strlen.de> |
Revert "netfilter: xtables: compute exact size needed for jumpstack" This reverts commit 98d1bd802cdbc8f56868fae51edec13e86b59515. mark_source_chains will not re-visit chains, so *filter :INPUT ACCEPT [365:25776] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [217:45832] :t1 - [0:0] :t2 - [0:0] :t3 - [0:0] :t4 - [0:0] -A t1 -i lo -j t2 -A t2 -i lo -j t3 -A t3 -i lo -j t4 # -A INPUT -j t4 # -A INPUT -j t3 # -A INPUT -j t2 -A INPUT -j t1 COMMIT Will compute a chain depth of 2 if the comments are removed. Revert back to counting the number of chains for the time being. Reported-by: Cong Wang <cwang@twopensource.com> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
6c7941de |
|
14-Jul-2015 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: remove __pure annotation sparse complains: ip_tables.c:361:27: warning: incorrect type in assignment (different modifiers) ip_tables.c:361:27: expected struct ipt_entry *[assigned] e ip_tables.c:361:27: got struct ipt_entry [pure] * doesn't change generated code. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
dcebd315 |
|
14-Jul-2015 |
Florian Westphal <fw@strlen.de> |
netfilter: add and use jump label for xt_tee Don't bother testing if we need to switch to alternate stack unless TEE target is used. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
7814b6ec |
|
14-Jul-2015 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: don't save/restore jumpstack offset In most cases there is no reentrancy into ip/ip6tables. For skbs sent by REJECT or SYNPROXY targets, there is one level of reentrancy, but its not relevant as those targets issue an absolute verdict, i.e. the jumpstack can be clobbered since its not used after the target issues absolute verdict (ACCEPT, DROP, STOLEN, etc). So the only special case where it is relevant is the TEE target, which returns XT_CONTINUE. This patch changes ip(6)_do_table to always use the jump stack starting from 0. When we detect we're operating on an skb sent via TEE (percpu nf_skb_duplicated is 1) we switch to an alternate stack to leave the original one alone. Since there is no TEE support for arptables, it doesn't need to test if tee is active. The jump stack overflow tests are no longer needed as well -- since ->stacksize is the largest call depth we cannot exceed it. A much better alternative to the external jumpstack would be to just declare a jumps[32] stack on the local stack frame, but that would mean we'd have to reject iptables rulesets that used to work before. Another alternative would be to start rejecting rulesets with a larger call depth, e.g. 1000 -- in this case it would be feasible to allocate the entire stack in the percpu area which would avoid one dereference. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
98d1bd80 |
|
14-Jul-2015 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: compute exact size needed for jumpstack The {arp,ip,ip6tables} jump stack is currently sized based on the number of user chains. However, its rather unlikely that every user defined chain jumps to the next, so lets use the existing loop detection logic to also track the chain depths. The stacksize is then set to the largest chain depth seen. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
711bdde6 |
|
15-Jun-2015 |
Eric Dumazet <edumazet@google.com> |
netfilter: x_tables: remove XT_TABLE_INFO_SZ and a dereference. After Florian patches, there is no need for XT_TABLE_INFO_SZ anymore : Only one copy of table is kept, instead of one copy per cpu. We also can avoid a dereference if we put table data right after xt_table_info. It reduces register pressure and helps compiler. Then, we attempt a kmalloc() if total size is under order-3 allocation, to reduce TLB pressure, as in many cases, rules fit in 32 KB. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
482cfc31 |
|
10-Jun-2015 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: avoid percpu ruleset duplication We store the rule blob per (possible) cpu. Unfortunately this means we can waste lot of memory on big smp machines. ipt_entry structure ('rule head') is 112 byte, so e.g. with maxcpu=64 one single rule eats close to 8k RAM. Since previous patch made counters percpu it appears there is nothing left in the rule blob that needs to be percpu. On my test system (144 possible cpus, 400k dummy rules) this change saves close to 9 Gigabyte of RAM. Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
71ae0dff |
|
10-Jun-2015 |
Florian Westphal <fw@strlen.de> |
netfilter: xtables: use percpu rule counters The binary arp/ip/ip6tables ruleset is stored per cpu. The only reason left as to why we need percpu duplication are the rule counters embedded into ipt_entry et al -- since each cpu has its own copy of the rules, all counters can be lockless. The downside is that the more cpus are supported, the more memory is required. Rules are not just duplicated per online cpu but for each possible cpu, i.e. if maxcpu is 144, then rule is duplicated 144 times, not for the e.g. 64 cores present. To save some memory and also improve utilization of shared caches it would be preferable to only store the rule blob once. So we first need to separate counters and the rule blob. Instead of using entry->counters, allocate this percpu and store the percpu address in entry->counters.pcnt on CONFIG_SMP. This change makes no sense as-is; it is merely an intermediate step to remove the percpu duplication of the rule set in a followup patch. Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
2f06550b |
|
23-May-2015 |
Florian Westphal <fw@strlen.de> |
netfilter: remove unused comefrom hookmask argument Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
1086bbe9 |
|
19-May-2015 |
Dave Jones <davej@codemonkey.org.uk> |
netfilter: ensure number of counters is >0 in do_replace() After improving setsockopt() coverage in trinity, I started triggering vmalloc failures pretty reliably from this code path: warn_alloc_failed+0xe9/0x140 __vmalloc_node_range+0x1be/0x270 vzalloc+0x4b/0x50 __do_replace+0x52/0x260 [ip_tables] do_ipt_set_ctl+0x15d/0x1d0 [ip_tables] nf_setsockopt+0x65/0x90 ip_setsockopt+0x61/0xa0 raw_setsockopt+0x16/0x60 sock_common_setsockopt+0x14/0x20 SyS_setsockopt+0x71/0xd0 It turns out we don't validate that the num_counters field in the struct we pass in from userspace is initialized. The same problem also exists in ebtables, arptables, ipv6, and the compat variants. Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
1c491ba2 |
|
03-Apr-2015 |
David S. Miller <davem@davemloft.net> |
netfilter: Pass nf_hook_state through ipt_do_table(). Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4017a7ee |
|
01-Mar-2015 |
Pablo Neira Ayuso <pablo@netfilter.org> |
netfilter: restore rule tracing via nfnetlink_log Since fab4085 ("netfilter: log: nf_log_packet() as real unified interface"), the loginfo structure that is passed to nf_log_packet() is used to explicitly indicate the logger type you want to use. This is a problem for people tracing rules through nfnetlink_log since packets are always routed to the NF_LOG_TYPE logger after the aforementioned patch. We can fix this by removing the trace loginfo structures, but that still changes the log level from 4 to 5 for tracing messages and there may be someone relying on this outthere. So let's just introduce a new nf_log_trace() function that restores the former behaviour. Reported-by: Markus Kötter <koetter@rrzn.uni-hannover.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
c58dd2dd |
|
04-Apr-2014 |
Thomas Graf <tgraf@suug.ch> |
netfilter: Can't fail and free after table replacement All xtables variants suffer from the defect that the copy_to_user() to copy the counters to user memory may fail after the table has already been exchanged and thus exposed. Return an error at this point will result in freeing the already exposed table. Any subsequent packet processing will result in a kernel panic. We can't copy the counters before exposing the new tables as we want provide the counter state after the old table has been unhooked. Therefore convert this into a silent error. Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
b416c144 |
|
21-Oct-2013 |
Will Deacon <will@kernel.org> |
netfilter: x_tables: fix ordering of jumpstack allocation and table update During kernel stability testing on an SMP ARMv7 system, Yalin Wang reported the following panic from the netfilter code: 1fe0: 0000001c 5e2d3b10 4007e779 4009e110 60000010 00000032 ff565656 ff545454 [<c06c48dc>] (ipt_do_table+0x448/0x584) from [<c0655ef0>] (nf_iterate+0x48/0x7c) [<c0655ef0>] (nf_iterate+0x48/0x7c) from [<c0655f7c>] (nf_hook_slow+0x58/0x104) [<c0655f7c>] (nf_hook_slow+0x58/0x104) from [<c0683bbc>] (ip_local_deliver+0x88/0xa8) [<c0683bbc>] (ip_local_deliver+0x88/0xa8) from [<c0683718>] (ip_rcv_finish+0x418/0x43c) [<c0683718>] (ip_rcv_finish+0x418/0x43c) from [<c062b1c4>] (__netif_receive_skb+0x4cc/0x598) [<c062b1c4>] (__netif_receive_skb+0x4cc/0x598) from [<c062b314>] (process_backlog+0x84/0x158) [<c062b314>] (process_backlog+0x84/0x158) from [<c062de84>] (net_rx_action+0x70/0x1dc) [<c062de84>] (net_rx_action+0x70/0x1dc) from [<c0088230>] (__do_softirq+0x11c/0x27c) [<c0088230>] (__do_softirq+0x11c/0x27c) from [<c008857c>] (do_softirq+0x44/0x50) [<c008857c>] (do_softirq+0x44/0x50) from [<c0088614>] (local_bh_enable_ip+0x8c/0xd0) [<c0088614>] (local_bh_enable_ip+0x8c/0xd0) from [<c06b0330>] (inet_stream_connect+0x164/0x298) [<c06b0330>] (inet_stream_connect+0x164/0x298) from [<c061d68c>] (sys_connect+0x88/0xc8) [<c061d68c>] (sys_connect+0x88/0xc8) from [<c000e340>] (ret_fast_syscall+0x0/0x30) Code: 2a000021 e59d2028 e59de01c e59f011c (e7824103) ---[ end trace da227214a82491bd ]--- Kernel panic - not syncing: Fatal exception in interrupt This comes about because CPU1 is executing xt_replace_table in response to a setsockopt syscall, resulting in: ret = xt_jumpstack_alloc(newinfo); --> newinfo->jumpstack = kzalloc(size, GFP_KERNEL); [...] table->private = newinfo; newinfo->initial_entries = private->initial_entries; Meanwhile, CPU0 is handling the network receive path and ends up in ipt_do_table, resulting in: private = table->private; [...] jumpstack = (struct ipt_entry **)private->jumpstack[cpu]; On weakly ordered memory architectures, the writes to table->private and newinfo->jumpstack from CPU1 can be observed out of order by CPU0. Furthermore, on architectures which don't respect ordering of address dependencies (i.e. Alpha), the reads from CPU0 can also be re-ordered. This patch adds an smp_wmb() before the assignment to table->private (which is essentially publishing newinfo) to ensure that all writes to newinfo will be observed before plugging it into the table structure. A dependent-read barrier is also added on the consumer sides, to ensure the same ordering requirements are also respected there. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reported-by: Wang, Yalin <Yalin.Wang@sonymobile.com> Tested-by: Wang, Yalin <Yalin.Wang@sonymobile.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f229f6ce |
|
06-Apr-2013 |
Patrick McHardy <kaber@trash.net> |
netfilter: add my copyright statements Add copyright statements to all netfilter files which have had significant changes done by myself in the past. Some notes: - nf_conntrack_ecache.c was incorrectly attributed to Rusty and Netfilter Core Team when it got split out of nf_conntrack_core.c. The copyrights even state a date which lies six years before it was written. It was written in 2005 by Harald and myself. - net/ipv{4,6}/netfilter.c, net/netfitler/nf_queue.c were missing copyright statements. I've added the copyright statement from net/netfilter/core.c, where this code originated - for nf_conntrack_proto_tcp.c I've also added Jozsef, since I didn't want it to give the wrong impression Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
30e0c6a6 |
|
24-Mar-2013 |
Gao feng <gaofeng@cn.fujitsu.com> |
netfilter: nf_log: prepare net namespace support for loggers This patch adds netns support to nf_log and it prepares netns support for existing loggers. It is composed of four major changes. 1) nf_log_register has been split to two functions: nf_log_register and nf_log_set. The new nf_log_register is used to globally register the nf_logger and nf_log_set is used for enabling pernet support from nf_loggers. Per netns is not yet complete after this patch, it comes in separate follow up patches. 2) Add net as a parameter of nf_log_bind_pf. Per netns is not yet complete after this patch, it only allows to bind the nf_logger to the protocol family from init_net and it skips other cases. 3) Adapt all nf_log_packet callers to pass netns as parameter. After this patch, this function only works for init_net. 4) Make the sysctl net/netfilter/nf_log pernet. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
f0165888 |
|
21-Mar-2013 |
Gao feng <gaofeng@cn.fujitsu.com> |
netfilter: use IS_ENABLE to replace if defined in TRACE target Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
0cc8d8df |
|
21-Jan-2013 |
YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> |
netfilter: Use IS_ERR_OR_NULL(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
52e804c6 |
|
15-Nov-2012 |
Eric W. Biederman <ebiederm@xmission.com> |
net: Allow userns root to control ipv4 Allow an unpriviled user who has created a user namespace, and then created a network namespace to effectively use the new network namespace, by reducing capable(CAP_NET_ADMIN) and capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns, CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls. Settings that merely control a single network device are allowed. Either the network device is a logical network device where restrictions make no difference or the network device is hardware NIC that has been explicity moved from the initial network namespace. In general policy and network stack state changes are allowed while resource control is left unchanged. Allow creating raw sockets. Allow the SIOCSARP ioctl to control the arp cache. Allow the SIOCSIFFLAG ioctl to allow setting network device flags. Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address. Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address. Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address. Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask. Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting gre tunnels. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting ipip tunnels. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting ipsec virtual tunnel interfaces. Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC, MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing sockets. Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and arbitrary ip options. Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option. Allow setting the IP_TRANSPARENT ipv4 socket option. Allow setting the TCP_REPAIR socket option. Allow setting the TCP_CONGESTION socket option. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e87cc472 |
|
13-May-2012 |
Joe Perches <joe@perches.com> |
net: Convert net_ratelimit uses to net_<level>_ratelimited Standardize the net core ratelimited logging functions. Coalesce formats, align arguments. Change a printk then vprintk sequence to use printf extension %pV. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
95c96174 |
|
14-Apr-2012 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: cleanup unsigned to unsigned int Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
63f6fe92 |
|
16-Jun-2011 |
Sebastian Andrzej Siewior <sebastian@breakpoint.cc> |
netfilter: ip_tables: fix compile with debug Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
7f5c6d4f |
|
04-Apr-2011 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: get rid of atomic ops in fast path We currently use a percpu spinlock to 'protect' rule bytes/packets counters, after various attempts to use RCU instead. Lately we added a seqlock so that get_counters() can run without blocking BH or 'writers'. But we really only need the seqcount in it. Spinlock itself is only locked by the current/owner cpu, so we can remove it completely. This cleanups api, using correct 'writer' vs 'reader' semantic. At replace time, the get_counters() call makes sure all cpus are done using the old table. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
25985edc |
|
30-Mar-2011 |
Lucas De Marchi <lucas.demarchi@profusion.mobi> |
Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
#
db856674 |
|
20-Mar-2011 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: xtables: fix reentrancy commit f3c5c1bfd4308 (make ip_tables reentrant) introduced a race in handling the stackptr restore, at the end of ipt_do_table() We should do it before the call to xt_info_rdunlock_bh(), or we allow cpu preemption and another cpu overwrites stackptr of original one. A second fix is to change the underflow test to check the origptr value instead of 0 to detect underflow, or else we allow a jump from different hooks. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
78b79876 |
|
15-Mar-2011 |
Vasiliy Kulikov <segoon@openwall.com> |
netfilter: ip_tables: fix infoleak to userspace Structures ipt_replace, compat_ipt_replace, and xt_get_revision are copied from userspace. Fields of these structs that are zero-terminated strings are not checked. When they are used as argument to a format string containing "%s" in request_module(), some sensitive information is leaked to userspace via argument of spawned modprobe process. The first and the third bugs were introduced before the git epoch; the second was introduced in 2722971c (v2.6.17-rc1). To trigger the bug one should have CAP_NET_ADMIN. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
255d0dc3 |
|
18-Dec-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: x_table: speedup compat operations One iptables invocation with 135000 rules takes 35 seconds of cpu time on a recent server, using a 32bit distro and a 64bit kernel. We eventually trigger NMI/RCU watchdog. INFO: rcu_sched_state detected stall on CPU 3 (t=6000 jiffies) COMPAT mode has quadratic behavior and consume 16 bytes of memory per rule. Switch the xt_compat algos to use an array instead of list, and use a binary search to locate an offset in the sorted array. This halves memory need (8 bytes per rule), and removes quadratic behavior [ O(N*N) -> O(N*log2(N)) ] Time of iptables goes from 35 s to 150 ms. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
83723d60 |
|
10-Jan-2011 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: x_tables: dont block BH while reading counters Using "iptables -L" with a lot of rules have a too big BH latency. Jesper mentioned ~6 ms and worried of frame drops. Switch to a per_cpu seqlock scheme, so that taking a snapshot of counters doesnt need to block BH (for this cpu, but also other cpus). This adds two increments on seqlock sequence per ipt_do_table() call, its a reasonable cost for allowing "iptables -L" not block BH processing. Reported-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
#
b5f15ac4 |
|
03-Nov-2010 |
Vasiliy Kulikov <segooon@gmail.com> |
ipv4: netfilter: ip_tables: fix information leak to userland Structure ipt_getinfo is copied to userland with the field "name" that has the last elements unitialized. It leads to leaking of contents of kernel stack memory. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
243bf6e2 |
|
13-Oct-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: resolve indirect macros 3/3
|
#
87a2e70d |
|
13-Oct-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: resolve indirect macros 2/3 Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
12b00c2c |
|
13-Oct-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: resolve indirect macros 1/3 Many of the used macros are just there for userspace compatibility. Substitute the in-kernel code to directly use the terminal macro and stuff the defines into #ifndef __KERNEL__ sections. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
cca77b7c |
|
23-Aug-2010 |
Florian Westphal <fw@strlen.de> |
netfilter: fix CONFIG_COMPAT support commit f3c5c1bfd430858d3a05436f82c51e53104feb6b (netfilter: xtables: make ip_tables reentrant) forgot to also compute the jumpstack size in the compat handlers. Result is that "iptables -I INPUT -j userchain" turns into -j DROP. Reported by Sebastian Roesner on #netfilter, closes http://bugzilla.netfilter.org/show_bug.cgi?id=669. Note: arptables change is compile-tested only. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
001389b9 |
|
16-Aug-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block bottom half more than necessary), lockdep can raise a warning because we attempt to lock a spinlock with BH enabled, while the same lock is usually locked by another cpu in a softirq context. Disable again BH to avoid these lockdep warnings. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Diagnosed-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
24b36f01 |
|
02-Aug-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: {ip,ip6,arp}_tables: dont block bottom half more than necessary We currently disable BH for the whole duration of get_counters() On machines with a lot of cpus and large tables, this might be too long. We can disable preemption during the whole function, and disable BH only while fetching counters for the current cpu. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
7df0884c |
|
23-Jul-2010 |
Changli Gao <xiaosuo@gmail.com> |
netfilter: iptables: use skb->len for accounting Use skb->len for accounting as xt_quota does. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
e12f8e29 |
|
04-Jun-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: vmalloc_node cleanup Using vmalloc_node(size, numa_node_id()) for temporary storage is not needed. vmalloc(size) is more respectful of user NUMA policy. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
7489aec8 |
|
31-May-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
netfilter: xtables: stackptr should be percpu commit f3c5c1bfd4 (netfilter: xtables: make ip_tables reentrant) introduced a performance regression, because stackptr array is shared by all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache line) Fix this using alloc_percpu() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-By: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
654d0fbd |
|
13-May-2010 |
Stephen Hemminger <shemminger@vyatta.com> |
netfilter: cleanup printk messages Make sure all printk messages have a severity level. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
af567603 |
|
13-May-2010 |
Stephen Hemminger <shemminger@vyatta.com> |
netfilter: change NF_ASSERT to WARN_ON Change netfilter asserts to standard WARN_ON. This has the benefit of backtrace info and also causes netfilter errors to show up on kerneloops.org. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
4538506b |
|
03-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: combine built-in extension structs Prepare the arrays for use with the multiregister function. The future layer-3 xt matches can then be easily added to it without needing more (un)register code. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
b4ba2611 |
|
07-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: change hotdrop pointer to direct modification Since xt_action_param is writable, let's use it. The pointer to 'bool hotdrop' always worried (8 bytes (64-bit) to write 1 byte!). Surprisingly results in a reduction in size: text data bss filename 5457066 692730 357892 vmlinux.o-prev 5456554 692730 357892 vmlinux.o Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
62fc8051 |
|
07-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: deconstify struct xt_action_param for matches In future, layer-3 matches will be an xt module of their own, and need to set the fragoff and thoff fields. Adding more pointers would needlessy increase memory requirements (esp. so for 64-bit, where pointers are wider). Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
4b560b44 |
|
05-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: substitute temporary defines by final name Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
de74c169 |
|
05-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: combine struct xt_match_param and xt_target_param The structures carried - besides match/target - almost the same data. It is possible to combine them, as extensions are evaluated serially, and so, the callers end up a little smaller. text data bss filename -15318 740 104 net/ipv4/netfilter/ip_tables.o +15286 740 104 net/ipv4/netfilter/ip_tables.o -15333 540 152 net/ipv6/netfilter/ip6_tables.o +15269 540 152 net/ipv6/netfilter/ip6_tables.o Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
ef53d702 |
|
09-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: dissolve do_match function Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
b5cad0df |
|
02-May-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: ip_tables: fix compilation when debug is enabled Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
cecc74de |
|
22-Apr-2010 |
Patrick McHardy <kaber@trash.net> |
netfilter: ip_tables: convert pr_devel() to pr_debug() We want to be able to use CONFIG_DYNAMIC_DEBUG in netfilter code, switch the few existing pr_devel() calls to pr_debug(). Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
5b775eb1 |
|
19-Apr-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: remove old comments about reentrancy Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
f3c5c1bf |
|
19-Apr-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: make ip_tables reentrant Currently, the table traverser stores return addresses in the ruleset itself (struct ip6t_entry->comefrom). This has a well-known drawback: the jumpstack is overwritten on reentry, making it necessary for targets to return absolute verdicts. Also, the ruleset (which might be heavy memory-wise) needs to be replicated for each CPU that can possibly invoke ip6t_do_table. This patch decouples the jumpstack from struct ip6t_entry and instead puts it into xt_table_info. Not being restricted by 'comefrom' anymore, we can set up a stack as needed. By default, there is room allocated for two entries into the traverser. arp_tables is not touched though, because there is just one/two modules and further patches seek to collapse the table traverser anyhow. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
bd414ee6 |
|
23-Mar-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: change matches to return error code The following semantic patch does part of the transformation: // <smpl> @ rule1 @ struct xt_match ops; identifier check; @@ ops.checkentry = check; @@ identifier rule1.check; @@ check(...) { <... -return true; +return 0; ...> } @@ identifier rule1.check; @@ check(...) { <... -return false; +return -EINVAL; ...> } // </smpl> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
b0f38452 |
|
19-Mar-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: change xt_match.checkentry return type Restore function signatures from bool to int so that we can report memory allocation failures or similar using -ENOMEM rather than always having to pass -EINVAL back. This semantic patch may not be too precise (checking for functions that use xt_mtchk_param rather than functions referenced by xt_match.checkentry), but reviewed, it produced the intended result. // <smpl> @@ type bool; identifier check, par; @@ -bool check +int check (struct xt_mtchk_param *par) { ... } // </smpl> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
fd0ec0e6 |
|
10-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: consolidate code into xt_request_find_match Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
d2a7b6ba |
|
10-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: make use of xt_request_find_target Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
ff67e4e4 |
|
19-Mar-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xt extensions: use pr_<level> (2) Supplement to 1159683ef48469de71dc26f0ee1a9c30d131cf89. Downgrade the log level to INFO for most checkentry messages as they are, IMO, just an extra information to the -EINVAL code that is returned as part of a parameter "constraint violation". Leave errors to real errors, such as being unable to create a LED trigger. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
6b4ff2d7 |
|
26-Feb-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: restore indentation Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
0f234214 |
|
24-Feb-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: reduce arguments to translate_table Just pass in the entire repl struct. In case of a new table (e.g. ip6t_register_table), the repldata has been previously filled with table->name and table->size already (in ip6t_alloc_initial_table). Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
6bdb331b |
|
24-Feb-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: optimize call flow around xt_ematch_foreach Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
dcea992a |
|
24-Feb-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: replace XT_MATCH_ITERATE macro The macro is replaced by a list.h-like foreach loop. This makes the code more inspectable. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
0559518b |
|
24-Feb-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: optimize call flow around xt_entry_foreach Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
72b2b1dd |
|
24-Feb-2010 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: replace XT_ENTRY_ITERATE macro The macro is replaced by a list.h-like foreach loop. This makes the code much more inspectable. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
d5d1baa1 |
|
25-Jun-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: add const qualifiers This should make it easier to remove redundant arguments later. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
739674fb |
|
26-Jun-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: constify args in compat copying functions Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
e3eaa991 |
|
17-Jun-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: generate initial table on-demand The static initial tables are pretty large, and after the net namespace has been instantiated, they just hang around for nothing. This commit removes them and creates tables on-demand at runtime when needed. Size shrinks by 7735 bytes (x86_64). Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
14c7dbe0 |
|
08-Feb-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
netfilter: xtables: compat out of scope fix As per C99 6.2.4(2) when temporary table data goes out of scope, the behaviour is undefined: if (compat) { struct foo tmp; ... private = &tmp; } [dereference private] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
dab1531a |
|
08-Feb-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
netfilter: xtables: compat out of scope fix As per C99 6.2.4(2) when temporary table data goes out of scope, the behaviour is undefined: if (compat) { struct foo tmp; ... private = &tmp; } [dereference private] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
add67461 |
|
03-Feb-2010 |
Patrick McHardy <kaber@trash.net> |
netfilter: add struct net * to target parameters Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
f54e9367 |
|
18-Jan-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
netfilter: xtables: add struct xt_mtdtor_param::net Add ->net to match destructor list like ->net in constructor list. Make sure it's set in ebtables/iptables/ip6tables, this requires to propagate netns up to *_unregister_table(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
a83d8e8d |
|
18-Jan-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
netfilter: xtables: add struct xt_mtchk_param::net Some complex match modules (like xt_hashlimit/xt_recent) want netns information at constructor and destructor time. We propably can play games at match destruction time, because netns can be passed in object, but I think it's cleaner to explicitly pass netns. Add ->net, make sure it's set from ebtables/iptables/ip6tables code. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
3666ed1c |
|
23-Nov-2009 |
Joe Perches <joe@perches.com> |
netfilter: net/ipv[46]/netfilter: Move && and || to end of previous line Compile tested only. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
35aad0ff |
|
24-Aug-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: mark initial tables constant The inputted table is never modified, so should be considered const. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
e2fe35c1 |
|
18-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: check for standard verdicts in policies This adds the second check that Rusty wanted to have a long time ago. :-) Base chain policies must have absolute verdicts that cease processing in the table, otherwise rule execution may continue in an unexpected spurious fashion (e.g. next chain that follows in memory). Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
90e7d4ab |
|
09-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: check for unconditionality of policies This adds a check that iptables's original author Rusty set forth in a FIXME comment. Underflows in iptables are better known as chain policies, and are required to be unconditional or there would be a stochastical chance for the policy rule to be skipped if it does not match. If that were to happen, rule execution would continue in an unexpected spurious fashion. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
a7d51738 |
|
18-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: ignore unassigned hooks in check_entry_size_and_hooks The "hook_entry" and "underflow" array contains values even for hooks not provided, such as PREROUTING in conjunction with the "filter" table. Usually, the values point to whatever the next rule is. For the upcoming unconditionality and underflow checking patches however, we must not inspect that arbitrary rule. Skipping unassigned hooks seems like a good idea, also because newinfo->hook_entry and newinfo->underflow will then continue to have the poison value for detecting abnormalities. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
47901dc2 |
|
09-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: use memcmp in unconditional check Instead of inspecting each u32/char open-coded, clean up and make use of memcmp. On some arches, memcmp is implemented as assembly or GCC's __builtin_memcmp which can possibly take advantages of known alignment. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
e5afbba1 |
|
07-Jul-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: iptables: remove unused datalen variable Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
24992eac |
|
11-Jun-2009 |
Patrick McHardy <kaber@trash.net> |
netfilter: ip_tables: fix build error Fix build error introduced by commit bb70dfa5 (netfilter: xtables: consolidate comefrom debug cast access): net/ipv4/netfilter/ip_tables.c: In function 'ipt_do_table': net/ipv4/netfilter/ip_tables.c:421: error: 'comefrom' undeclared (first use in this function) net/ipv4/netfilter/ip_tables.c:421: error: (Each undeclared identifier is reported only once net/ipv4/netfilter/ip_tables.c:421: error: for each function it appears in.) Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
a5e78820 |
|
04-Jun-2009 |
Evgeniy Polyakov <zbr@ioremap.net> |
netfilter: x_tables: added hook number into match extension parameter structure. Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
bb70dfa5 |
|
15-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: consolidate comefrom debug cast access Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
7a6b1c46 |
|
15-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: remove another level of indent Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
9452258d |
|
15-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: remove some goto Combining two ifs, and goto is easily gone. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
a1ff4ac8 |
|
15-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: reduce indent level by one Cosmetic only. Transformation applied: -if (foo) { long block; } else { short block; } +if (!foo) { short block; continue; } long block; Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
98e86403 |
|
15-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: consolidate open-coded logic Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
4f2f6f23 |
|
15-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: fix const inconsistency Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
ccf5bd8c |
|
15-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: remove redundant casts Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
4ba351cf |
|
14-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: use NFPROTO_ in standard targets Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
383ca5b8 |
|
14-Apr-2009 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: use NFPROTO_ for xt_proto_init callsites Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
#
942e4a2b |
|
28-Apr-2009 |
Stephen Hemminger <shemminger@vyatta.com> |
netfilter: revised locking for x_tables The x_tables are organized with a table structure and a per-cpu copies of the counters and rules. On older kernels there was a reader/writer lock per table which was a performance bottleneck. In 2.6.30-rc, this was converted to use RCU and the counters/rules which solved the performance problems for do_table but made replacing rules much slower because of the necessary RCU grace period. This version uses a per-cpu set of spinlocks and counters to allow to table processing to proceed without the cache thrashing of a global reader lock and keeps the same performance for table updates. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fa9a86dd |
|
02-Apr-2009 |
Eric Dumazet <dada1@cosmosbay.com> |
netfilter: use rcu_read_bh() in ipt_do_table() Commit 784544739a25c30637397ace5489eeb6e15d7d49 (netfilter: iptables: lock free counters) forgot to disable BH in arpt_do_table(), ipt_do_table() and ip6t_do_table() Use rcu_read_lock_bh() instead of rcu_read_lock() cures the problem. Reported-and-bisected-by: Roman Mindalev <r000n@r000n.net> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Patrick McHardy <kaber@trash.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1f9352ae |
|
25-Mar-2009 |
Patrick McHardy <kaber@trash.net> |
netfilter: {ip,ip6,arp}_tables: fix incorrect loop detection Commit e1b4b9f ([NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case search for loops) introduced a regression in the loop detection algorithm, causing sporadic incorrectly detected loops. When a chain has already been visited during the check, it is treated as having a standard target containing a RETURN verdict directly at the beginning in order to not check it again. The real target of the first rule is then incorrectly treated as STANDARD target and checked not to contain invalid verdicts. Fix by making sure the rule does actually contain a standard target. Based on patch by Francis Dupont <Francis_Dupont@isc.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
b8dfe498 |
|
25-Mar-2009 |
Eric Dumazet <dada1@cosmosbay.com> |
netfilter: factorize ifname_compare() We use same not trivial helper function in four places. We can factorize it. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
08361aa8 |
|
20-Feb-2009 |
Eric Dumazet <dada1@cosmosbay.com> |
netfilter: ip_tables: unfold two critical loops in ip_packet_match() While doing oprofile tests I noticed two loops are not properly unrolled by gcc Using a hand coded unrolled loop provides nice speedup : ipt_do_table credited of 2.52 % of cpu instead of 3.29 % in tbench. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
78454473 |
|
20-Feb-2009 |
Stephen Hemminger <shemminger@vyatta.com> |
netfilter: iptables: lock free counters The reader/writer lock in ip_tables is acquired in the critical path of processing packets and is one of the reasons just loading iptables can cause a 20% performance loss. The rwlock serves two functions: 1) it prevents changes to table state (xt_replace) while table is in use. This is now handled by doing rcu on the xt_table. When table is replaced, the new table(s) are put in and the old one table(s) are freed after RCU period. 2) it provides synchronization when accesing the counter values. This is now handled by swapping in new table_info entries for each cpu then summing the old values, and putting the result back onto one cpu. On a busy system it may cause sampling to occur at different times on each cpu, but no packet/byte counts are lost in the process. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Sucessfully tested on my dual quad core machine too, but iptables only (no ipv6 here) BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago) Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
cffee385 |
|
31-Oct-2008 |
Harvey Harrison <harvey.harrison@gmail.com> |
net: replace NIPQUAD() in net/ipv4/netfilter/ Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
916a917d |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: provide invoked family value to extensions By passing in the family through which extensions were invoked, a bit of data space can be reclaimed. The "family" member will be added to the parameter structures and the check functions be adjusted. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
a2df1648 |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: move extension arguments into compound structure (6/6) This patch does this for target extensions' destroy functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
af5d6dc2 |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: move extension arguments into compound structure (5/6) This patch does this for target extensions' checkentry functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
7eb35586 |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: move extension arguments into compound structure (4/6) This patch does this for target extensions' target functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
6be3d859 |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: move extension arguments into compound structure (3/6) This patch does this for match extensions' destroy functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
9b4fce7a |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: move extension arguments into compound structure (2/6) This patch does this for match extensions' checkentry functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
f7108a20 |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: move extension arguments into compound structure (1/6) The function signatures for Xtables extensions have grown over time. It involves a lot of typing/replication, and also a bit of stack space even if they are not used. Realize an NFWS2008 idea and pack them into structs. The skb remains outside of the struct so gcc can continue to apply its optimizations. This patch does this for match extensions' match functions. A few ambiguities have also been addressed. The "offset" parameter for example has been renamed to "fragoff" (there are so many different offsets already) and "protoff" to "thoff" (there is more than just one protocol here, so clarify). Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
367c6790 |
|
08-Oct-2008 |
Jan Engelhardt <jengelh@medozas.de> |
netfilter: xtables: do centralized checkentry call (1/2) It used to be that {ip,ip6,etc}_tables called extension->checkentry themselves, but this can be moved into the xtables core. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
544473c1 |
|
14-Apr-2008 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: {ip,ip6,arp}_tables: return EAGAIN for invalid SO_GET_ENTRIES size Rule dumping is performed in two steps: first userspace gets the ruleset size using getsockopt(SO_GET_INFO) and allocates memory, then it calls getsockopt(SO_GET_ENTRIES) to actually dump the ruleset. When another process changes the ruleset in between the sizes from the first getsockopt call doesn't match anymore and the kernel aborts. Unfortunately it returns EAGAIN, as for multiple other possible errors, so userspace can't distinguish this case from real errors. Return EAGAIN so userspace can retry the operation. Fixes (with current iptables SVN version) netfilter bugzilla #104. Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
5452e425 |
|
14-Apr-2008 |
Jan Engelhardt <jengelh@computergmbh.de> |
[NETFILTER]: annotate {arp,ip,ip6,x}tables with const Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
|
#
3b1e0a65 |
|
25-Mar-2008 |
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> |
[NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS. Introduce per-sock inlines: sock_net(), sock_net_set() and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
|
#
0dc47877 |
|
05-Mar-2008 |
Harvey Harrison <harvey.harrison@gmail.com> |
net: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3cb609d5 |
|
31-Jan-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
[NETFILTER]: x_tables: create per-netns /proc/net/*_tables_* Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b0a6363c |
|
31-Jan-2008 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: {ip,arp,ip6}_tables: fix sparse warnings in compat code CHECK net/ipv4/netfilter/ip_tables.c net/ipv4/netfilter/ip_tables.c:1453:8: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1453:8: expected int *size net/ipv4/netfilter/ip_tables.c:1453:8: got unsigned int [usertype] *size net/ipv4/netfilter/ip_tables.c:1458:44: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1458:44: expected int *size net/ipv4/netfilter/ip_tables.c:1458:44: got unsigned int [usertype] *size net/ipv4/netfilter/ip_tables.c:1603:2: warning: incorrect type in argument 2 (different signedness) net/ipv4/netfilter/ip_tables.c:1603:2: expected unsigned int *i net/ipv4/netfilter/ip_tables.c:1603:2: got int *<noident> net/ipv4/netfilter/ip_tables.c:1627:8: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1627:8: expected int *size net/ipv4/netfilter/ip_tables.c:1627:8: got unsigned int *size net/ipv4/netfilter/ip_tables.c:1634:40: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1634:40: expected int *size net/ipv4/netfilter/ip_tables.c:1634:40: got unsigned int *size net/ipv4/netfilter/ip_tables.c:1653:8: warning: incorrect type in argument 5 (different signedness) net/ipv4/netfilter/ip_tables.c:1653:8: expected unsigned int *i net/ipv4/netfilter/ip_tables.c:1653:8: got int *<noident> net/ipv4/netfilter/ip_tables.c:1666:2: warning: incorrect type in argument 2 (different signedness) net/ipv4/netfilter/ip_tables.c:1666:2: expected unsigned int *i net/ipv4/netfilter/ip_tables.c:1666:2: got int *<noident> CHECK net/ipv4/netfilter/arp_tables.c net/ipv4/netfilter/arp_tables.c:1285:40: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/arp_tables.c:1285:40: expected int *size net/ipv4/netfilter/arp_tables.c:1285:40: got unsigned int *size net/ipv4/netfilter/arp_tables.c:1543:44: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/arp_tables.c:1543:44: expected int *size net/ipv4/netfilter/arp_tables.c:1543:44: got unsigned int [usertype] *size CHECK net/ipv6/netfilter/ip6_tables.c net/ipv6/netfilter/ip6_tables.c:1481:8: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1481:8: expected int *size net/ipv6/netfilter/ip6_tables.c:1481:8: got unsigned int [usertype] *size net/ipv6/netfilter/ip6_tables.c:1486:44: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1486:44: expected int *size net/ipv6/netfilter/ip6_tables.c:1486:44: got unsigned int [usertype] *size net/ipv6/netfilter/ip6_tables.c:1631:2: warning: incorrect type in argument 2 (different signedness) net/ipv6/netfilter/ip6_tables.c:1631:2: expected unsigned int *i net/ipv6/netfilter/ip6_tables.c:1631:2: got int *<noident> net/ipv6/netfilter/ip6_tables.c:1655:8: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1655:8: expected int *size net/ipv6/netfilter/ip6_tables.c:1655:8: got unsigned int *size net/ipv6/netfilter/ip6_tables.c:1662:40: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1662:40: expected int *size net/ipv6/netfilter/ip6_tables.c:1662:40: got unsigned int *size net/ipv6/netfilter/ip6_tables.c:1680:8: warning: incorrect type in argument 5 (different signedness) net/ipv6/netfilter/ip6_tables.c:1680:8: expected unsigned int *i net/ipv6/netfilter/ip6_tables.c:1680:8: got int *<noident> net/ipv6/netfilter/ip6_tables.c:1693:2: warning: incorrect type in argument 2 (different signedness) net/ipv6/netfilter/ip6_tables.c:1693:2: expected unsigned int *i net/ipv6/netfilter/ip6_tables.c:1693:2: got int *<noident> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
df200969 |
|
31-Jan-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
[NETFILTER]: netns: put table module on netns stop When number of entries exceeds number of initial entries, foo-tables code will pin table module. But during table unregister on netns stop, that additional pin was forgotten. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
34bd137b |
|
31-Jan-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
[NETFILTER]: ip_tables: propagate netns from userspace .. all the way down to table searching functions. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
44d34e72 |
|
31-Jan-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
[NETFILTER]: x_tables: return new table from {arp,ip,ip6}t_register_table() Typical table module registers xt_table structure (i.e. packet_filter) and link it to list during it. We can't use one template for it because corresponding list_head will become corrupted. We also can't unregister with template because it wasn't changed at all and thus doesn't know in which list it is. So, we duplicate template at the very first step of table registration. Table modules will save it for use during unregistration time and actual filtering. Do it at once to not screw bisection. P.S.: renaming i.e. packet_filter => __packet_filter is temporary until full netnsization of table modules is done. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8d870052 |
|
31-Jan-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
[NETFILTER]: x_tables: per-netns xt_tables In fact all we want is per-netns set of rules, however doing that will unnecessary complicate routines such as ipt_hook()/ipt_do_table, so make full xt_table array per-netns. Every user stubbed with init_net for a while. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a98da11d |
|
31-Jan-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
[NETFILTER]: x_tables: change xt_table_register() return value convention Switch from 0/-E to ptr/PTR_ERR convention. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ecb6f85e |
|
31-Jan-2008 |
Jan Engelhardt <jengelh@computergmbh.de> |
[NETFILTER]: Use const in struct xt_match, xt_target, xt_table Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
022748a9 |
|
15-Jan-2008 |
Denys Vlasenko <vda.linux@googlemail.com> |
[NETFILTER]: {ip,ip6}_tables: remove some inlines This patch removes inlines except those which are used by packet matching code and thus are performance-critical. Before: $ size */*/*/ip*tables*.o text data bss dec hex filename 6402 500 16 6918 1b06 net/ipv4/netfilter/ip_tables.o 7130 500 16 7646 1dde net/ipv6/netfilter/ip6_tables.o After: $ size */*/*/ip*tables*.o text data bss dec hex filename 6307 500 16 6823 1aa7 net/ipv4/netfilter/ip_tables.o 7010 500 16 7526 1d66 net/ipv6/netfilter/ip6_tables.o Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e79ec50b |
|
17-Dec-2007 |
Jan Engelhardt <jengelh@computergmbh.de> |
[NETFILTER]: Parenthesize macro parameters Parenthesize macro parameters. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f01ffbd6 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: nf_log: move logging stuff to seperate header Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6d6a55f4 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: remove ipchains compatibility hack ipchains support has been removed years ago. kill last remains. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c9d8fe13 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: {ip,ip6}_tables: fix format strings Use %zu for sizeof() and remove casts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9c547959 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: {ip,ip6}_tables: reformat to eliminate differences Reformat ip_tables.c and ip6_tables.c in order to eliminate non-functional differences and minimize diff output. This allows to get a view of the real differences using: sed -e 's/IP6T/IPT/g' \ -e 's/IP6/IP/g' \ -e 's/INET6/INET/g' \ -e 's/ip6t/ipt/g' \ -e 's/ip6/ip/g' \ -e 's/ipv6/ip/g' \ -e 's/icmp6/icmp/g' \ net/ipv6/netfilter/ip6_tables.c | \ diff -wup /dev/stdin net/ipv4/netfilter/ip_tables.c Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b386d9f5 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: move compat offset calculation to x_tables Its needed by ip6_tables and arp_tables as well. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
73cd598d |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: fix compat types Use compat types and compat iterators when dealing with compat entries for clarity. This doesn't actually make a difference for ip_tables, but is needed for ip6_tables and arp_tables. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
30c08c41 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: account for struct ipt_entry/struct compat_ipt_entry size diff Account for size differences when dumping entries or calculating the entry positions. This doesn't actually make any difference for IPv4 since the structures have the same size, but its logically correct and needed for IPv6. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
89566951 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: make xt_compat_match_from_user usable in iterator macros Make xt_compat_match_from_user return an int to make it usable in the *tables iterator macros and kill a now unnecessary wrapper function. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4b478248 |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: reformat compat code The compat code has some very odd formating, clean it up before porting it to ip6_tables. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ac8e27fd |
|
17-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: kill useless wrapper Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9e67d5a7 |
|
05-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: remove obsolete overflow check We're not multiplying the size with the number of CPUs anymore, so the check is obsolete. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
259d4e41 |
|
05-Dec-2007 |
Eric Dumazet <dada1@cosmosbay.com> |
[NETFILTER]: x_tables: struct xt_table_info diet Instead of using a big array of NR_CPUS entries, we can compute the size needed at runtime, using nr_cpu_ids This should save some ram (especially on David's machines where NR_CPUS=4096 : 32 KB can be saved per table, and 64KB for dynamically allocated ones (because of slab/slub alignements) ) In particular, the 'bootstrap' tables are not any more static (in data section) but on stack as their size is now very small. This also should reduce the size used on stack in compat functions (get_info() declares an automatic variable, that could be bigger than kernel stack size for big NR_CPUS) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6e23ae2a |
|
19-Nov-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Introduce NF_INET_ hook values The IPv4 and IPv6 hook values are identical, yet some code tries to figure out the "correct" value by looking at the address family. Introduce NF_INET_* values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__ section for userspace compatibility. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a18aa31b |
|
12-Dec-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: fix compat copy race When copying entries to user, the kernel makes two passes through the data, first copying all the entries, then fixing up names and counters. On the second pass it copies the kernel and match data from userspace to the kernel again to find the corresponding structures, expecting that kernel pointers contained in the data are still valid. This is obviously broken, fix by avoiding the second pass completely and fixing names and counters while dumping the ruleset, using the kernel-internal data structures. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3db05fea |
|
15-Oct-2007 |
Herbert Xu <herbert@gondor.apana.org.au> |
[NETFILTER]: Replace sk_buff ** with sk_buff * With all the users of the double pointers removed, this patch mops up by finally replacing all occurances of sk_buff ** in the netfilter API by sk_buff *. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
16fcec35 |
|
11-Sep-2007 |
Neil Horman <nhorman@tuxdriver.com> |
[NETFILTER]: Fix/improve deadlock condition on module removal netfilter So I've had a deadlock reported to me. I've found that the sequence of events goes like this: 1) process A (modprobe) runs to remove ip_tables.ko 2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket, increasing the ip_tables socket_ops use count 3) process A acquires a file lock on the file ip_tables.ko, calls remove_module in the kernel, which in turn executes the ip_tables module cleanup routine, which calls nf_unregister_sockopt 4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the calling process into uninterruptible sleep, expecting the process using the socket option code to wake it up when it exits the kernel 4) the user of the socket option code (process B) in do_ipt_get_ctl, calls ipt_find_table_lock, which in this case calls request_module to load ip_tables_nat.ko 5) request_module forks a copy of modprobe (process C) to load the module and blocks until modprobe exits. 6) Process C. forked by request_module process the dependencies of ip_tables_nat.ko, of which ip_tables.ko is one. 7) Process C attempts to lock the request module and all its dependencies, it blocks when it attempts to lock ip_tables.ko (which was previously locked in step 3) Theres not really any great permanent solution to this that I can see, but I've developed a two part solution that corrects the problem Part 1) Modifies the nf_sockopt registration code so that, instead of using a use counter internal to the nf_sockopt_ops structure, we instead use a pointer to the registering modules owner to do module reference counting when nf_sockopt calls a modules set/get routine. This prevents the deadlock by preventing set 4 from happening. Part 2) Enhances the modprobe utilty so that by default it preforms non-blocking remove operations (the same way rmmod does), and add an option to explicity request blocking operation. So if you select blocking operation in modprobe you can still cause the above deadlock, but only if you explicity try (and since root can do any old stupid thing it would like.... :) ). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0236e667 |
|
09-Jul-2007 |
Dan Aloni <da-x@monatomic.org> |
[NETFILTER] net/ipv4/netfilter/ip_tables.c: lower printk severity Signed-off-by: Dan Aloni <da-x@monatomic.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9f15c530 |
|
07-Jul-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: mark matches and targets __read_mostly Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ba9dda3a |
|
07-Jul-2007 |
Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> |
[NETFILTER]: x_tables: add TRACE target The TRACE target can be used to follow IP and IPv6 packets through the ruleset. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick NcHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ccb79bdc |
|
07-Jul-2007 |
Jan Engelhardt <jengelh@gmx.de> |
[NETFILTER]: x_tables: switch xt_match->checkentry to bool Switch the return type of match functions to boolean Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1d93a9cb |
|
07-Jul-2007 |
Jan Engelhardt <jengelh@gmx.de> |
[NETFILTER]: x_tables: switch xt_match->match to bool Switch the return type of match functions to boolean Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
cff533ac |
|
07-Jul-2007 |
Jan Engelhardt <jengelh@gmx.de> |
[NETFILTER]: x_tables: switch hotdrop to bool Switch the "hotdrop" variables to boolean Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4c1b52bc |
|
05-Jun-2007 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: ip_tables: fix compat related crash check_compat_entry_size_and_hooks iterates over the matches and calls compat_check_calc_match, which loads the match and calculates the compat offsets, but unlike the non-compat version, doesn't call ->checkentry yet. On error however it calls cleanup_matches, which in turn calls ->destroy, which can result in crashes if the destroy function (validly) expects to only get called after the checkentry function. Add a compat_release_match function that only drops the module reference on error and rename compat_check_calc_match to compat_find_calc_match to reflect the fact that it doesn't call the checkentry function. Reported by Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1b53d904 |
|
23-Mar-2007 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Remove changelogs and CVS IDs Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
eddc9ec5 |
|
20-Apr-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c9bdd4b5 |
|
12-Mar-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[IP]: Introduce ip_hdrlen() For the common sequence "skb->nh.iph->ihl * 4", removing a good number of open coded skb->nh.iph uses, now to go after the rest... Just out of curiosity, here are the idioms found to get the same result: skb->nh.iph->ihl << 2 skb->nh.iph->ihl<<2 skb->nh.iph->ihl * 4 skb->nh.iph->ihl*4 (skb->nh.iph)->ihl * sizeof(u32) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e905a9ed |
|
09-Feb-2007 |
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> |
[NET] IPV4: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e60a13e0 |
|
07-Feb-2007 |
Jan Engelhardt <jengelh@gmx.de> |
[NETFILTER]: {ip,ip6}_tables: use struct xt_table instead of redefined structure names Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6709dbbb |
|
07-Feb-2007 |
Jan Engelhardt <jengelh@gmx.de> |
[NETFILTER]: {ip,ip6}_tables: remove x_tables wrapper functions Use the x_tables functions directly to make it better visible which parts are shared between ip_tables and ip6_tables. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e5b5ef7d |
|
04-Jan-2007 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: compat offsets size change Used by compat code offsets of entries should be 'unsigned int' as entries array size has this dimension. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e1b4b9f3 |
|
12-Dec-2006 |
Al Viro <viro@zeniv.linux.org.uk> |
[NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case search for loops If we come to node we'd already marked as seen and it's not a part of path (i.e. we don't have a loop right there), we already know that it isn't a part of any loop, so we don't need to revisit it. That speeds the things up if some chain is refered to from several places and kills O(exp(table size)) worst-case behaviour (without sleeping, at that, so if you manage to self-LART that way, you are SOL for a long time)... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a96be246 |
|
12-Dec-2006 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: ip_tables: ipt and ipt_compat checks unification Matches and targets verification is duplicated in normal and compat processing ways. This patch refactors code in order to remove this. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f6677f43 |
|
05-Dec-2006 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: Fix iptables compat hook validation In compat mode, matches and targets valid hooks checks always successful due to not initialized e->comefrom field yet. This patch separates this checks from translation code and moves them after mark_source_chains() call, where these marks are initialized. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by; Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
74c9c0c1 |
|
05-Dec-2006 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: Fix {ip,ip6,arp}_tables hook validation Commit 590bdf7fd2292b47c428111cb1360e312eff207e introduced a regression in match/target hook validation. mark_source_chains builds a bitmask for each rule representing the hooks it can be reached from, which is then used by the matches and targets to make sure they are only called from valid hooks. The patch moved the match/target specific validation before the mark_source_chains call, at which point the mask is always zero. This patch returns back to the old order and moves the standard checks to mark_source_chains. This allows to get rid of a special case for standard targets as a nice side-effect. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
920b868a |
|
30-Oct-2006 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: ip_tables: compat code module refcounting fix This patch fixes bug in iptables modules refcounting on compat error way. As we are getting modules in check_compat_entry_size_and_hooks(), in case of later error, we should put them all in translate_compat_table(), not in the compat_copy_entry_from_user() or compat_copy_match_from_user(), as it is now. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-by: Vasily Averin <vvs@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ef4512e7 |
|
30-Oct-2006 |
Vasily Averin <vvs@openvz.org> |
[NETFILTER]: ip_tables: compat error way cleanup This patch adds forgotten compat_flush_offset() call to error way of translate_compat_table(). May lead to table corruption on the next compat_do_replace(). Signed-off-by: Vasily Averin <vvs@openvz.org> Acked-by: Dmitry Mishin <dim@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
590bdf7f |
|
30-Oct-2006 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: Missed and reordered checks in {arp,ip,ip6}_tables There is a number of issues in parsing user-provided table in translate_table(). Malicious user with CAP_NET_ADMIN may crash system by passing special-crafted table to the *_tables. The first issue is that mark_source_chains() function is called before entry content checks. In case of standard target, mark_source_chains() function uses t->verdict field in order to determine new position. But the check, that this field leads no further, than the table end, is in check_entry(), which is called later, than mark_source_chains(). The second issue, that there is no check that target_offset points inside entry. If so, *_ITERATE_MATCH macro will follow further, than the entry ends. As a result, we'll have oops or memory disclosure. And the third issue, that there is no check that the target is completely inside entry. Results are the same, as in previous issue. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
82fac054 |
|
20-Oct-2006 |
Björn Steinbrink <B.Steinbrink@gmx.de> |
[NETFILTER]: Missing check for CAP_NET_ADMIN in iptables compat layer The 32bit compatibility layer has no CAP_NET_ADMIN check in compat_do_ipt_get_ctl, which for example allows to list the current iptables rules even without having that capability (the non-compat version requires it). Other capabilities might be required to exploit the bug (eg. CAP_NET_RAW to get the nfnetlink socket?), so a plain user can't exploit it, but a setup actually using the posix capability system might very well hit such a constellation of granted capabilities. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3e597c60 |
|
24-Sep-2006 |
Al Viro <viro@ftp.linux.org.uk> |
[PATCH] fix iptables __user misannotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
9fa492cd |
|
20-Sep-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: simplify compat API Split the xt_compat_match/xt_compat_target into smaller type-safe functions performing just one operation. Handle all alignment and size-related conversions centrally in these function instead of requiring each module to implement a full-blown conversion function. Replace ->compat callback by ->compat_from_user and ->compat_to_user callbacks, responsible for converting just a single private structure. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
79030ed0 |
|
20-Sep-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: revision support for compat code Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bec71b16 |
|
20-Sep-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: fix module refcount leaks in compat error paths Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
90d47db4 |
|
20-Sep-2006 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: x_tables: small check_entry & module_refcount cleanup While standard_target has target->me == NULL, module_put() should be called for it as for others, because there were try_module_get() before. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
efa74165 |
|
22-Aug-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: remove unused size argument to check/destroy functions The size is verified by x_tables and isn't needed by the modules anymore. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fe1cb108 |
|
22-Aug-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: remove unused argument to target functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8311731a |
|
17-Aug-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: ip_tables: fix table locking in ipt_do_table table->private might change because of ruleset changes, don't use it without holding the lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
0eff66e6 |
|
13-Aug-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: {arp,ip,ip6}_tables: proper error recovery in init path Neither of {arp,ip,ip6}_tables cleans up behind itself when something goes wrong during initialization. Noticed by Rennie deGraaf <degraaf@cpsc.ucalgary.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6ab3d562 |
|
30-Jun-2006 |
Jörn Engel <joern@wohnheim.fh-wedel.de> |
Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
#
da298d3a |
|
27-Jun-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: fix xt_register_table error propagation When xt_register_table fails the error is not properly propagated back. Based on patch by Lepton Wu <ytht.net@gmail.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
fb1bb34d |
|
25-Jun-2006 |
Andrew Morton <akpm@osdl.org> |
[PATCH] remove for_each_cpu() Convert a few stragglers over to for_each_possible_cpu(), remove for_each_cpu(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
7800007c |
|
04-May-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: don't use __copy_{from,to}_user on unchecked memory in compat layer Noticed by Linus Torvalds <torvalds@osdl.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
46c5ea3c |
|
01-May-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER] x_tables: fix compat related crash on non-x86 When iptables userspace adds an ipt_standard_target, it calculates the size of the entire entry as: sizeof(struct ipt_entry) + XT_ALIGN(sizeof(struct ipt_standard_target)) ipt_standard_target looks like this: struct xt_standard_target { struct xt_entry_target target; int verdict; }; xt_entry_target contains a pointer, so when compiled for 64 bit the structure gets an extra 4 byte of padding at the end. On 32 bit architectures where iptables aligns to 8 byte it will also have 4 byte padding at the end because it is only 36 bytes large. The compat_ipt_standard_fn in the kernel adjusts the offsets by sizeof(struct ipt_standard_target) - sizeof(struct compat_ipt_standard_target), which will always result in 4, even if the structure from userspace was already padded to a multiple of 8. On x86 this works out by accident because userspace only aligns to 4, on all other architectures this is broken and causes incorrect adjustments to the size and following offsets. Thanks to Linus for lots of debugging help and testing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
6f912042 |
|
10-Apr-2006 |
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> |
[PATCH] for_each_possible_cpu: network codes for_each_cpu() actually iterates across all possible CPUs. We've had mistakes in the past where people were using for_each_cpu() where they should have been iterating across only online or present CPUs. This is inefficient and possibly buggy. We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the future. This patch replaces for_each_cpu with for_each_possible_cpu under /net Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
2722971c |
|
01-Apr-2006 |
Dmitry Mishin <dim@openvz.org> |
[NETFILTER]: iptables 32bit compat layer This patch extends current iptables compatibility layer in order to get 32bit iptables to work on 64bit kernel. Current layer is insufficient due to alignment checks both in kernel and user space tools. Patch is for current net-2.6.17 with addition of move of ipt_entry_{match| target} definitions to xt_entry_{match|target}. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
65b4b4e8 |
|
28-Mar-2006 |
Andrew Morton <akpm@osdl.org> |
[NETFILTER]: Rename init functions. Every netfilter module uses `init' for its module_init() function and `fini' or `cleanup' for its module_exit() function. Problem is, this creates uninformative initcall_debug output and makes ctags rather useless. So go through and rename them all to $(filename)_init and $(filename)_fini. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a45049c5 |
|
22-Mar-2006 |
Pablo Neira Ayuso <pablo@netfilter.org> |
[NETFILTER]: x_tables: set the protocol family in x_tables targets/matches Set the family field in xt_[matches|targets] registered. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
57b47a53 |
|
20-Mar-2006 |
Ingo Molnar <mingo@elte.hu> |
[NET]: sem2mutex part 2 Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c4986734 |
|
20-Mar-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: add xt_{match,target} arguments to match/target functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1c524830 |
|
20-Mar-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: x_tables: pass registered match/target data to match/target functions This allows to make decisions based on the revision (and address family with a follow-up patch) at runtime. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1d5cd909 |
|
20-Mar-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Convert ip_tables matches/targets to centralized error checking Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3cdc7c95 |
|
20-Mar-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Change {ip,ip6,arp}_tables to use centralized error checking Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
ee4bb818 |
|
04-Feb-2006 |
Kirill Korotaev <dev@openvz.org> |
[NETFILTER]: Fix possible overflow in netfilters do_replace() netfilter's do_replace() can overflow on addition within SMP_ALIGN() and/or on multiplication by NR_CPUS, resulting in a buffer overflow on the copy_from_user(). In practice, the overflow on addition is triggerable on all systems, whereas the multiplication one might require much physical memory to be present due to the check above. Either is sufficient to overwrite arbitrary amounts of kernel memory. I really hate adding the same check to all 4 versions of do_replace(), but the code is duplicate... Found by Solar Designer during security audit of OpenVZ.org Signed-Off-By: Kirill Korotaev <dev@openvz.org> Signed-Off-By: Solar Designer <solar@openwall.com> Signed-off-by: Patrck McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
2e4e6a17 |
|
12-Jan-2006 |
Harald Welte <laforge@netfilter.org> |
[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables This monster-patch tries to do the best job for unifying the data structures and backend interfaces for the three evil clones ip_tables, ip6_tables and arp_tables. In an ideal world we would never have allowed this kind of copy+paste programming... but well, our world isn't (yet?) ideal. o introduce a new x_tables module o {ip,arp,ip6}_tables depend on this x_tables module o registration functions for tables, matches and targets are only wrappers around x_tables provided functions o all matches/targets that are used from ip_tables and ip6_tables are now implemented as xt_FOOBAR.c files and provide module aliases to ipt_FOOBAR and ip6t_FOOBAR o header files for xt_matches are in include/linux/netfilter/, include/linux/netfilter_{ipv4,ipv6} contains compatibility wrappers around the xt_FOOBAR.h headers Based on this patchset we're going to further unify the code, gradually getting rid of all the layer 3 specific assumptions. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4fc268d2 |
|
11-Jan-2006 |
Randy Dunlap <rdunlap@infradead.org> |
[PATCH] capable/capability.h (net/) net: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
31836064 |
|
14-Dec-2005 |
Eric Dumazet <dada1@cosmosbay.com> |
[NETFILTER] ip_tables: NUMA-aware allocation Part of a performance problem with ip_tables is that memory allocation is not NUMA aware, but 'only' SMP aware (ie each CPU normally touch separate cache lines) Even with small iptables rules, the cost of this misplacement can be high on common workloads. Instead of using one vmalloc() area (located in the node of the iptables process), we now allocate an area for each possible CPU, using vmalloc_node() so that memory should be allocated in the CPU's node if possible. Port to arp_tables and ip6_tables by Harald Welte. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9b5b5cff |
|
29-Nov-2005 |
Arjan van de Ven <arjan@infradead.org> |
[NET]: Add const markers to various variables. the patch below marks various variables const in net/; the goal is to move them to the .rodata section so that they can't false-share cachelines with things that get written to, as well as potentially helping gcc a bit with optimisations. (these were found using a gcc patch to warn about such variables) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c8923c6b |
|
13-Oct-2005 |
David S. Miller <davem@davemloft.net> |
[NETFILTER]: Fix OOPSes on machines with discontiguous cpu numbering. Original patch by Harald Welte, with feedback from Herbert Xu and testing by S�bastien Bernard. EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus are numbered linearly. That is not necessarily true. This patch fixes that up by calculating the largest possible cpu number, and allocating enough per-cpu structure space given that. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
05465343 |
|
22-Aug-2005 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Add goto target Originally written by Henrik Nordstrom <hno@marasystems.com>, taken from netfilter patch-o-matic and added ip6_tables support. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6869c4d8 |
|
09-Aug-2005 |
Harald Welte <laforge@netfilter.org> |
[NETFILTER]: reduce netfilter sk_buff enlargement As discussed at netconf'05, we're trying to save every bit in sk_buff. The patch below makes sk_buff 8 bytes smaller. I did some basic testing on my notebook and it seems to work. The only real in-tree user of nfcache was IPVS, who only needs a single bit. Unfortunately I couldn't find some other free bit in sk_buff to stuff that bit into, so I introduced a separate field for them. Maybe the IPVS guys can resolve that to further save space. Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and alike are only 6 values defined), but unfortunately the bluetooth code overloads pkt_type :( The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just came up with a way how to do it without any skb fields, so it's safe to remove it. - remove all never-implemented 'nfcache' code - don't have ipvs code abuse 'nfcache' field. currently get's their own compile-conditional skb->ipvs_property field. IPVS maintainers can decide to move this bit elswhere, but nfcache needs to die. - remove skb->nfcache field to save 4 bytes - move skb->nfctinfo into three unused bits to save further 4 bytes Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
e45b1be8 |
|
21-Jun-2005 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Kill lockhelp.h Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1da177e4 |
|
16-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|