History log of /linux-master/include/linux/netfilter/ipset/ip_set.h
Revision Date Author Comments
# 97f7cf1c 29-Jan-2024 Jozsef Kadlecsik <kadlec@netfilter.org>

netfilter: ipset: fix performance regression in swap operation

The patch "netfilter: ipset: fix race condition between swap/destroy
and kernel side add/del/test", commit 28628fa9 fixes a race condition.
But the synchronize_rcu() added to the swap function unnecessarily slows
it down: it can safely be moved to destroy and use call_rcu() instead.

Eric Dumazet pointed out that simply calling the destroy functions as
rcu callback does not work: sets with timeout use garbage collectors
which need cancelling at destroy which can wait. Therefore the destroy
functions are split into two: cancelling garbage collectors safely at
executing the command received by netlink and moving the remaining
part only into the rcu callback.

Link: https://lore.kernel.org/lkml/C0829B10-EAA6-4809-874E-E1E9C05A8D84@automattic.com/
Fixes: 28628fa952fe ("netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test")
Reported-by: Ale Crismani <ale.crismani@automattic.com>
Reported-by: David Wang <00107082@163.com>
Tested-by: David Wang <00107082@163.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 5e29dc36 30-Dec-2022 Jozsef Kadlecsik <kadlec@netfilter.org>

netfilter: ipset: Rework long task execution when adding/deleting entries

When adding/deleting large number of elements in one step in ipset, it can
take a reasonable amount of time and can result in soft lockup errors. The
patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of
consecutive elements to add/delete") tried to fix it by limiting the max
elements to process at all. However it was not enough, it is still possible
that we get hung tasks. Lowering the limit is not reasonable, so the
approach in this patch is as follows: rely on the method used at resizing
sets and save the state when we reach a smaller internal batch limit,
unlock/lock and proceed from the saved state. Thus we can avoid long
continuous tasks and at the same time removed the limit to add/delete large
number of elements in one step.

The nfnl mutex is held during the whole operation which prevents one to
issue other ipset commands in parallel.

Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete")
Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# e9374524 22-Nov-2022 Vishwanath Pai <vpai@akamai.com>

netfilter: ipset: Add support for new bitmask parameter

Add a new parameter to complement the existing 'netmask' option. The
main difference between netmask and bitmask is that bitmask takes any
arbitrary ip address as input, it does not have to be a valid netmask.

The name of the new parameter is 'bitmask'. This lets us mask out
arbitrary bits in the ip address, for example:
ipset create set1 hash:ip bitmask 255.128.255.0
ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80

Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 5f7b51bf 28-Jul-2021 Jozsef Kadlecsik <kadlec@netfilter.org>

netfilter: ipset: Limit the maximal range of consecutive elements to add/delete

The range size of consecutive elements were not limited. Thus one could
define a huge range which may result soft lockup errors due to the long
execution time. Now the range size is limited to 2^20 entries.

Reported-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 5c701e71 26-Mar-2021 Wan Jiabing <wanjiabing@vivo.com>

netfilter: ipset: Remove duplicate declaration

struct ip_set is declared twice. One is declared at 79th line,
so remove the duplicate.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# ccf0a4b7 29-Oct-2020 Jozsef Kadlecsik <kadlec@netfilter.org>

netfilter: ipset: Add bucketsize parameter to all hash types

The parameter defines the upper limit in any hash bucket at adding new entries
from userspace - if the limit would be exceeded, ipset doubles the hash size
and rehashes. It means the set may consume more memory but gives faster
evaluation at matching in the set.

Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 6daf1414 20-Feb-2020 Gustavo A. R. Silva <gustavo@embeddedor.com>

netfilter: Replace zero-length array with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
int stuff;
struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

Lastly, fix checkpatch.pl warning
WARNING: __aligned(size) is preferred over __attribute__((aligned(size)))
in net/bridge/netfilter/ebtables.c

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# f66ee041 11-Feb-2020 Jozsef Kadlecsik <kadlec@netfilter.org>

netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports

In the case of huge hash:* types of sets, due to the single spinlock of
a set the processing of the whole set under spinlock protection could take
too long.

There were four places where the whole hash table of the set was processed
from bucket to bucket under holding the spinlock:

- During resizing a set, the original set was locked to exclude kernel side
add/del element operations (userspace add/del is excluded by the
nfnetlink mutex). The original set is actually just read during the
resize, so the spinlocking is replaced with rcu locking of regions.
However, thus there can be parallel kernel side add/del of entries.
In order not to loose those operations a backlog is added and replayed
after the successful resize.
- Garbage collection of timed out entries was also protected by the spinlock.
In order not to lock too long, region locking is introduced and a single
region is processed in one gc go. Also, the simple timer based gc running
is replaced with a workqueue based solution. The internal book-keeping
(number of elements, size of extensions) is moved to region level due to
the region locking.
- Adding elements: when the max number of the elements is reached, the gc
was called to evict the timed out entries. The new approach is that the gc
is called just for the matching region, assuming that if the region
(proportionally) seems to be full, then the whole set does. We could scan
the other regions to check every entry under rcu locking, but for huge
sets it'd mean a slowdown at adding elements.
- Listing the set header data: when the set was defined with timeout
support, the garbage collector was called to clean up timed out entries
to get the correct element numbers and set size values. Now the set is
scanned to check non-timed out entries, without actually calling the gc
for the whole set.

Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe ->
SOFTIRQ-unsafe lock order issues during working on the patch.

Reported-by: syzbot+4b0e9d4ff3cf117837e5@syzkaller.appspotmail.com
Reported-by: syzbot+c27b8d5010f45c666ed1@syzkaller.appspotmail.com
Reported-by: syzbot+68a806795ac89df3aa1c@syzkaller.appspotmail.com
Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>


# 32c72165 19-Jan-2020 Kadlecsik József <kadlec@blackhole.kfki.hu>

netfilter: ipset: use bitmap infrastructure completely

The bitmap allocation did not use full unsigned long sizes
when calculating the required size and that was triggered by KASAN
as slab-out-of-bounds read in several places. The patch fixes all
of them.

Reported-by: syzbot+fabca5cbf5e54f3fe2de@syzkaller.appspotmail.com
Reported-by: syzbot+827ced406c9a1d9570ed@syzkaller.appspotmail.com
Reported-by: syzbot+190d63957b22ef673ea5@syzkaller.appspotmail.com
Reported-by: syzbot+dfccdb2bdb4a12ad425e@syzkaller.appspotmail.com
Reported-by: syzbot+df0d0f5895ef1f41a65b@syzkaller.appspotmail.com
Reported-by: syzbot+b08bd19bb37513357fd4@syzkaller.appspotmail.com
Reported-by: syzbot+53cdd0ec0bbabd53370a@syzkaller.appspotmail.com
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 85639185 03-Oct-2019 Jeremy Sowden <jeremy@azazel.net>

netfilter: ipset: make ip_set_put_flags extern.

ip_set_put_flags is rather large for a static inline function in a
header-file. Move it to ip_set_core.c and export it.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 2398a976 03-Oct-2019 Jeremy Sowden <jeremy@azazel.net>

netfilter: ipset: move functions to ip_set_core.c.

Several inline functions in ip_set.h are only called in ip_set_core.c:
move them and remove inline function specifier.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 94177f6e 03-Oct-2019 Jeremy Sowden <jeremy@azazel.net>

netfilter: ipset: move ip_set_comment functions from ip_set.h to ip_set_core.c.

Most of the functions are only called from within ip_set_core.c.

The exception is ip_set_init_comment. However, this is too complex to
be a good candidate for a static inline function. Move it to
ip_set_core.c, change its linkage to extern and export it, leaving a
declaration in ip_set.h.

ip_set_comment_free is only used as an extension destructor, so change
its prototype to match and drop cast.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 017f77c0 03-Oct-2019 Jeremy Sowden <jeremy@azazel.net>

netfilter: ipset: add a coding-style fix to ip_set_ext_destroy.

Use a local variable to hold comment in order to align the arguments of
ip_set_comment_free properly.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# bd96b4c7 07-Aug-2019 Jeremy Sowden <jeremy@azazel.net>

netfilter: inline four headers files into another one.

linux/netfilter/ipset/ip_set.h included four other header files:

include/linux/netfilter/ipset/ip_set_comment.h
include/linux/netfilter/ipset/ip_set_counter.h
include/linux/netfilter/ipset/ip_set_skbinfo.h
include/linux/netfilter/ipset/ip_set_timeout.h

Of these the first three were not included anywhere else. The last,
ip_set_timeout.h, was included in a couple of other places, but defined
inline functions which call other inline functions defined in ip_set.h,
so ip_set.h had to be included before it.

Inlined all four into ip_set.h, and updated the other files that
included ip_set_timeout.h.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# d2912cb1 04-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# fe03d474 10-Jun-2019 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

Update my email address

It's better to use my kadlec@netfilter.org email address in
the source code. I might not be able to use
kadlec@blackhole.kfki.hu in the future.

Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 12ad5f65 26-Apr-2019 Michal Kubecek <mkubecek@suse.cz>

ipset: drop ipset_nest_start() and ipset_nest_end()

After the previous commit, both ipset_nest_start() and ipset_nest_end() are
just aliases for nla_nest_start() and nla_nest_end() so that there is no
need to keep them.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ae0be8de 26-Apr-2019 Michal Kubecek <mkubecek@suse.cz>

netlink: make nla_nest_start() add NLA_F_NESTED flag

Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
netlink based interfaces (including recently added ones) are still not
setting it in kernel generated messages. Without the flag, message parsers
not aware of attribute semantics (e.g. wireshark dissector or libmnl's
mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
the structure of their contents.

Unfortunately we cannot just add the flag everywhere as there may be
userspace applications which check nlattr::nla_type directly rather than
through a helper masking out the flags. Therefore the patch renames
nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
are rewritten to use nla_nest_start().

Except for changes in include/net/netlink.h, the patch was generated using
this semantic patch:

@@ expression E1, E2; @@
-nla_nest_start(E1, E2)
+nla_nest_start_noflag(E1, E2)

@@ expression E1, E2; @@
-nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
+nla_nest_start(E1, E2)

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 439cd39e 14-Jul-2018 Stefano Brivio <sbrivio@redhat.com>

netfilter: ipset: list:set: Decrease refcount synchronously on deletion and replace

Commit 45040978c899 ("netfilter: ipset: Fix set:list type crash
when flush/dump set in parallel") postponed decreasing set
reference counters to the RCU callback.

An 'ipset del' command can terminate before the RCU grace period
is elapsed, and if sets are listed before then, the reference
counter shown in userspace will be wrong:

# ipset create h hash:ip; ipset create l list:set; ipset add l
# ipset del l h; ipset list h
Name: h
Type: hash:ip
Revision: 4
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 88
References: 1
Number of entries: 0
Members:
# sleep 1; ipset list h
Name: h
Type: hash:ip
Revision: 4
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 88
References: 0
Number of entries: 0
Members:

Fix this by making the reference count update synchronous again.

As a result, when sets are listed, ip_set_name_byindex() might
now fetch a set whose reference count is already zero. Instead
of relying on the reference count to protect against concurrent
set renaming, grab ip_set_ref_lock as reader and copy the name,
while holding the same lock in ip_set_rename() as writer
instead.

Reported-by: Li Shuang <shuali@redhat.com>
Fixes: 45040978c899 ("netfilter: ipset: Fix set:list type crash when flush/dump set in parallel")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 23c42a40 27-Oct-2018 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Introduction of new commands and protocol version 7

Two new commands (IPSET_CMD_GET_BYNAME, IPSET_CMD_GET_BYINDEX) are
introduced. The new commands makes possible to eliminate the getsockopt
operation (in iptables set/SET match/target) and thus use only netlink
communication between userspace and kernel for ipset. With the new
protocol version, userspace can exactly know which functionality is
supported by the running kernel.

Both the kernel and userspace is fully backward compatible.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 4750005a 06-Jan-2018 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Fix "don't update counters" mode when counters used at the matching

The matching of the counters was not taken into account, fixed.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 9e41f26a 09-Nov-2016 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Count non-static extension memory for userspace

Non-static (i.e. comment) extension was not counted into the memory
size. A new internal counter is introduced for this. In the case of
the hash types the sizes of the arrays are counted there as well so
that we can avoid to scan the whole set when just the header data
is requested.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 702b71e7 10-Oct-2016 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Add element count to all set types header

It is better to list the set elements for all set types, thus the
header information is uniform. Element counts are therefore added
to the bitmap and list types.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 837a90ea 10-Oct-2016 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Regroup ip_set_put_extensions and add extern

Cleanup: group ip_set_put_extensions and ip_set_get_extensions
together and add missing extern.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 57982edc 10-Oct-2016 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Split extensions into separate files

Cleanup to separate all extensions into individual files.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Suggested-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# bec810d9 05-May-2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Improve skbinfo get/init helpers

Use struct ip_set_skbinfo in struct ip_set_ext instead of open
coded fields and assign structure members in get/init helpers
instead of copying members one by one. Explicitly note that
struct ip_set_skbinfo must be padded to prevent non-aligned
access in the extension blob.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Suggested-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 7ffea379 10-Nov-2016 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Headers file cleanup

Group counter helper functions together.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Suggested-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# da9fbfa7 10-Nov-2016 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Mark some helper args as const.

Mark some of the helpers arguments as const.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Suggested-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 2da16a69 10-Nov-2016 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Remove extra whitespaces in ip_set.h

Remove unnecessary whitespaces.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Suggested-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# e9bbe898 22-Apr-2016 Nicolas Dichtel <nicolas.dichtel@6wind.com>

libnl: nla_put_net64(): align on a 64-bit area

nla_data() is now aligned on a 64-bit area.

The temporary function nla_put_be64_32bit() is removed in this patch.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 596cf3fe 16-Mar-2016 Vishwanath Pai <vpai@akamai.com>

netfilter: ipset: fix race condition in ipset save, swap and delete

This fix adds a new reference counter (ref_netlink) for the struct ip_set.
The other reference counter (ref) can be swapped out by ip_set_swap and we
need a separate counter to keep track of references for netlink events
like dump. Using the same ref counter for dump causes a race condition
which can be demonstrated by the following script:

ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \
counters
ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \
counters
ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \
counters

ipset save &

ipset swap hash_ip3 hash_ip2
ipset destroy hash_ip3 /* will crash the machine */

Swap will exchange the values of ref so destroy will see ref = 0 instead of
ref = 1. With this fix in place swap will not succeed because ipset save
still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink).

Both delete and swap will error out if ref_netlink != 0 on the set.

Note: The changes to *_head functions is because previously we would
increment ref whenever we called these functions, we don't do that
anymore.

Reviewed-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 95ad1f4a 07-Nov-2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Fix extension alignment

The data extensions in ipset lacked the proper memory alignment and
thus could lead to kernel crash on several architectures. Therefore
the structures have been reorganized and alignment attributes added
where needed. The patch was tested on armv7h by Gerhard Wiesinger and
on x86_64, sparc64 by Jozsef Kadlecsik.

Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# ca0f6a5c 13-Jun-2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Fix coding styles reported by checkpatch.pl

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# b57b2d1f 13-Jun-2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Prepare the ipset core to use RCU at set level

Replace rwlock_t with spinlock_t in "struct ip_set" and change the locking
accordingly. Convert the comment extension into an rcu-avare object. Also,
simplify the timeout routines.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# c4c99783 13-Jun-2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Fix parallel resizing and listing of the same set

When elements added to a hash:* type of set and resizing triggered,
parallel listing could start to list the original set (before resizing)
and "continue" with listing the new set. Fix it by references and
using the original hash table for listing. Therefore the destroying of
the original hash table may happen from the resizing or listing functions.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# f690cbae 12-Jun-2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Fix cidr handling for hash:*net* types

Commit "Simplify cidr handling for hash:*net* types" broke the cidr
handling for the hash:*net* types when the sets were used by the SET
target: entries with invalid cidr values were added to the sets.
Reported by Jonathan Johnson.

Testsuite entry is added to verify the fix.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# a3b1c1eb 06-May-2015 Denys Vlasenko <dvlasenk@redhat.com>

netfilter: ipset: deinline ip_set_put_extensions()

On x86 allyesconfig build:
The function compiles to 489 bytes of machine code.
It has 25 callsites.

text data bss dec hex filename
82441375 22255384 20627456 125324215 7784bb7 vmlinux.before
82434909 22255384 20627456 125317749 7783275 vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: David S. Miller <davem@davemloft.net>
CC: Jan Engelhardt <jengelh@medozas.de>
CC: Jiri Pirko <jpirko@redhat.com>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: netfilter-devel@vger.kernel.org
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 275e2bc0 02-May-2015 Sergey Popovich <popovich_sergei@mail.ua>

netfilter: ipset: Fix ext_*() macros

So pointers returned by these macros could be
referenced with -> directly.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 930345ea 29-Mar-2015 Jiri Benc <jbenc@redhat.com>

netlink: implement nla_put_in_addr and nla_put_in6_addr

IP addresses are often stored in netlink attributes. Add generic functions
to do that.

For nla_put_in_addr, it would be nicer to pass struct in_addr but this is
not used universally throughout the kernel, in way too many places __be32 is
used to store IPv4 address.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# aef96193 15-Sep-2014 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: send nonzero skbinfo extensions only

Do not send zero valued skbinfo extensions to userspace at listing.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 0e9871e3 28-Aug-2014 Anton Danilov <littlesmilingcloud@gmail.com>

netfilter: ipset: Add skbinfo extension kernel support in the ipset core.

Skbinfo extension provides mapping of metainformation with lookup in the ipset tables.
This patch defines the flags, the constants, the functions and the structures
for the data type independent support of the extension.
Note the firewall mark stores in the kernel structures as two 32bit values,
but transfered through netlink as one 64bit value.

Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 07cf8f5a 28-Feb-2014 Josh Hunt <johunt@akamai.com>

netfilter: ipset: add forceadd kernel support for hash set types

Adds a new property for hash set types, where if a set is created
with the 'forceadd' option and the set becomes full the next addition
to the set may succeed and evict a random entry from the set.

To keep overhead low eviction is done very simply. It checks to see
which bucket the new entry would be added. If the bucket's pos value
is non-zero (meaning there's at least one entry in the bucket) it
replaces the first entry in the bucket. If pos is zero, then it continues
down the normal add process.

This property is useful if you have a set for 'ban' lists where it may
not matter if you release some entries from the set early.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# af284ece 12-Feb-2014 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Prepare the kernel for create option flags when no extension is needed

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 3b02b56c 17-Dec-2013 Vytas Dauksa <vytas.dauksa@smoothwall.net>

netfilter: ipset: add hash:ip,mark data type to ipset

Introduce packet mark support with new ip,mark hash set. This includes
userspace and kernelspace code, hash:ip,mark set tests and man page
updates.

The intended use of ip,mark set is similar to the ip:port type, but for
protocols which don't use a predictable port number. Instead of port
number it matches a firewall mark determined by a layer 7 filtering
program like opendpi.

As well as allowing or blocking traffic it will also be used for
accounting packets and bytes sent for each protocol.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 02eca9d2 30-Dec-2013 stephen hemminger <stephen@networkplumber.org>

netfilter: ipset: remove unused code

Function never used in current upstream code.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 93302880 18-Oct-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Use netlink callback dump args only

Instead of cb->data, use callback dump args only and introduce symbolic
names instead of plain numbers at accessing the argument members.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 1785e8f4 30-Sep-2013 Vitaly Lavrov <lve@guap.ru>

netfiler: ipset: Add net namespace for ipset

This patch adds netns support for ipset.

Major changes were made in ip_set_core.c and ip_set.h.
Global variables are moved to per net namespace.
Added initialization code and the destruction of the network namespace ipset subsystem.
In the prototypes of public functions ip_set_* added parameter "struct net*".

The remaining corrections related to the change prototypes of public functions ip_set_*.

The patch for git://git.netfilter.org/ipset.git commit 6a4ec96c0b8caac5c35474e40e319704d92ca347

Signed-off-by: Vitaly Lavrov <lve@guap.ru>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 3fd986b3 25-Sep-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Use a common function at listing the extensions

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 68b63f08 22-Sep-2013 Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>

netfilter: ipset: Support comments for ipset entries in the core.

This adds the core support for having comments on ipset entries.

The comments are stored as standard null-terminated strings in
dynamically allocated memory after being passed to the kernel. As a
result of this, code has been added to the generic destroy function to
iterate all extensions and call that extension's destroy task if the set
has that extension activated, and if such a task is defined.

Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 40cd63bf 09-Sep-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Support extensions which need a per data destroy function

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 03c8b234 06-Sep-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Generalize extensions support

Get rid of the structure based extensions and introduce a blob for
the extensions. Thus we can support more extension types easily.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# ca134ce8 06-Sep-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Move extension data to set structure

Default timeout and extension offsets are moved to struct set, because
all set types supports all extensions and it makes possible to generalize
extension support.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# f925f705 06-Sep-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Rename extension offset ids to extension ids

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# a04d8b6b 30-Sep-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Prepare ipset to support multiple networks for hash types

In order to support hash:net,net, hash:net,port,net etc. types,
arrays are introduced for the book-keeping of existing cidr sizes
and network numbers in a set.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# b8cd9786 02-May-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Use fix sized type for timeout in the extension part

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 35b8dcf8 30-Apr-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Rename simple macro names to avoid namespace issues.

Reported-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 0f1799ba 16-Sep-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Consistent userspace testing with nomatch flag

The "nomatch" commandline flag should invert the matching at testing,
similarly to the --return-nomatch flag of the "set" match of iptables.
Until now it worked with the elements with "nomatch" flag only. From
now on it works with elements without the flag too, i.e:

# ipset n test hash:net
# ipset a test 10.0.0.0/24 nomatch
# ipset t test 10.0.0.1
10.0.0.1 is NOT in set test.
# ipset t test 10.0.0.1 nomatch
10.0.0.1 is in set test.

# ipset a test 192.168.0.0/24
# ipset t test 192.168.0.1
192.168.0.1 is in set test.
# ipset t test 192.168.0.1 nomatch
192.168.0.1 is NOT in set test.

Before the patch the results were

...
# ipset t test 192.168.0.1
192.168.0.1 is in set test.
# ipset t test 192.168.0.1 nomatch
192.168.0.1 is in set test.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 6e01781d 27-Apr-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: set match: add support to match the counters

The new revision of the set match supports to match the counters
and to suppress updating the counters at matching too.

At the set:list types, the updating of the subcounters can be
suppressed as well.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 34d666d4 27-Apr-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Introduce the counter extension in the core

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 075e64c0 27-Apr-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Introduce extensions to elements in the core

Introduce extensions to elements in the core and prepare timeout as
the first one.

This patch also modifies the em_ipset classifier to use the new
extension struct layout.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 43c56e59 08-Apr-2013 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Make possible to test elements marked with nomatch

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# a8201414 09-Oct-2012 David Howells <dhowells@redhat.com>

UAPI: (Scripted) Disintegrate include/linux/netfilter/ipset

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>


# 3e0304a5 21-Sep-2012 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Support to match elements marked with "nomatch"

Exceptions can now be matched and we can branch according to the
possible cases:

a. match in the set if the element is not flagged as "nomatch"
b. match in the set if the element is flagged with "nomatch"
c. no match

i.e.

iptables ... -m set --match-set ... -j ...
iptables ... -m set --match-set ... --nomatch-entries -j ...
...

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 3ace95c0 21-Sep-2012 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Coding style fixes

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 10111a6e 21-Sep-2012 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Include supported revisions in module description

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>


# 95c96174 14-Apr-2012 Eric Dumazet <eric.dumazet@gmail.com>

net: cleanup unsigned to unsigned int

Use of "unsigned int" is preferred to bare "unsigned" in net tree.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7cf7899d 01-Apr-2012 David S. Miller <davem@davemloft.net>

ipset: Stop using NLA_PUT*().

These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.

Signed-off-by: David S. Miller <davem@davemloft.net>


# 2a7cef2a 14-Jan-2012 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Exceptions support added to hash:*net* types

The "nomatch" keyword and option is added to the hash:*net* types,
by which one can add exception entries to sets. Example:

ipset create test hash:net
ipset add test 192.168.0/24
ipset add test 192.168.0/30 nomatch

In this case the IP addresses from 192.168.0/24 except 192.168.0/30
match the elements of the set.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# ae8ded1c 14-Jan-2012 Jan Engelhardt <jengelh@medozas.de>

netfilter: ipset: expose userspace-relevant parts in ip_set.h

iptables's libxt_SET.c depends on these.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# c15f1c83 13-Feb-2012 Jan Engelhardt <jengelh@medozas.de>

netfilter: ipset: use NFPROTO_ constants

ipset is actually using NFPROTO values rather than AF (xt_set passes
that along).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


# 15b4d93f 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: whitespace and coding fixes detected by checkpatch.pl

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# e385357a 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: hash:net,iface type introduced

The hash:net,iface type makes possible to store network address and
interface name pairs in a set. It's mostly suitable for egress
and ingress filtering. Examples:

# ipset create test hash:net,iface
# ipset add test 192.168.0.0/16,eth0
# ipset add test 192.168.0.0/24,eth1

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# b66554cf 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: add xt_action_param to the variant level kadt functions, ipset API change

With the change the sets can use any parameter available for the match
and target extensions, like input/output interface. It's required for
the hash:net,iface set type.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# d0d9e0a5 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: support range for IPv4 at adding/deleting elements for hash:*net* types

The range internally is converted to the network(s) equal to the range.
Example:

# ipset new test hash:net
# ipset add test 10.2.0.0-10.2.1.12
# ipset list test
Name: test
Type: hash:net
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 16888
References: 0
Members:
10.2.1.12
10.2.1.0/29
10.2.0.0/24
10.2.1.8/30

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# f1e00b39 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: set type support with multiple revisions added

A set type may have multiple revisions, for example when syntax is
extended. Support continuous revision ranges in set types.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# 3d14b171 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: fix adding ranges to hash types

When ranges are added to hash types, the elements may trigger rehashing
the set. However, the last successfully added element was not kept track
so the adding started again with the first element after the rehashing.

Bug reported by Mr Dash Four.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# c1e2e043 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: support listing setnames and headers too

Current listing makes possible to list sets with full content only.
The patch adds support partial listings, i.e. listing just
the existing setnames or listing set headers, without set members.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# ac8cc925 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: options and flags support added to the kernel API

The support makes possible to specify the timeout value for
the SET target and a flag to reset the timeout for already existing
entries.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# 5416219e 16-Jun-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: timeout can be modified for already added elements

When an element to a set with timeout added, one can change the timeout
by "readding" the element with the "-exist" flag. That means the timeout
value is reset to the specified one (or to the default from the set
specification if the "timeout n" option is not used). Example

ipset add foo 1.2.3.4 timeout 10
ipset add foo 1.2.3.4 timeout 600 -exist

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# 2f9f28b2 04-Apr-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: references are protected by rwlock instead of mutex

The timeout variant of the list:set type must reference the member sets.
However, its garbage collector runs at timer interrupt so the mutex
protection of the references is a no go. Therefore the reference protection
is converted to rwlock.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# 6604271c 29-Mar-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: References are protected by rwlock instead of mutex

The timeout variant of the list:set type must reference the member sets.
However, its garbage collector runs at timer interrupt so the mutex protection
of the references is a no go. Therefore the reference protection
is converted to rwlock.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>


# a7b4f989 01-Feb-2011 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: IP set core support

The patch adds the IP set core support to the kernel.

The IP set core implements a netlink (nfnetlink) based protocol by which
one can create, destroy, flush, rename, swap, list, save, restore sets,
and add, delete, test elements from userspace. For simplicity (and backward
compatibilty and for not to force ip(6)tables to be linked with a netlink
library) reasons a small getsockopt-based protocol is also kept in order
to communicate with the ip(6)tables match and target.

The netlink protocol passes all u16, etc values in network order with
NLA_F_NET_BYTEORDER flag. The protocol enforces the proper use of the
NLA_F_NESTED and NLA_F_NET_BYTEORDER flags.

For other kernel subsystems (netfilter match and target) the API contains
the functions to add, delete and test elements in sets and the required calls
to get/put refereces to the sets before those operations can be performed.

The set types (which are implemented in independent modules) are stored
in a simple RCU protected list. A set type may have variants: for example
without timeout or with timeout support, for IPv4 or for IPv6. The sets
(i.e. the pointers to the sets) are stored in an array. The sets are
identified by their index in the array, which makes possible easy and
fast swapping of sets. The array is protected indirectly by the nfnl
mutex from nfnetlink. The content of the sets are protected by the rwlock
of the set.

There are functional differences between the add/del/test functions
for the kernel and userspace:

- kernel add/del/test: works on the current packet (i.e. one element)
- kernel test: may trigger an "add" operation in order to fill
out unspecified parts of the element from the packet (like MAC address)
- userspace add/del: works on the netlink message and thus possibly
on multiple elements from the IPSET_ATTR_ADT container attribute.
- userspace add: may trigger resizing of a set

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>