History log of /freebsd-current/sys/netpfil/pf/pf_ioctl.c
Revision Date Author Comments
# 30bad751 04-Jun-2024 Kristof Provost <kp@FreeBSD.org>

pf: convert DIOCGETTIMEOUT/DIOCSETTIMEOUT to netlink


# 4779b16f 04-Jun-2024 Kristof Provost <kp@FreeBSD.org>

pf: fix overly large copy in pf_rule_to_krule()

The timeout array in struct pf_rule has PFTM_OLD_MAX entries, the one in
struct pf_krule has PFTM_MAX entries (and PFTM_MAX > PFTM_OLD_MAX).
Use the smaller of the sizes when copying.

Reported by: CheriBSD
MFC after: 1 week
Event: Kitchener-Waterloo Hackathon 202406


# 9dbbe68b 30-May-2024 Kristof Provost <kp@FreeBSD.org>

pf: convert DIOCCLRSTATUS to netlink

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 706d465d 26-Feb-2024 Kristof Provost <kp@FreeBSD.org>

pf: convert kill/clear state to use netlink

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D44090


# 777a4702 12-Jan-2024 Kristof Provost <kp@FreeBSD.org>

pf: implement addrule via netlink

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 9d784da3 30-Jan-2024 Igor Ostapenko <pm@igoro.pro>

pf: uncomment counter asserts after mem leak fix

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D43657


# 04932601 07-Dec-2023 Kristof Provost <kp@FreeBSD.org>

pf: store state creation/expiration timestamps with milisecond precision

The primary beneficiary is pflow(4), which expects milisecond precision
in timestamps.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43112


# baf9b6d0 01-Dec-2023 Kristof Provost <kp@FreeBSD.org>

pf: allow pflow to be activated per rule

Only generate ipfix/netflow reports (through pflow) for the rules where
this is enabled. Reports can also be enabled globally through 'set
state-default pflow'.

Obtained from: OpenBSD
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43108


# f92d9b1a 28-Nov-2023 Kristof Provost <kp@FreeBSD.org>

pflow: import from OpenBSD

pflow is a pseudo device to export flow accounting data over UDP.
It's compatible with netflow version 5 and IPFIX (10).

The data is extracted from the pf state table. States are exported once
they are removed.

Reviewed by: melifaro
Obtained from: OpenBSD
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43106


# 0626d30e 29-Nov-2023 Igor Ostapenko <pm@igoro.pro>

pf: fix mem leaks upon vnet destroy

Add missing cleanup actions:
- remove user defined anchor rulesets
- remove user defined ether anchor rulesets
- remove tables linked to user defined anchors
- deal with wildcard anchor peculiarities to get them removed correctly

PR: 274310
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42747


# 44f323ec 24-Nov-2023 Kristof Provost <kp@FreeBSD.org>

pf: implement DIOCGETRULES via netlink

Sponsored by: Rubicon Communications, LLC ("Netgate")


# a6173e94 06-Nov-2023 Kristof Provost <kp@FreeBSD.org>

pf: expose more syncookie state information to userspace

Allow userspace to retrieve low and high water marks, as well as the
current number of half open states.

MFC after: 1 week
Sponsored by: Modirum MDPay


# ca9dbde8 27-Oct-2023 Kristof Provost <kp@FreeBSD.org>

pf: support SCTP-specific timeouts

Allow SCTP state timeouts to be configured independently from TCP state
timeouts.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D42393


# 4f337550 19-Oct-2023 Kristof Provost <kp@FreeBSD.org>

pf: allow states to be killed by their pre-NAT address

If a connection is NAT-ed we could previously only terminate it by its
ID or the post-NAT IP address. Allow users to specify they want look for
the state by its pre-NAT address. Usage: `pfctl -k nat -k <address>`.

See also: https://redmine.pfsense.org/issues/11556
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D42312


# ffbf2595 14-Oct-2023 Kristof Provost <kp@FreeBSD.org>

pf: convert rule addition to netlink

The nvlist-based version will be removed in FreeBSD 16.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D42279


# 81647eb6 10-Oct-2023 Kristof Provost <kp@FreeBSD.org>

pf: implement start/stop calls via netlink

Implement equivalents to DIOCSTART and DIOCSTOP in netlink. Provide a
libpfctl implementation and add a basic test case, mostly to verify that
we still return the same errors as before the conversion

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D42145


# ebfd3b22 06-Oct-2023 Kristof Provost <kp@FreeBSD.org>

pf: move DIOCGETSTATES(V2) to COMPAT_FREEBSD14

We now have an improved version (via netlink). The old-style ioctl will
be removed in FreeBSD 16.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D42101


# 2cef6288 14-Sep-2023 Alexander V. Chernikov <melifaro@FreeBSD.org>

pf: convert state retrieval to netlink

Use netlink to export pf's state table.

The primary motivation is to improve how we deal with very large state
stables. With the previous implementation we had to build the entire
list (both in the kernel and in userspace) before we could start
processing. With netlink we start to get data in userspace while the
kernel is still generating more. This reduces peak memory consumption
(which can get to the GB range once we hit millions of states).

Netlink also makes future extension easier, in that we can easily add
fields to the state export without breaking userspace. In that regard
it's similar to an nvlist-based approach, except that it also deals
with transport to userspace and that it performs significantly better
than nvlists. Testing has failed to measure a performance difference
between the previous struct-copy based ioctl and the netlink approach.

Differential Revision: https://reviews.freebsd.org/D38888


# c531c1d1 22-Sep-2023 Zhenlei Huang <zlei@FreeBSD.org>

pf: Convert PF_DEFAULT_TO_DROP into a vnet loader tunable 'net.pf.default_to_drop'

7f7ef494f11d introduced a compile time option PF_DEFAULT_TO_DROP to make
the pf(4) default rule to drop. While this change exposes a vnet loader
tunable 'net.pf.default_to_drop' so that users can change the default
rule without re-compiling the pf(4) module.

This change is similiar to that for IPFW [1].

1. 5f17ebf94db5 Convert IPFW_DEFAULT_TO_ACCEPT into a loader tunable 'net.inet.ip.fw.default_to_accept'

Reviewed by: #network, kp
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D39866


# b2a48c3c 01-Sep-2023 Mateusz Guzik <mjg@FreeBSD.org>

pf: retire pf_krule_to_rule and pf_kpool_to_pool

Discussed with: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 8d49fd73 29-Aug-2023 Kristof Provost <kp@FreeBSD.org>

pf: remove DIOCGETRULE and DIOCGETSTATUS

These calls have nvlist variants that completely supersede them.
Remove the old code.

Reviewed by: mjg
MFC after: never
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D41651


# 2e8edbc2 28-Aug-2023 Kristof Provost <kp@FreeBSD.org>

pf: Remove DIOCCLRSTATES and DIOCKILLSTATES

These now have nvlist based alternatives, so remove them.

Reviewed by: mjg, Pau Amma <pauamma@gundo.com> (man page)
MFC after: never
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30056


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 6b4ed16d 12-Jul-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pf: Simplify rule actions logic

Actions applied to a processed packet come in case of stateless
firewalling from a rule or in case of statefull firewalling from a
state. The state obtains the actions from a rule when it is created by a
rule or by pfsync. The logic for deciding if actions come from a rule or
a state is spread across many places in pf.

There already is struct pf_rule_actions in struct pf_pdesc and thus it
can be used as a central place for storing actions and their parameters.
OpenBSD does something similar: they also store the actions in struct
pf_pdesc and have no variables in pf_test() but they use separate
variables instead of a structure. By using struct pf_rule_actions we can
simplify the code even further. Applying of actions is done *only* in
pf_rule_to_actions() no matter if for the legacy scrub rules or for the
normal match / pass rules. The logic of choosing if rule or state
actions are used is applied only once in pf_test() by copying the whole
struct.

Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D41009


# 3a1f834b 20-Jun-2023 Doug Rabson <dfr@FreeBSD.org>

pf: Add code to enable filtering for locally delivered packets

This is disabled by default since it potentially changes the behavior of
existing filter rule sets. To enable this extra filter for packets being
delivered locally, use:

sysctl net.pf.filter_local=1
service pf restart

PR: 268717
Reviewed-by: kp
MFC-after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D40373


# ba94bf28 15-Jun-2023 Kristof Provost <kp@FreeBSD.org>

pf: extend use of skip steps for Ethernet rules

Use the already populated PFE_SKIP_DST_ADDR and extend the skip
infrastructure to also skip on IP source/destination addresses.

This should make evaluating the rules slightly faster.

Reported by: R. Christian McDonald <rcm@rcm.sh>
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D40567


# 9925aee0 30-May-2023 Kristof Provost <kp@FreeBSD.org>

pf: carry over rule actions from route-to rules

If we route-to (or dup-to/reply-to) we re-run pf_test(), which will also
create states for the connection.
This means that we may end up matching a different (i.e. not the state
that was created by the route-to rule) state, without the attributes
(such as dummynet pipes/queues) set by the route-to rule.

Address this by inheriting the pf_rule_actions from the route-to rule
while evaluating the connection again in pf_test(). That is, we set
default pf_rule_actions based on the route-to rule for the new
evaluation. The new rule may still overrule these, but if it does not
have such actions the route-to actions are applied.

Do the same for IPv6 rules in pf_test6()/pf_route6().

See also: https://redmine.pfsense.org/issues/14039
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D40340


# c45d6b0e 29-May-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pfctl: Add missing state parameters in DIOCGETSTATESV2

Reviewed by: kp
Sponsored by: InnoGames GmbH
Different Revision: https://reviews.freebsd.org/D40259


# 4bf98559 29-May-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pf: make contents of struct pfsync_state configurable

Make struct pfsync_state contents configurable by sending out new
versions of the structure in separate subheader actions. Both old and
new version of struct pfsync_state can be understood, so replication of
states from a system running an older kernel is possible. The version
being sent out is configured using ifconfig pfsync0 … version XXXX. The
version is an user-friendly string - 1301 stands for FreeBSD 13.1 (I
have checked synchronization against a host running 13.1), 1400 stands
for 14.0.

A host running an older kernel will just ignore the messages and count
them as "packets discarded for bad action".

Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D39392


# c4a32455 19-May-2023 Kristof Provost <kp@FreeBSD.org>

pf: remove the use of caddr_t

Replace caddr_t with void *, or more accurate types.

Suggested by: glebius
Reviewed by: zlei
Differential Revision: https://reviews.freebsd.org/D40186


# 7626863e 29-Mar-2023 Tom Hukins <tom@FreeBSD.org>

pf: Fix a spelling mistake in a comment

Pull Request: https://github.com/freebsd/freebsd-src/pull/704


# 2e6cdfe2 18-Apr-2023 Kristof Provost <kp@FreeBSD.org>

pf: change pf_rules_lock and pf_ioctl_lock to per-vnet locks

Both pf_rules_lock and pf_ioctl_lock only ever affect one vnet, so
there's no point in having these locks affect other vnets.
(In fact, the only lock in pf that can affect multiple vnets is
pf_end_lock.)

That's especially important for the rules lock, because taking the write
lock suspends all network traffic until it's released. This will reduce
the impact a vnet running pf can have on other vnets, and improve
concurrency on machines running multiple pf-enabled vnets.

Reviewed by: zlei
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D39658


# 39282ef3 13-Apr-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pf: backport OpenBSD syntax of "scrub" option for "match" and "pass" rules

Introduce the OpenBSD syntax of "scrub" option for "match" and "pass"
rules and the "set reassemble" flag. The patch is backward-compatible,
pf.conf can be still written in FreeBSD-style.

Obtained from: OpenBSD
MFC after: never
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D38025


# caf32b26 14-Feb-2023 Gleb Smirnoff <glebius@FreeBSD.org>

pfil: add pfil_mem_{in,out}() and retire pfil_run_hooks()

The 0b70e3e78b0 changed the original design of a single entry point
into pfil(9) chains providing separate functions for the filtering
points that always provide mbufs and know the direction of a flow.
The motivation was to reduce branching. The logical continuation
would be to do the same for the filtering points that always provide
a memory pointer and retire the single entry point.

o Hooks now provide two functions: one for mbufs and optional for
memory pointers.
o pfil_hook_args() has a new member and pfil_add_hook() has a
requirement to zero out uninitialized data. Bump PFIL_VERSION.
o As it was before, a hook function for a memory pointer may realloc
into an mbuf. Such mbuf would be returned via a pointer that must
be provided in argument.
o The only hook that supports memory pointers is ipfw:default-link.
It is rewritten to provide two functions.
o All remaining uses of pfil_run_hooks() are converted to
pfil_mem_in().
o Transparent union of pfil_packet_t and tricks to fix pointer
alignment are retired. Internal pfil_realloc() reduces down to
m_devget() and thus is retired, too.

Reviewed by: mjg, ocochard
Differential revision: https://reviews.freebsd.org/D37977


# 3d0d5b21 23-Jan-2023 Justin Hibbits <jhibbits@FreeBSD.org>

IfAPI: Explicitly include <net/if_private.h> in netstack

Summary:
In preparation of making if_t completely opaque outside of the netstack,
explicitly include the header. <net/if_var.h> will stop including the
header in the future.

Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius, melifaro
Differential Revision: https://reviews.freebsd.org/D38200


# 933be8d7 31-Dec-2022 Kristof Provost <kp@FreeBSD.org>

pf: default syncookies to adaptive mode

The cost of enabling syncookies in adaptive mode is very low (basically
a single atomic add when we create a new half-open state), and the
payoff when under SYN flood is huge.

So, enable adaptive mode by default.

Suggested by: Eirik Øverby


# 57cc96f4 14-Dec-2022 Mark Johnston <markj@FreeBSD.org>

pf: Fix definitions of pf_pfil_*_hooked

This use of "volatile" in the vnet definitions doesn't have any effect.
VNET_DEFINE_STATE(volatile int, ...) should work, but let's avoid using
"volatile" altogether and convert to atomic_load/atomic_store. Also
convert to bool while here.

Reviewed by: kp, mjg
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37684


# 8a8af942 22-Sep-2022 Kristof Provost <kp@FreeBSD.org>

pf: bridge-to

Allow pf (l2) to be used to redirect ethernet packets to a different
interface.

The intended use case is to send 802.1x challenges out to a side
interface, to enable AT&T links to function with pfSense as a gateway,
rather than the AT&T provided hardware.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D37193


# 444a77ca 24-Sep-2022 Kristof Provost <kp@FreeBSD.org>

pf: expose syncookie active/inactive status

When syncookies are in adaptive mode they may be active or inactive.
Expose this status to users.

Suggested by: Guido van Rooij
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 558ec54d 24-Oct-2022 Gordon Bergling <gbe@FreeBSD.org>

netpfil: Fix two typos in source code comments

- s/missmatch/mismatch/

MFC after: 3 days


# 133935d2 07-Oct-2022 Kristof Provost <kp@FreeBSD.org>

pf: atomically increment state ids

Rather than using a per-cpu state counter, and adding in the CPU id we
can atomically increment the number.
This has the advantage of removing the assumption that the CPU ID fits
in 8 bits.

Event: Aberdeen Hackathon 2022
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D36915


# 1d090028 29-Sep-2022 Kristof Provost <kp@FreeBSD.org>

pf: use time_to for timestamps

Use time_t rather than uint32_t to represent the timestamps. That means
we have 64 bits rather than 32 on all platforms except i386, avoiding
the Y2K38 issues on most platforms.

Reviewed by: Zhenlei Huang
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D36837


# 6ab80e72 18-Aug-2022 Kristof Provost <kp@FreeBSD.org>

pf: do not block new Ethernet rules for in-progress transactions

Make Ethernet rule addition behave just like L3 rules, in that we now
allow ongoing transaction to be interrupted, rather than rejecting a new
one.

The result of that is that we can no longer end up in a state where a
transaction failed, but was not rolled back, blocking us from setting
new rules.

It's safe to assume there's no pending epoch callback for cleanup here,
because we've explicitly called it before hitting pf_begin_eth().

Sponsored by: Rubicon Communications, LLC ("Netgate")


# c780d3ad 18-Aug-2022 Kristof Provost <kp@FreeBSD.org>

pf: clear ethernet rules prior to shutdown

Ethernet rule cleanup is postponed to an epoch callback. Ensure it's
been called before we remove the entire vnet, or we risk the rules still
getting hit after we've freed the uma zone, i.e. a use-after-free.

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 150486f6 29-Jul-2022 Zhenlei Huang <zlei.huang@gmail.com>

Introduce and use the NET_EPOCH_DRAIN_CALLBACKS() macro

Reviewed by: melifao, kp
Differential Revision: https://reviews.freebsd.org/D35968


# bc83b359 30-Jun-2022 Mark Johnston <markj@FreeBSD.org>

pf: Ensure that pfiio_name is always nul terminated

Reported by: syzkaller
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35660


# 6f16d78c 28-Jun-2022 Kristof Provost <kp@FreeBSD.org>

pf: add missing maximum length check for DIOCADDETHRULE

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 1f61367f 31-May-2022 Kristof Provost <kp@FreeBSD.org>

pf: support matching on tags for Ethernet rules

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D35362


# a37e0e6d 02-Jun-2022 Franco Fichtner <franco@opnsense.org>

pf: fix more syncookie memory leaks

Allocate memory for packed nvlists in M_NVLIST, as nvlist_pack() does
this as well, and we use the same variable interchangable with the
memory we allocate. When we free it we can end up freeing from the wrong
zone, leaking memory.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D35385


# a3d97408 27-May-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: make sure the rule tree is allocated in DIOCCHANGERULE

Original patch by: peter
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 386b1a03 17-May-2022 Kristof Provost <kp@FreeBSD.org>

pf: allocate krule->timestamp in pf_krule_alloc()

There are three calls which can allocate a new rule. DIOCADDRULE,
DIOCADDRULENV and DIOCCHANGERULE. The first two call pf_ioctl_addrule(),
but DIOCCHANGERULE does not. As a result rules created through
DIOCCHANGERULE do not have the timestamp per-cpu memory allocated, and
we panic when the rule is exported with pf_krule_to_nvrule().

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 826c58d6 10-May-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: add missing unlock on error in DIOCCHANGERULE

Fixes: ff80dd034a8ca732
Sponsored by: Rubicon Communications, LLC ("Netgate")


# ff80dd03 04-May-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: fix DIOCCHANGERULE after pf config and rb tree of rules

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 0abcc1d2 22-Apr-2022 Reid Linnemann <rlinnemann@netgate.com>

pf: Add per-rule timestamps for rule and eth_rule

Similar to ipfw rule timestamps, these timestamps internally are
uint32_t snaps of the system time in seconds. The timestamp is CPU local
and updated each time a rule or a state associated with a rule or state
is matched.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D34970


# 812839e5 12-Apr-2022 Kristof Provost <kp@FreeBSD.org>

pf: allow the use of tables in ethernet rules

Allow tables to be used for the l3 source/destination matching.
This requires taking the PF_RULES read lock.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D34917


# ba035a31 13-Apr-2022 John Baldwin <jhb@FreeBSD.org>

pf: Use __diagused for variables only used in KASSERT().


# 4496aecb 06-Apr-2022 Kristof Provost <kp@FreeBSD.org>

pf: drain Ethernet rules cleanup before starting a new transaction

Inactive Ethernet rules get cleaned by a net_epoch callback. This
callback may still be pending when we try to start a new (pf rules)
transaction, causing it to fail.
This is especially likely to occur in scripted scenarios, such as the
regression tests.

Drain the epoch callbacks before starting a new transaction, ensuring
we've had the opportunity to clean up the inactive rules.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D34846


# 0bd468ea 07-Apr-2022 Kristof Provost <kp@FreeBSD.org>

pf: fix memory leak

The nvlist is allocated in pf_keth_rule_to_nveth_rule(). There's no need
to allocate one in the calling function. Especially not as we overwrite
the pointer to the new nvlist with the one allocated by
pf_keth_rule_to_nveth_rule(), leaking memory.

Reported by: Coverity
CID: 1476128
Sponsored by: Rubicon Communications, LLC ("Netgate")


# bef71045 06-Apr-2022 Kristof Provost <kp@FreeBSD.org>

pf: use ERROUT_IOCTL()

Use ERROUT_IOCTL() rather than hand-rolling the macro. This adds DTrace
SDTs in the error path, making debugging ioctl errors easier.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")


# c4a08ef2 01-Apr-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: handle duplicate rules gracefully

Reviewed by: kp
Reported by: dch
PR: 262971
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 81cac0d2 29-Mar-2022 Kristof Provost <kp@FreeBSD.org>

pf: add missing input/error validation for DIOCGETETHRULE

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 9bb06778 29-Mar-2022 Kristof Provost <kp@FreeBSD.org>

pf: support listing ethernet anchors

Sponsored by: Rubicon Communications, LLC ("Netgate")


# e123e229 29-Mar-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: guard against DIOCADDRULE without DIOCXBEGIN

Possibility to do it was always a bug, but it runs into crashes
since recent introduction of a per-ruleset RB tree.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Reported by: syzbot+665b700afc6f69f1766a@syzkaller.appspotmail.com


# bd7762c8 28-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: add a rule rb tree

with md5 sum used as key.

This gets rid of the quadratic rule traversal when "keep_counters" is
set.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 1a3e98a5 25-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: pre-compute rule hash

Makes it cheaper to compare rules when "keep_counters" is set.
This also sets up keeping them in a RB tree.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 93f8c38c 25-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: add pf_config_lock

For now only protects rule creation/destruction, but will allow
gradually reducing the scope of rules lock when changing the
rules.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 2f968abc 21-Mar-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: include anchor path when hashing a rule

Otherwise all anchors hash to the same value.

Note this can result in checksum mismatches between pfsynced hosts,
but it has to be sorted out as the previously computed checksum
would fail to indicate changed anchors.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# b163dcab 28-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: hoist the unlinked rules lock out of the mass rule removal loop

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 15ada751 25-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: remove spurious zeroing from pf_ioctl_addrule

Newly allocated counters are guaranteed to be 0.

This removes 5 IPIs for each loaded rule.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# c5131afe 01-Oct-2021 Kristof Provost <kp@FreeBSD.org>

pf: add anchor support for ether rules

Support anchors in ether rules.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32482


# 30087aa2 17-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Support clearing ether counters

Allow the evaluations/packets/bytes counters on Ethernet rules to be
cleared.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31748


# 6b7c2680 16-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Only hook the Ethernet pfil hook when we have rules

Avoid the overhead of the Ethernet pfil hooks if we don't have any
Ethernet rules.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31742


# 20c4899a 10-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Do not hold PF_RULES_RLOCK while processing Ethernet rules

Avoid the overhead of acquiring a (read) RULES lock when processing the
Ethernet rules.
We can get away with that because when rules are modified they're staged
in V_pf_keth_inactive. We take care to ensure the swap to V_pf_keth is
atomic, so that pf_test_eth_rule() always sees either the old rules, or
the new ruleset.

We need to take care not to delete the old ruleset until we're sure no
pf_test_eth_rule() is still running with those. We accomplish that by
using NET_EPOCH_CALL() to actually free the old rules.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31739


# e732e742 03-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Initial Ethernet level filtering code

This is the kernel side of stateless Ethernel level filtering for pf.

The primary use case for this is to enable captive portal functionality
to allow/deny access by MAC address, rather than per IP address.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31737


# 773e3a71 31-Jan-2022 Mark Johnston <markj@FreeBSD.org>

pf: Initialize pf_kpool mutexes earlier

There are some error paths in ioctl handlers that will call
pf_krule_free() before the rule's rpool.mtx field is initialized,
causing a panic with INVARIANTS enabled.

Fix the problem by introducing pf_krule_alloc() and initializing the
mutex there. This does mean that the rule->krule and pool->kpool
conversion functions need to stop zeroing the input structure, but I
don't see a nicer way to handle this except perhaps by guarding the
mtx_destroy() with a mtx_initialized() check.

Constify some related functions while here and add a regression test
based on a syzkaller reproducer.

Reported by: syzbot+77cd12872691d219c158@syzkaller.appspotmail.com
Reviewed by: kp
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34115


# e5ca5e80 16-Jan-2022 Kristof Provost <kp@FreeBSD.org>

pf: ensure we don't destroy an uninitialised lock

The new lock introduced in 5f5e32f1b3 needs to be initialised early so
that it can be safely destroyed if we error out.

Reported-by: syzbot+d76113e9a4ae0c0fcac2@syzkaller.appspotmail.com
MFC after: 3 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 5f5e32f1 10-Jan-2022 Kristof Provost <kp@FreeBSD.org>

pf: protect the rpool from races

The roundrobin pool stores its state in the rule, which could
potentially lead to invalid addresses being returned.

For example, thread A just executed PF_AINC(&rpool->counter) and
immediately afterwards thread B executes PF_ACPY(naddr, &rpool->counter)
(i.e. after the pf_match_addr() check of rpool->counter).

Lock the rpool with its own mutex to prevent these races. The
performance impact of this is expected to be low, as each rule has its
own lock, and the lock is also only relevant when state is being created
(so only for the initial packets of a connection, not for all traffic).

See also: https://redmine.pfsense.org/issues/12660
Reviewed by: glebius
MFC after: 3 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33874


# 8e492101 15-Nov-2021 Kristof Provost <kp@FreeBSD.org>

pf: add COMPAT_FREEBSD13 for DIOCKEEPCOUNTERS

DIOCKEEPCOUNTERS used to overlap with DIOCGIFSPEEDV0, which has been
fixed in 14, but remains in stable/12 and stable/13.
Support the old, overlapping, call under COMPAT_FREEBSD13.

Reviewed by: jhb
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33001


# 8f3d786c 01-Nov-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: remove the flags argument from pf_unlink_state

All consumers call it with PF_ENTER_LOCKED.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# edf6dd82 01-Nov-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: fix use-after-free from pf_find_state_all

state was returned without any locks nor references held

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# bcd4c17c 19-Oct-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: fix some cc --analyze warnings

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 1c680e62 08-Oct-2021 Kristof Provost <kp@FreeBSD.org>

pf: do not copy anchor_wildcard / anchor_relative from userspace

We overwrite these fields again in pf_kanchor_setup() anyway.

MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")


# cb130596 23-Sep-2021 Kristof Provost <kp@FreeBSD.org>

pf: fix pagefault in pf_getstatus()

We can't copyout() while holding a lock, in case it triggers a page
fault.
Release the lock before copyout, which is safe because we've already
copied all the data into the nvlist.

PR: 258601
Reviewed by: mjg
MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D32076


# df005aa9 15-Sep-2021 John Baldwin <jhb@FreeBSD.org>

pf: Remove duplicate declaration of pf_ioctl_maxcount.

Fixes a -Wredundant-decls warning with GCC 9.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D31944


# b64f7ce9 07-Sep-2021 Kristof Provost <kp@FreeBSD.org>

pf: qid and pqid can be uint16_t

tag2name() returns a uint16_t, so we don't need to use uint32_t for the
qid (or pqid). This reduces the size of struct pf_kstate slightly. That
in turn buys us space to add extra fields for dummynet later.

Happily these fields are not exposed to user space (there are user space
versions of them, but they can just stay uint32_t), so there's no ABI
breakage in modifying this.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31873


# 4cab80a8 29-Aug-2021 Kristof Provost <kp@FreeBSD.org>

pf: Add counters for syncookies

Count when we send a syncookie, receive a valid syncookie or detect a
synflood.

Reviewed by: kbowling
MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D31713


# 2b10cf85 16-Aug-2021 Kristof Provost <kp@FreeBSD.org>

pf: Introduce nvlist variant of DIOCGETSTATUS

Make it possible to extend the GETSTATUS call (e.g. when we want to add
new counters, such as for syncookie support) by introducing an
nvlist-based alternative.

MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D31694


# 600745f1 02-Aug-2021 Kristof Provost <kp@FreeBSD.org>

pf: bound DIOCGETSTATES memory use

Similar to what we did earlier for DIOCGETSTATESV2 we only allocate
enough memory for a handful of states and copy those out, bit by bit,
rather than allocating memory for all states in one go.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")


# b69019c1 06-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: remove DIOCGETSTATESNV

While nvlists are very useful in maximising flexibility for future
extensions their performance is simply unacceptably bad for the
getstates feature, where we can easily want to export a million states
or more.

The DIOCGETSTATESNV call has been MFCd, but has not hit a release on any
branch, so we can still remove it everywhere.

Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31099


# 64432ad2 28-Jul-2021 Mark Johnston <markj@FreeBSD.org>

pf: Validate user string nul-termination before copying

Some pf ioctl handlers use strlcpy() to copy strings when converting
from user structures to their in-kernel representations. strlcpy()
ensures that the destination will be nul-terminated, but it assumes that
the source is nul-terminated. In particular, it returns the full length
of the source string, so if the source is not nul-terminated, strlcpy()
will keep scanning until it finds a nul byte, and it may encounter an
unmapped page first. Add a helper to validate user strings before
copying.

There are also places where we look up a ruleset using a user-provided
anchor string. In some ioctl handlers we were already nul-terminating
the string, avoiding the same problem, but in other places we were not.
Fix those by nul-terminating as well. Aside from being consistent,
anchors have a maximum length of MAXPATHLEN - 1 so calling strnlen()
might not be so desirable.

Reported by: syzbot+35a1549b4663e9483dd1@syzkaller.appspotmail.com
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31169


# 2b82c57e 28-Jul-2021 Mark Johnston <markj@FreeBSD.org>

pf: Initialize arrays before copying out to userland

A number of pf ioctls populate an array of structures and copy it out.
They have the following structures:
- caller specifies the size of its output buffer
- ioctl handler allocates a kernel buffer of the same size
- ioctl handler populates the buffer, possibly leaving some items
initialized if the caller provided more space than needed
- ioctl handler copies the entire buffer out to userland

Thus, if more space was provided than is required, we end up copying out
uninitialized kernel memory. Simply zero the buffer at allocation time
to prevent this.

Reported by: KMSAN
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31313


# d2dc4548 25-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: remove duplicate ERROUT_FUNCTION definition

Sponsored by: Modirum MDPay


# 87c010e6 24-Jul-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: batch critical section for several counters

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 02cf67cc 22-Jul-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: switch rule counters to pf_counter_u64

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# d40d4b3e 22-Jul-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: switch kif counters to pf_counter_u64

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# fc4c42ce 23-Jul-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: switch pf_status.fcounters to pf_counter_u64

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 49a7d472 23-Jul-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: consistently malloc rules with M_ZERO

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 32271c4d 20-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: clean up syncookie callout on vnet shutdown

Ensure that we cancel any outstanding callouts for syncookies when we
terminate the vnet.

MFC after: 1 week
Sponsored by: Modirum MDPay


# 231e83d3 26-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: syncookie ioctl interface

Kernel side implementation to allow switching between on and off modes,
and allow this configuration to be retrieved.

MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D31139


# 8e1864ed 20-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: syncookie support

Import OpenBSD's syncookie support for pf. This feature help pf resist
TCP SYN floods by only creating states once the remote host completes
the TCP handshake rather than when the initial SYN packet is received.

This is accomplished by using the initial sequence numbers to encode a
cookie (hence the name) in the SYN+ACK response and verifying this on
receipt of the client ACK.

Reviewed by: kbowling
Obtained from: OpenBSD
MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D31138


# 81f95106 14-Jul-2021 Mark Johnston <markj@FreeBSD.org>

pf: Constify tag name and queue name helper functions

No functional change intended.

Reviewed by: kp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31168


# 3fc12ae0 08-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: bound DIOCGETSTATESV2 memory use

Rather than allocating however much memory userspace asks for we only
allocate enough for a handful of states, and copy to userspace for each
completed row.
We start out with enough space for 16 states (per row), but grow that as
required. In most configurations we expect at most a handful of states
per row (more than that would have other negative effects on packet
processing performance).

Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31111


# c6bf20a2 05-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: add DIOCGETSTATESV2

Add a new version of the DIOCGETSTATES call, which extends the struct to
include the original interface information.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31097


# 34641052 08-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: pf_killstates() never fails, so remove the return value

Suggested by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")


# fa96701c 05-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: Handle errors returned by pf_killstates()

Happily this wasn't a real bug, because pf_killstates() never fails, but
we should check the return value anyway, in case it does ever start
returning errors.

Reported by: clang --analyze
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 211cddf9 06-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: rename pf_state to pf_kstate

Indicate that this is a kernel-only structure, and make it easier to
distinguish from others used to communicate with userspace.

Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31096


# dc1ab04e 02-Jul-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: allow table stats clearing and reading with ruleset rlock

Instead serialize against these operations with a dedicated lock.

Prior to the change, When pushing 17 mln pps of traffic, calling
DIOCRGETTSTATS in a loop would restrict throughput to about 7 mln. With
the change there is no slowdown.

Reviewed by: kp (previous version)
Sponsored by: Rubicon Communications, LLC ("Netgate")


# a19ff8ce 29-Jun-2021 Kristof Provost <kp@FreeBSD.org>

pf: getstates: avoid taking the hashrow lock if the row is empty

Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30946


# 48d5b863 02-Jul-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: make DIOCGETSTATESNV iterations killable

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 55cc305d 28-Jun-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: revert: Use counter(9) for pf_state byte/packet tracking

stats are not shared and consequently per-CPU counters only waste
memory.

No slowdown was measured when passing over 20M pps.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 8b5f4e69 14-Jun-2021 Kristof Provost <kp@FreeBSD.org>

pf: don't hold a lock during copyout()

copyout() can trigger page faults, so it may potentially sleep.

Reported by: avg
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")


# ea21980a 09-Jun-2021 Kristof Provost <kp@FreeBSD.org>

pf: use M_WAITOK where possible

In the ioctl path use M_WAITOK allocations whereever possible. These are
less sensitive to memory pressure, and ioctl requests have no hard
deadlines.

Reviewed by: donner
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30702


# 0f86492b 01-Jun-2021 Kristof Provost <kp@FreeBSD.org>

pf: Fix more ioctl memory leaks

We must also remember to free nvlists added to a parent nvlist with
nvlist_append_nvlist_array().

More importantly, when nvlist_pack() allocates memory for us it does so
in the M_NVLIST zone, so we must free it with free(.., M_NVLIST). Using
free(.., M_TEMP) as we did silently failed to free the memory.

MFC after: 3 days
Reported by: kib@
Tested by: kib@
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30595


# ec7b47fc 31-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Move provider declaration to pf.h

This simplifies life a bit, by not requiring us to repease the
declaration for every file where we want static probe points.

It also makes the gcc6 build happy.


# 3032c353 18-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Move nvlist conversion functions to pf_nv

Separate the conversion functions (between kernel structs and nvlists)
to pf_nv. This reduces the size of pf_ioctl.c, which is already quite
large and complex, a good bit. It also keeps all the fairly
straightforward conversion code together.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30359


# 4483fb47 24-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: fix ioctl() memory leak

When we create an nvlist and insert it into another nvlist we must
remember to destroy it. The nvlist_add_nvlist() function makes a copy,
just like nvlist_add_string() makes a copy of the string. If we don't
we're leaking memory on every (nvlist-based) ioctl() call.

While here remove two redundant 'break' statements.

PR: 255971
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")


# b62489cc 13-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Support killing floating states by interface

Floating states get assigned to interface 'all' (V_pfi_all), so when we
try to flush all states for an interface states originally created
through this interface are not flushed. Only if-bound states can be
flushed in this way.

Given that we track the original interface we can check if the state's
interface is 'all', and if so compare to the orig_if instead.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30246


# d0fdf2b2 12-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Track the original kif for floating states

Track (and display) the interface that created a state, even if it's a
floating state (and thus uses virtual interface 'all').

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30245


# 0592a4c8 05-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Add DIOCGETSTATESNV

Add DIOCGETSTATESNV, an nvlist-based alternative to DIOCGETSTATES.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30243


# 1732afaa 05-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Add DIOCGETSTATENV

Add DIOCGETSTATENV, an nvlist-based alternative to DIOCGETSTATE.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30242


# 93abcf17 03-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Support killing 'matching' states

Optionally also kill states that match (i.e. are the NATed state or
opposite direction state entry for) the state we're killing.

See also https://redmine.pfsense.org/issues/8555

Submitted by: Steven Brown
Reviewed by: bcr (man page)
Obtained from: https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30092


# abbcba9c 30-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Allow states to by killed per 'gateway'

This allows us to kill states created from a rule with route-to/reply-to
set. This is particularly useful in multi-wan setups, where one of the
WAN links goes down.

Submitted by: Steven Brown
Obtained from: https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30058


# e989530a 29-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Introduce DIOCKILLSTATESNV

Introduce an nvlist based alternative to DIOCKILLSTATES.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30054


# 7606a45d 29-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Introduce DIOCCLRSTATESNV

Introduce an nvlist variant of DIOCCLRSTATES.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30052


# 6b146f3b 20-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Error tracing SDTs

Add additional DTrace static trace points to facilitate debugging
failing pf ioctl calls.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 402dfb0a 24-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Fix parsing of long table names

When parsing the nvlist for a struct pf_addr_wrap we unconditionally
tried to parse "ifname". This broke for PF_ADDR_TABLE when the table
name was longer than IFNAMSIZ. PF_TABLE_NAME_SIZE is longer than
IFNAMSIZ, so this is a valid configuration.

Only parse (or return) ifname or tblname for the corresponding
pf_addr_wrap type.

This manifested as a failure to set rules such as these, where the pfctl
optimiser generated an automatic table:

pass in proto tcp to 192.168.0.1 port ssh
pass in proto tcp to 192.168.0.2 port ssh
pass in proto tcp to 192.168.0.3 port ssh
pass in proto tcp to 192.168.0.4 port ssh
pass in proto tcp to 192.168.0.5 port ssh
pass in proto tcp to 192.168.0.6 port ssh
pass in proto tcp to 192.168.0.7 port ssh

Reported by: Florian Smeets
Tested by: Florian Smeets
Reviewed by: donner
X-MFC-With: 5c11c5a3655842a176124ef2334fcdf830422c8a
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29962


# 6fcc8e04 20-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Allow multiple labels to be set on a rule

Allow up to 5 labels to be set on each rule.
This offers more flexibility in using labels. For example, it replaces
the customer 'schedule' keyword used by pfSense to terminate states
according to a schedule.

Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29936


# 586aab9e 16-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Refactor state killing

Extract the state killing code from pfioctl() and rephrase the filtering
conditions for readability.

No functional change intended.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29795


# 42ec75f8 15-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Optionally attempt to preserve rule counter values across ruleset updates

Usually rule counters are reset to zero on every update of the ruleset.
With keepcounters set pf will attempt to find matching rules between old
and new rulesets and preserve the rule counters.

MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29780


# 4f1f67e8 15-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: PFRULE_REFS should not be user-visible

Split the PFRULE_REFS flag from the rule_flag field. PFRULE_REFS is a
kernel-internal flag and should not be exposed to or read from
userspace.

MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29778


# 2aa21096 13-Apr-2021 Kurosawa Takahiro <takahiro.kurosawa@gmail.com>

pf: Implement the NAT source port selection of MAP-E Customer Edge

MAP-E (RFC 7597) requires special care for selecting source ports
in NAT operation on the Customer Edge because a part of bits of the port
numbers are used by the Border Relay to distinguish another side of the
IPv4-over-IPv6 tunnel.

PR: 254577
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D29468


# 5e98cae6 12-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Ensure that we don't use kif passed to pfi_kkif_attach()

Once a kif is passed to pfi_kkif_attach() we must ensure we never re-use
it for anything else.
Set the kif to NULL afterwards to guarantee this.

Reported-by: syzbot+be5d4f4a7a4c295e659a@syzkaller.appspotmail.com
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")


# d710367d 25-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: Implement nvlist variant of DIOCGETRULE

MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29559


# 5c62eded 11-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: Introduce nvlist variant of DIOCADDRULE

This will make future extensions of the API much easier.
The intent is to remove support for DIOCADDRULE in FreeBSD 14.

Reviewed by: markj (previous version), glebius (previous version)
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29557


# 4967f672 08-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pf: Remove unused variable rt_listid from struct pf_krule

Reviewed by: donner
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29639


# 15b82e00 11-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: pool/kpool conversion code

stuct pf_pool and struct pf_kpool are different. We should not simply
bcopy() them.

Happily it turns out that their differences were all pointers, and the
userspace provided pointers were overwritten by the kernel, so this did
actually work correctly, but we should fix it anyway.

Reviewed by: glebius
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29216


# cecfaf9b 10-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: Fully remove interrupt events on vnet cleanup

swi_remove() removes the software interrupt handler but does not remove
the associated interrupt event.
This is visible when creating and remove a vnet jail in `procstat -t
12`.

We can remove it manually with intr_event_destroy().

PR: 254171
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29211


# 913e7dc3 10-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: Remove redundant kif != NULL checks

pf_kkif_free() already checks for NULL, so we don't have to check before
we call it.

Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29195


# 5e9dae8e 10-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: Factor out pf_krule_free()

Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29194


# 2ed689a6 18-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Fix osfp configuration

pf_rule_to_krule() incorrectly converted the rule osfp configuration to
the krule structure.

Reported by: delphij@
MFC after: 3 days


# c4e0f7aa 17-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Assert that pfil_link() calls succeed

These should only fail if we use them incorrectly, so assert that they
succeed.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (“Netgate”’)


# 8a439f32 15-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Remove unused return value from (de)hook_pf()

These functions always return 0, which is good, because the code calling
them doesn't handle this error gracefully.

As the functions always succeed remove their return value, and the code
handling their errors (because it was never executed anyway).

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (“Netgate”’)


# 5e42cb13 13-Feb-2021 Kristof Provost <kp@FreeBSD.org>

pf: Slightly relax pf_rule_addr validation

Ensure we don't reject no-route / urpf-failed addresses.

PR: 253479
Reported by: michal AT microwave.sk
Revied by: donner@
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28650


# 7a808c5e 26-Jan-2021 Kristof Provost <kp@FreeBSD.org>

pf: Improve pf_rule input validation

Move the validation checks to pf_rule_to_krule() to reduce duplication.
This also makes the checks consistent across different ioctls.

Reported-by: syzbot+e9632d7ad17398f0bd8f@syzkaller.appspotmail.com
Reviewed by: tuexen@, donner@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28362


# ea36212b 13-Jan-2021 Kristof Provost <kp@FreeBSD.org>

pf: Don't hold PF_RULES_WLOCK during copyin() on DIOCRCLRTSTATS

We cannot hold a non-sleepable lock during copyin(). This means we can't
safely count the table, so instead we fall back to the pf_ioctl_maxcount
used in other ioctls to protect against overly large requests.

Reported by: syzbot+81e380344d4a6c37d78a@syzkaller.appspotmail.com
MFC after: 1 week


# 26c841e2 13-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Allocate and free pfi_kkif in separate functions

Factor out allocating and freeing pfi_kkif structures. This will be
useful when we change the counters to be counter_u64, so we don't have
to deal with that complexity in the multiple locations where we allocate
pfi_kkif structures.

No functional change.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27762


# 320c1116 12-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Split pfi_kif into a user and kernel space structure

No functional change.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27761


# c3adacda 05-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Change pf_krule counters to use counter_u64

This improves the cache behaviour of pf and results in improved
throughput.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27760


# e86bddea 05-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Split pf_rule into kernel and user space versions

No functional change intended.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27758


# fbbf270e 13-Nov-2020 Kristof Provost <kp@FreeBSD.org>

pf: Use counter_u64 in pf_src_node

Reviewd by: philip
MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27756


# 17ad7334 23-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Split pf_src_node into a kernel and userspace struct

Introduce a kernel version of struct pf_src_node (pf_ksrc_node).

This will allow us to improve the in-kernel data structure without
breaking userspace compatibility.

Reviewed by: philip
MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27707


# 1c00efe9 23-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Use counter(9) for pf_state byte/packet tracking

This improves cache behaviour by not writing to the same variable from
multiple cores simultaneously.

pf_state is only used in the kernel, so can be safely modified.

Reviewed by: Lutz Donnerhacke, philip
MFC after: 1 week
Sponsed by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27661


# 5d49283f 24-Nov-2020 Mark Johnston <markj@FreeBSD.org>

pf: Make tag hashing more robust

tagname2tag() hashes the tag name before truncating it to 63 characters.
tag_unref() removes the tag from the name hash by computing the hash
over the truncated name. Ensure that both operations compute the same
hash for a given tag.

The larger issue is a lack of string validation in pf(4) ioctl handlers.
This is intended to be fixed with some future work, but an extra safety
belt in tagname2hashindex() is worthwhile regardless.

Reported by: syzbot+a0988828aafb00de7d68@syzkaller.appspotmail.com
Reviewed by: kp
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27346


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# 1ef06ed8 03-May-2020 Kristof Provost <kp@FreeBSD.org>

pf: Improve DIOCADDRULE validation

We expect the addrwrap.p.dyn value to be set to NULL (and assert such),
but do not verify it on input.

Reported-by: syzbot+936a89182e7d8f927de1@syzkaller.appspotmail.com
Reviewed by: melifaro (previous version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24538


# a7c85336 26-Apr-2020 Kristof Provost <kp@FreeBSD.org>

pf: Improve input validation

If we pass an anchor name which doesn't exist pfr_table_count() returns
-1, which leads to an overflow in mallocarray() and thus a panic.

Explicitly check that pfr_table_count() does not return an error.

Reported-by: syzbot+bd09d55d897d63d5f4f4@syzkaller.appspotmail.com
Reviewed by: melifaro
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24539


# 98582ce3 19-Apr-2020 Kristof Provost <kp@FreeBSD.org>

pf: Improve ioctl() input validation

Both DIOCCHANGEADDR and DIOCADDADDR take a struct pf_pooladdr from
userspace. They failed to validate the dyn pointer contained in its
struct pf_addr_wrap member structure.

This triggered assertion failures under fuzz testing in
pfi_dynaddr_setup(). Happily the dyn variable was overruled there, but
we should verify that it's set to NULL anyway.

Reported-by: syzbot+93e93150bc29f9b4b85f@syzkaller.appspotmail.com
Reviewed by: emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24431


# 95324dc3 17-Apr-2020 Kristof Provost <kp@FreeBSD.org>

pf: Do not allow negative ps_len in DIOCGETSTATES

Userspace may pass a negative ps_len value to us, which causes an
assertion failure in malloc().
Treat negative values as zero, i.e. return the required size.

Reported-by: syzbot+53370d9d0358ee2a059a@syzkaller.appspotmail.com
Reviewed by: lutz at donnerhacke.de
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24447


# c54ee572 31-Jul-2019 Ed Maste <emaste@FreeBSD.org>

pf: zero (another) output buffer in pfioctl

Avoid potential structure padding leak. r350294 identified a leak via
static analysis; although there's no report of a leak with the
DIOCGETSRCNODES ioctl it's a good practice to zero the memory.

Suggested by: kp
MFC after: 3 days
Sponsored by: The FreeBSD Foundation


# 532bc586 24-Jul-2019 Ed Maste <emaste@FreeBSD.org>

pf: zero output buffer in pfioctl

Avoid potential structure padding leak.

Reported by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Reviewed by: kp
MFC after: 3 days
Security: Potential kernel memory disclosure
Sponsored by: The FreeBSD Foundation


# 1c75b9d2 18-Apr-2019 Kristof Provost <kp@FreeBSD.org>

pf: No need to M_NOWAIT in DIOCRSETTFLAGS

Now that we don't hold a lock during DIOCRSETTFLAGS memory allocation we can
use M_WAITOK.

MFC after: 1 week
Event: Aberdeen hackathon 2019
Pointed out by: glebius@


# f5e0d9fc 17-Apr-2019 Kristof Provost <kp@FreeBSD.org>

pf: Fix panic on invalid DIOCRSETTFLAGS

If during DIOCRSETTFLAGS pfrio_buffer is NULL copyin() will fault, which we're
not allowed to do with a lock held.
We must count the number of entries in the table and release the lock during
copyin(). Only then can we re-acquire the lock. Note that this is safe, because
pfr_set_tflags() will check if the table and entries exist.

This was discovered by a local syzcaller instance.

MFC after: 1 week
Event: Aberdeen hackathon 2019


# a342f577 26-Mar-2019 Ed Maste <emaste@FreeBSD.org>

pf: use UID_ROOT and GID_WHEEL named constants in make_dev

No functional change but improves consistency and greppability of
make_dev calls.

Discussed with: kp


# f8e7fe32 08-Mar-2019 Kristof Provost <kp@FreeBSD.org>

pf: Fix DIOCGETSRCNODES

r343295 broke DIOCGETSRCNODES by failing to reset 'nr' after counting the
number of source tracking nodes.
This meant that we never copied the information to userspace, leading to '? ->
?' output from pfctl.

PR: 236368
MFC after: 1 week


# d178fee6 10-Feb-2019 Patrick Kelsey <pkelsey@FreeBSD.org>

Place pf_altq_get_nth_active() under the ALTQ ifdef

MFC after: 1 week


# 8f2ac656 10-Feb-2019 Patrick Kelsey <pkelsey@FreeBSD.org>

Reduce the time it takes the kernel to install a new PF config containing a large number of queues

In general, the time savings come from separating the active and
inactive queues lists into separate interface and non-interface queue
lists, and changing the rule and queue tag management from list-based
to hash-bashed.

In HFSC, a linear scan of the class table during each queue destroy
was also eliminated.

There are now two new tunables to control the hash size used for each
tag set (default for each is 128):

net.pf.queue_tag_hashsize
net.pf.rule_tag_hashsize

Reviewed by: kp
MFC after: 1 week
Sponsored by: RG Nets
Differential Revision: https://reviews.freebsd.org/D19131


# d38ca329 01-Feb-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Return PFIL_CONSUMED if packet was consumed. While here gather all
the identical endings of pf_check_*() into single function.

PR: 235411


# b252313f 31-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

New pfil(9) KPI together with newborn pfil API and control utility.

The KPI have been reviewed and cleansed of features that were planned
back 20 years ago and never implemented. The pfil(9) internals have
been made opaque to protocols with only returned types and function
declarations exposed. The KPI is made more strict, but at the same time
more extensible, as kernel uses same command structures that userland
ioctl uses.

In nutshell [KA]PI is about declaring filtering points, declaring
filters and linking and unlinking them together.

New [KA]PI makes it possible to reconfigure pfil(9) configuration:
change order of hooks, rehook filter from one filtering point to a
different one, disconnect a hook on output leaving it on input only,
prepend/append a filter to existing list of filters.

Now it possible for a single packet filter to provide multiple rulesets
that may be linked to different points. Think of per-interface ACLs in
Cisco or Juniper. None of existing packet filters yet support that,
however limited usage is already possible, e.g. default ruleset can
be moved to single interface, as soon as interface would pride their
filtering points.

Another future feature is possiblity to create pfil heads, that provide
not an mbuf pointer but just a memory pointer with length. That would
allow filtering at very early stages of a packet lifecycle, e.g. when
packet has just been received by a NIC and no mbuf was yet allocated.

Differential Revision: https://reviews.freebsd.org/D18951


# 59099cd3 28-Jan-2019 Patrick Kelsey <pkelsey@FreeBSD.org>

Don't re-evaluate ALTQ kernel configuration due to events on non-ALTQ interfaces

Re-evaluating the ALTQ kernel configuration can be expensive,
particularly when there are a large number (hundreds or thousands) of
queues, and is wholly unnecessary in response to events on interfaces
that do not support ALTQ as such interfaces cannot be part of an ALTQ
configuration.

Reviewed by: kp
MFC after: 1 week
Sponsored by: RG Nets
Differential Revision: https://reviews.freebsd.org/D18918


# d9d146e6 24-Jan-2019 Kristof Provost <kp@FreeBSD.org>

pf: Fix use-after-free of counters

When cleaning up a vnet we free the counters in V_pf_default_rule and
V_pf_status from shutdown_pf(), but we can still use them later, for example
through pf_purge_expired_src_nodes().

Free them as the very last operation, as they rely on nothing else themselves.

PR: 235097
MFC after: 1 week


# 180b0dcb 21-Jan-2019 Kristof Provost <kp@FreeBSD.org>

pf: Validate psn_len in DIOCGETSRCNODES

psn_len is controlled by user space, but we allocated memory based on it.
Check how much memory we might need at most (i.e. how many source nodes we
have) and limit the allocation to that.

Reported by: markj
MFC after: 1 week


# fbbf436d 02-Nov-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Handle syncdev going away

If the syncdev is removed we no longer need to clean up the multicast
entry we've got set up for that device.

Pass the ifnet detach event through pf to pfsync, and remove our
multicast handle, and mark us as no longer having a syncdev.

Note that this callback is always installed, even if the pfsync
interface is disabled (and thus it's not a per-vnet callback pointer).

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17502


# 5f6cf24e 02-Nov-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Make pfsync callbacks per-vnet

The callbacks are installed and removed depending on the state of the
pfsync device, which is per-vnet. The callbacks must also be per-vnet.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17499


# 249cc75f 22-Aug-2018 Patrick Kelsey <pkelsey@FreeBSD.org>

Extended pf(4) ioctl interface and pfctl(8) to allow bandwidths of
2^32 bps or greater to be used. Prior to this, bandwidth parameters
would simply wrap at the 2^32 boundary. The computations in the HFSC
scheduler and token bucket regulator have been modified to operate
correctly up to at least 100 Gbps. No other algorithms have been
examined or modified for correct operation above 2^32 bps (some may
have existing computation resolution or overflow issues at rates below
that threshold). pfctl(8) will now limit non-HFSC bandwidth
parameters to 2^32 - 1 before passing them to the kernel.

The extensions to the pf(4) ioctl interface have been made in a
backwards-compatible way by versioning affected data structures,
supporting all versions in the kernel, and implementing macros that
will cause existing code that consumes that interface to use version 0
without source modifications. If version 0 consumers of the interface
are used against a new kernel that has had bandwidth parameters of
2^32 or greater configured by updated tools, such bandwidth parameters
will be reported as 2^32 - 1 bps by those old consumers.

All in-tree consumers of the pf(4) interface have been updated. To
update out-of-tree consumers to the latest version of the interface,
define PFIOC_USE_LATEST ahead of any includes and use the code of
pfctl(8) as a guide for the ioctls of interest.

PR: 211730
Reviewed by: jmallett, kp, loos
MFC after: 2 weeks
Relnotes: yes
Sponsored by: RG Nets
Differential Revision: https://reviews.freebsd.org/D16782


# 5f901c92 24-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147


# 3e603d1f 14-Jul-2018 Kristof Provost <kp@FreeBSD.org>

pf: Fix panic on vnet jail shutdown with synproxy

When shutting down a vnet jail pf_shutdown() clears the remaining states, which
through pf_clear_states() calls pf_unlink_state().
For synproxy states pf_unlink_state() will send a TCP RST, which eventually
tries to schedule the pf swi in pf_send(). This means we can't remove the
software interrupt until after pf_shutdown().

MFC after: 1 week


# cc535c95 03-Jul-2018 Will Andrews <will@FreeBSD.org>

Revert r335833.

Several third-parties use at least some of these ioctls. While it would be
better for regression testing if they were used in base (or at least in the
test suite), it's currently not worth the trouble to push through removal.

Submitted by: antoine, markj


# c1887e9f 30-Jun-2018 Will Andrews <will@FreeBSD.org>

pf: remove unused ioctls.

Several ioctls are unused in pf, in the sense that no base utility
references them. Additionally, a cursory review of pf-based ports
indicates they're not used elsewhere either. Some of them have been
unused since the original import. As far as I can tell, they're also
unused in OpenBSD. Finally, removing this code removes the need for
future pf work to take them into account.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D16076


# 455969d3 30-May-2018 Kristof Provost <kp@FreeBSD.org>

pf: Replace rwlock on PF_RULES_LOCK with rmlock

Given that PF_RULES_LOCK is a mostly read lock, replace the rwlock with rmlock.
This change improves packet processing rate in high pps environments.
Benchmarking by olivier@ shows a 65% improvement in pps.

While here, also eliminate all appearances of "sys/rwlock.h" includes since it
is not used anymore.

Submitted by: farrokhi@
Differential Revision: https://reviews.freebsd.org/D15502


# c41420d5 11-Apr-2018 Kristof Provost <kp@FreeBSD.org>

pf: limit ioctl to a reasonable and tuneable number of elements

pf ioctls frequently take a variable number of elements as argument. This can
potentially allow users to request very large allocations. These will fail,
but even a failing M_NOWAIT might tie up resources and result in concurrent
M_WAITOK allocations entering vm_wait and inducing reclamation of caches.

Limit these ioctls to what should be a reasonable value, but allow users to
tune it should they need to.

Differential Revision: https://reviews.freebsd.org/D15018


# 1a125a2f 06-Apr-2018 Kristof Provost <kp@FreeBSD.org>

pf: Improve ioctl validation

Ensure that multiplications for memory allocations cannot overflow, and
that we'll not try to allocate M_WAITOK for potentially overly large
allocations.

MFC after: 1 week


# 02214ac8 06-Apr-2018 Kristof Provost <kp@FreeBSD.org>

pf: Improve ioctl validation for DIOCIGETIFACES and DIOCXCOMMIT

These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.

There's no obvious limit to the request size for these, so we limit the
requests to something which won't overflow. Change the memory allocation
to M_NOWAIT so excessive requests will fail rather than stall forever.

MFC after: 1 week


# adfe2f6a 06-Apr-2018 Kristof Provost <kp@FreeBSD.org>

pf: Improve ioctl validation for DIOCRGETTABLES, DIOCRGETTSTATS, DIOCRCLRTSTATS and DIOCRSETTFLAGS

These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.

Limit the allocation to required size (or the user allocation, if that's
smaller). That does mean we need to do the allocation with the rules
lock held (so the number doesn't change while we're doing this), so it
can't M_WAITOK.

MFC after: 1 week


# 8748b499 06-Apr-2018 Kristof Provost <kp@FreeBSD.org>

pf: Improve ioctl validation for DIOCRADDTABLES and DIOCRDELTABLES

The DIOCRADDTABLES and DIOCRDELTABLES ioctls can process a number of
tables at a time, and as such try to allocate <number of tables> *
sizeof(struct pfr_table). This multiplication can overflow. Thanks to
mallocarray() this is not exploitable, but an overflow does panic the
system.

Arbitrarily limit this to 65535 tables. pfctl only ever processes one
table at a time, so it presents no issues there.

MFC after: 1 week


# effaab88 23-Mar-2018 Kristof Provost <kp@FreeBSD.org>

netpfil: Introduce PFIL_FWD flag

Forwarded packets passed through PFIL_OUT, which made it difficult for
firewalls to figure out if they were forwarding or producing packets. This in
turn is an issue for pf for IPv6 fragment handling: it needs to call
ip6_output() or ip6_forward() to handle the fragments. Figuring out which was
difficult (and until now, incorrect).
Having pfil distinguish the two removes an ugly piece of code from pf.

Introduce a new variant of the netpfil callbacks with a flags variable, which
has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if
a packet is forwarded.

Reviewed by: ae, kevans
Differential Revision: https://reviews.freebsd.org/D13715


# 6273ba66 07-Jan-2018 Kristof Provost <kp@FreeBSD.org>

pf: Avoid integer overflow issues by using mallocarray() iso. malloc()

pfioctl() handles several ioctl that takes variable length input, these
include:
- DIOCRADDTABLES
- DIOCRDELTABLES
- DIOCRGETTABLES
- DIOCRGETTSTATS
- DIOCRCLRTSTATS
- DIOCRSETTFLAGS

All of them take a pfioc_table struct as input from userland. One of
its elements (pfrio_size) is used in a buffer length calculation.
The calculation contains an integer overflow which if triggered can lead
to out of bound reads and writes later on.

Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>


# 9d671fee 31-Dec-2017 Kristof Provost <kp@FreeBSD.org>

pf: Allow the module to be unloaded

pf can now be safely unloaded. Most of this code is exercised on vnet
jail shutdown.

Don't block unloading.


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# 468cefa2 07-May-2017 Kristof Provost <kp@FreeBSD.org>

pf: Fix vnet initialisation

When running the vnet init code (pf_load_vnet()) we used to iterate over
all vnets, marking them as unhooked.
This is incorrect and leads to panics if pf is unloaded, as the unload
code does not unregister the pfil hooks (because the vnet is marked as
unhooked).

There's no need or reason to touch other vnets during initialisation.
Their pf_load_vnet() function will be triggered, which handles all
required initialisation.

Reviewed by: zec, gnn
Differential Revision: https://reviews.freebsd.org/D10592


# 64c79ee7 03-May-2017 Kristof Provost <kp@FreeBSD.org>

pf: Fix panic on unload

vnet_pf_uninit() is called through vnet_deregister_sysuninit() and
linker_file_unload() when the pf module is unloaded. This is executed
after pf_unload() so we end up trying to take locks which have been
destroyed already.

Move pf_unload() to a separate SYSUNINIT() to ensure it's called after
all the vnet_pf_uninit() calls.

Differential Revision: https://reviews.freebsd.org/D10025


# 4e261006 18-Apr-2017 Kristof Provost <kp@FreeBSD.org>

pf: Also clear limit counters

The "pfctl -F info" command didn't clear the limit counters ( as shown in the
"pfctl -vsi" output).

Submitted by: Max <maximos@als.nnov.ru>


# 9f5efe71 13-Apr-2017 Gleb Smirnoff <glebius@FreeBSD.org>

Fix potential NULL deref.

Found by: PVS Studio


# 2f8fb3a8 22-Mar-2017 Kristof Provost <kp@FreeBSD.org>

pf: Fix possible shutdown race

Prevent possible races in the pf_unload() / pf_purge_thread() shutdown
code. Lock the pf_purge_thread() with the new pf_end_lock to prevent
these races.

Use a shared/exclusive lock, as we need to also acquire another sx lock
(VNET_LIST_RLOCK). It's fine for both pf_purge_thread() and pf_unload()
to sleep,

Pointed out by: eri, glebius, jhb
Differential Revision: https://reviews.freebsd.org/D10026


# 5c172e70 17-Mar-2017 Kristof Provost <kp@FreeBSD.org>

pf: Fix memory leak on vnet shutdown or unload

Rules are unlinked in shutdown_pf(), so we must call
pf_unload_vnet_purge(), which frees unlinked rules, after that, not
before.

Reviewed by: eri, bz
Differential Revision: https://reviews.freebsd.org/D10040


# 2a57d24b 11-Mar-2017 Kristof Provost <kp@FreeBSD.org>

pf: Fix incorrect rw_sleep() in pf_unload()

When we unload we don't hold the pf_rules_lock, so we cannot call rw_sleep()
with it, because it would release a lock we do not hold. There's no need for the
lock either, so we can just tsleep().

While here also make the same change in pf_purge_thread(), because it explicitly
takes the lock before rw_sleep() and then immediately releases it afterwards.


# 813196a1 04-Oct-2016 Kristof Provost <kp@FreeBSD.org>

pf: remove fastroute tag

The tag fastroute came from ipf and was removed in OpenBSD in 2011. The code
allows to skip the in pfil hooks and completely removes the out pfil invoke,
albeit looking up a route that the IP stack will likely find on its own.
The code between IPv4 and IPv6 is also inconsistent and marked as "XXX"
for years.

Submitted by: Franco Fichtner <franco@opnsense.org>
Differential Revision: https://reviews.freebsd.org/D8058


# aa7cac58 08-Jul-2016 Kristof Provost <kp@FreeBSD.org>

pf: Map hook returns onto the correct error values

pf returns PF_PASS, PF_DROP, ... in the netpfil hooks, but the hook callers
expect to get E<foo> error codes.
Map the returns values. A pass is 0 (everything is OK), anything else means
pf ate the packet, so return EACCES, which tells the stack not to emit an ICMP
error message.

PR: 207598


# a8fc1b78 24-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

The void isn't void.

Unbreak sparc64 and powerpc builds.

Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
MFC after: 12 days


# a8e8c574 23-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

PFSTATE_NOSYNC goes onto state_flags, not sync_state;
this prevents: panic: pfsync_delete_state: unexpected sync state 8

Reviewed by: kp
Approved by: re (gjb)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6942


# a0429b54 23-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Update pf(4) and pflog(4) to survive basic VNET testing, which includes
proper virtualisation, teardown, avoiding use-after-free, race conditions,
no longer creating a thread per VNET (which could easily be a couple of
thousand threads), gracefully ignoring global events (e.g., eventhandlers)
on teardown, clearing various globally cached pointers and checking
them before use.

Reviewed by: kp
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6924


# 8147948e 22-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Import a fix for and old security issue (CVE-2010-3830) in pf which
was not relevant to FreeBSD as only root could open /dev/pf by default.
With VIMAGE this is will longer be the case. As pf(4) starts to
be supported with VNETs 3rd party users may open /dev/pf inside the
virtual jail instance; thus we need to address this issue after all.
While OpenBSD largely rewrote code parts for the fix [1], and it's
unclear what Apple [3] did, import the minimal fix from NetBSD [2].

[1] http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf_ioctl.c.diff?r1=1.235&r2=1.236
[2] http://mail-index.netbsd.org/source-changes/2011/01/19/msg017518.html
[3] https://support.apple.com/en-gb/HT202154

Obtained from: http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dist/pf/net/pf_ioctl.c.diff?r1=1.42&r2=1.43&only_with_tag=MAIN
MFC After: 2 weeks
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Security: CVE-2010-3830


# 89856f7e 21-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747


# 3e248e0f 17-Jun-2016 Kristof Provost <kp@FreeBSD.org>

pf: Filter on and set vlan PCP values

Adopt the OpenBSD syntax for setting and filtering on VLAN PCP values. This
introduces two new keywords: 'set prio' to set the PCP value, and 'prio' to
filter on it.

Reviewed by: allanjude, araujo
Approved by: re (gjb)
Obtained from: OpenBSD (mostly)
Differential Revision: https://reviews.freebsd.org/D6786


# a4641f4e 03-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/net*: minor spelling fixes.

No functional change.


# 14b5e85b 25-Feb-2016 Kristof Provost <kp@FreeBSD.org>

pf: Fix possible out-of-bounds write

In the DIOCRSETADDRS ioctl() handler we allocate a table for struct pfr_addrs,
which is processed in pfr_set_addrs(). At the users request we also provide
feedback on the deleted addresses, by storing them after the new list
('bcopy(&ad, addr + size + i, sizeof(ad));' in pfr_set_addrs()).

This means we write outside the bounds of the buffer we've just allocated.
We need to look at pfrio_size2 instead (i.e. the size the user reserved for our
feedback). That'd allow a malicious user to specify a smaller pfrio_size2 than
pfrio_size though, in which case we'd still read outside of the allocated
buffer. Instead we allocate the largest of the two values.

Reported By: Paul J Murphy <paul@inetstat.net>
PR: 207463
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D5426


# 1f12da0e 22-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Just checkpoint the WIP in order to be able to make the tree update
easier. Note: this is currently not in a usable state as certain
teardown parts are not called and the DOMAIN rework is missing.
More to come soon and find its way to head.

Obtained from: P4 //depot/user/bz/vimage/...
Sponsored by: The FreeBSD Foundation


# c110fc49 14-Oct-2015 Kristof Provost <kp@FreeBSD.org>

pf: Fix TSO issues

In certain configurations (mostly but not exclusively as a VM on Xen) pf
produced packets with an invalid TCP checksum.

The problem was that pf could only handle packets with a full checksum. The
FreeBSD IP stack produces TCP packets with a pseudo-header checksum (only
addresses, length and protocol).
Certain network interfaces expect to see the pseudo-header checksum, so they
end up producing packets with invalid checksums.

To fix this stop calculating the full checksum and teach pf to only update TCP
checksums if TSO is disabled or the change affects the pseudo-header checksum.

PR: 154428, 193579, 198868
Reviewed by: sbruno
MFC after: 1 week
Relnotes: yes
Sponsored by: RootBSD
Differential Revision: https://reviews.freebsd.org/D3779


# f2fc809d 17-Aug-2015 Luiz Otavio O Souza <loos@FreeBSD.org>

Fix the copy of addresses passed from userland in table replace command.

The size2 is the maximum userland buffer size (used when the addresses are
copied back to userland).

Obtained from: pfSense
MFC after: 3 days
Sponsored by: Rubicon Communications (Netgate)


# 643ef281 11-Aug-2015 Mariusz Zaborski <oshogbo@FreeBSD.org>

Use correct src/dst ports when removing states.

Submitted by: Milosz Kaniewski <m.kaniewski@wheelsystems.com>,
UMEZAWA Takeshi <umezawa@iij.ad.jp> (orginal)
Reviewed by: glebius
Approved by: pjd (mentor)
Obtained from: OpenBSD
MFC after: 3 days


# 30fe681e 19-May-2015 Gleb Smirnoff <glebius@FreeBSD.org>

During module unload unlock rules before destroying UMA zones, which
may sleep in uma_drain(). It is safe to unlock here, since we are already
dehooked from pfil(9) and all pf threads had quit.

Sponsored by: Nginx, Inc.


# 772e66a6 16-Apr-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Move ALTQ from contrib to net/altq. The ALTQ code is for many years
discontinued by its initial authors. In FreeBSD the code was already
slightly edited during the pf(4) SMP project. It is about to be edited
more in the projects/ifnet. Moving out of contrib also allows to remove
several hacks to the make glue.

Reviewed by: net@


# 3e8c6d74 16-Mar-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Always lock the hash row of a source node when updating its 'states' counter.

PR: 182401
Sponsored by: Nginx, Inc.


# efc6c51f 21-Jan-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Back out r276841, r276756, r276747, r276746. The change in r276747 is very
very questionable, since it makes vimages more dependent on each other. But
the reason for the backout is that it screwed up shutting down the pf purge
threads, and now kernel immedially panics on pf module unload. Although module
unloading isn't an advertised feature of pf, it is very important for
development process.

I'd like to not backout r276746, since in general it is good. But since it
has introduced numerous build breakages, that later were addressed in
r276841, r276756, r276747, I need to back it out as well. Better replay it
in clean fashion from scratch.


# 8d665c6b 06-Jan-2015 Craig Rodrigues <rodrigc@FreeBSD.org>

Reapply previous patch to fix build.

PR: 194515


# c75820c7 06-Jan-2015 Craig Rodrigues <rodrigc@FreeBSD.org>

Merge: r258322 from projects/pf branch

Split functions that initialize various pf parts into their
vimage parts and global parts.
Since global parts appeared to be only mutex initializations, just
abandon them and use MTX_SYSINIT() instead.
Kill my incorrect VNET_FOREACH() iterator and instead use correct
approach with VNET_SYSINIT().

PR: 194515
Differential Revision: D1309
Submitted by: glebius, Nikos Vassiliadis <nvass@gmx.com>
Reviewed by: trociny, zec, gnn


# 7b56cc43 19-Nov-2014 Ermal Luçi <eri@FreeBSD.org>

pf(4) needs to have a correct checksum during its processing.
Calculate checksums for the IPv6 path when needed before
delving into pf(4) code as required.

PR: 172648, 179392
Reviewed by: glebius@
Approved by: gnn@
Obtained from: pfSense
MFC after: 1 week
Sponsored by: Netgate


# 450cecf0 12-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

- Provide a sleepable lock to protect against ioctl() vs ioctl() races.
- Use the new lock to protect against simultaneous DIOCSTART and/or
DIOCSTOP ioctls.

Reported & tested by: jmallett
Sponsored by: Nginx, Inc.


# a9572d8f 14-Aug-2014 Gleb Smirnoff <glebius@FreeBSD.org>

- Count global pf(4) statistics in counter(9).
- Do not count global number of states and of src_nodes,
use uma_zone_get_cur() to obtain values.
- Struct pf_status becomes merely an ioctl API structure,
and moves to netpfil/pf/pf.h with its constants.
- V_pf_status is now of type struct pf_kstatus.

Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net>
Sponsored by: InnoGames GmbH


# 53f4b0cf 25-Apr-2014 Gleb Smirnoff <glebius@FreeBSD.org>

The current API for adding rules with pool addresses is the following:

- DIOCADDADDR adds addresses and puts them into V_pf_pabuf
- DIOCADDRULE takes all addresses from V_pf_pabuf and links
them into rule.

The ugly part is that if address is a table, then it is initialized
in DIOCADDRULE, because we need ruleset, and DIOCADDADDR doesn't
supply ruleset. But if address is a dynaddr, we need address family,
and address family could be different for different addresses in one
rule, so dynaddr is initialized in DIOCADDADDR.

This leads to the entangled state of addresses on V_pf_pabuf. Some are
initialized, and some not. That's why running pf_empty_pool(&V_pf_pabuf)
can lead to a panic on a NULL table address.

Since proper fix requires API/ABI change, for now simply plug the panic
in pf_empty_pool().

Reported by: danger


# 7e92ce73 29-Mar-2014 Martin Matuska <mm@FreeBSD.org>

De-virtualize UMA zone pf_mtag_z and move to global initialization part.

The m_tag struct does not know about vnet context and the pf_mtag_free()
callback is called unaware of current vnet. This causes a panic.

Reviewed by: Nikos Vassiliadis, trociny@


# fb3541ad 04-Mar-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Instead of playing games with casts simply add 3 more members to the
structure pf_rule, that are used when the structure is passed via
ioctl().

PR: 187074


# 48278b88 14-Feb-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Once pf became not covered by a single mutex, many counters in it became
race prone. Some just gather statistics, but some are later used in
different calculations.

A real problem was the race provoked underflow of the states_cur counter
on a rule. Once it goes below zero, it wraps to UINT32_MAX. Later this
value is used in pf_state_expires() and any state created by this rule
is immediately expired.

Thus, make fields states_cur, states_tot and src_nodes of struct
pf_rule be counter(9)s.

Thanks to Dennis for providing me shell access to problematic box and
his help with reproducing, debugging and investigating the problem.

Thanks to: Dennis Yusupoff <dyr smartspb.net>
Also reported by: dumbbell, pgj, Rambler
Sponsored by: Nginx, Inc.


# d26bbeb9 22-Jan-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Fix resource leak and simplify code for DIOCCHANGEADDR.

CID: 1007035


# 19acaeca 22-Nov-2013 Gleb Smirnoff <glebius@FreeBSD.org>

The DIOCKILLSRCNODES operation was implemented with O(m*n) complexity,
where "m" is number of source nodes and "n" is number of states. Thus,
on heavy loaded router its processing consumed a lot of CPU time.

Reimplement it with O(m+n) complexity. We first scan through source
nodes and disconnect matching ones, putting them on the freelist and
marking with a cookie value in their expire field. Then we scan through
the states, detecting references to source nodes with a cookie, and
disconnect them as well. Then the freelist is passed to pf_free_src_nodes().

In collaboration with: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de>
PR: kern/176763
Sponsored by: InnoGames GmbH
Sponsored by: Nginx, Inc.


# 1320f8c0 22-Nov-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Fix off by ones when scanning source nodes hash.

Sponsored by: Nginx, Inc.


# f053058c 18-Nov-2013 Gleb Smirnoff <glebius@FreeBSD.org>

- Split functions that initialize various pf parts into their vimage
parts and global parts.
- Since global parts appeared to be only mutex initializations, just
abandon them and use MTX_SYSINIT() instead.
- Kill my incorrect VNET_FOREACH() iterator and instead use correct
approach with VNET_SYSINIT().

Submitted by: Nikos Vassiliadis <nvass gmx.com>
Reviewed by: trociny


# 7710f9f1 04-Nov-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Remove unused PFTM_UNTIL_PACKET const.


# 76039bc8 26-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 6828cc99 19-Jun-2013 Gleb Smirnoff <glebius@FreeBSD.org>

De-vnet hash sizes and hash masks.

Submitted by: Nikos Vassiliadis <nvass gmx.com>
Reviewed by: trociny


# 048c9541 11-May-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Fix DIOCADDSTATE operation.


# d8aa10cc 28-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

In netpfil/pf:
- Add my copyright to files I've touched a lot this year.
- Add dash in front of all copyright notices according to style(9).
- Move $OpenBSD$ down below copyright notices.
- Remove extra line between cdefs.h and __FBSDID.


# bf1e95a2 15-Dec-2012 Mikolaj Golub <trociny@FreeBSD.org>

In pfioctl, if the permission checks failed we returned with vnet context
set.

As the checks don't require vnet context, this is fixed by setting
vnet after the checks.

PR: kern/160541
Submitted by: Nikos Vassiliadis (slightly different approach)


# 9823d527 10-Oct-2012 Kevin Lo <kevlo@FreeBSD.org>

Revert previous commit...

Pointyhat to: kevlo (myself)


# a10cee30 09-Oct-2012 Kevin Lo <kevlo@FreeBSD.org>

Prefer NULL over 0 for pointers


# b833c0d9 08-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Any pfil(9) hooks should be called with already set VNET context.

Reviewed by: bz


# 21d172a3 06-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

A step in resolving mess with byte ordering for AF_INET. After this change:

- All packets in NETISR_IP queue are in net byte order.
- ip_input() is entered in net byte order and converts packet
to host byte order right _after_ processing pfil(9) hooks.
- ip_output() is entered in host byte order and converts packet
to net byte order right _before_ processing pfil(9) hooks.
- ip_fragment() accepts and emits packet in net byte order.
- ip_forward(), ip_mloopback() use host byte order (untouched actually).
- ip_fastforward() no longer modifies packet at all (except ip_ttl).
- Swapping of byte order there and back removed from the following modules:
pf(4), ipfw(4), enc(4), if_bridge(4).
- Swapping of byte order added to ipfilter(4), based on __FreeBSD_version
- __FreeBSD_version bumped.
- pfil(9) manual page updated.

Reviewed by: ray, luigi, eri, melifaro
Tested by: glebius (LE), ray (BE)


# 51e02a31 22-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

EBUSY is a better reply for refusing to unload pf(4) or pfsync(4).

Submitted by: pluknet


# 7f7ef494 18-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Provide kernel compile time option to make pf(4) default rule to drop.

This is important to secure a small timeframe at boot time, when
network is already configured, but pf(4) is not yet.

PR: kern/171622
Submitted by: Olivier Cochard-LabbИ <olivier cochard.me>


# effbcf38 18-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Fix DIOCNATLOOK: zero key padding before performing lookup.


# 3b3a8eb9 14-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

o Create directory sys/netpfil, where all packet filters should
reside, and move there ipfw(4) and pf(4).

o Move most modified parts of pf out of contrib.

Actual movements:

sys/contrib/pf/net/*.c -> sys/netpfil/pf/
sys/contrib/pf/net/*.h -> sys/net/
contrib/pf/pfctl/*.c -> sbin/pfctl
contrib/pf/pfctl/*.h -> sbin/pfctl
contrib/pf/pfctl/pfctl.8 -> sbin/pfctl
contrib/pf/pfctl/*.4 -> share/man/man4
contrib/pf/pfctl/*.5 -> share/man/man5

sys/netinet/ipfw -> sys/netpfil/ipfw

The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.

Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.

The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.

Discussed with: bz, luigi