History log of /openbsd-current/sys/net/pf_lb.c
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.74 10-May-2023 sashan

nat-to may fail to insert state due to conflict on chosen source
port number. This is typically indicated by 'wire key attach failed on...'
message when pf(4) debugging is enabled. The problem is caused by
glitch in pf_get_sport() which fails to discover conflict in advance.
In order to fix it we must also calculate toeplitz hash in
pf_get_sport() to initialize look up key properly.

the bug has been kindly reported by joosepm _von_ gmail _dot_ com

OK dlg@


Revision tags: OPENBSD_7_3_BASE
# 1.73 04-Jan-2023 dlg

move the pf_state_tree_id type from pfvar.h to pfvar_priv.h.

the pf_state_tree_id type is private to the kernel.

while here, move it from being an RB tree to an RBT tree. this saves
about 12k in pf.o on amd64.

ok sashan@


Revision tags: OPENBSD_7_2_BASE
# 1.72 31-Aug-2022 benno

make kernel build without INET6 again
ok sashan@


# 1.71 03-Aug-2022 sashan

Bug was reported by Chriss Cappucio. It has turned out my earlier change
to pf_lb.c was not complete. We must add a test to determine number of
addresses defined by pool, so we don't treat pool definition
172.16.0.0/16 as a single IP address in pool. If pool is defined as
172.16.0.0/16, then we don't want to fall back to PF_POOL_NONE. Missing
this measure in pf_map_addr() may cause pf_get_sport() to enter infinite
loop when source ports translation become depleted for the first address
found in pool (like 172.16.0.1), because the bug prevents pf_map_addr()
to move to next address in pool (like 172.16.0.2).

while investigating issue I've also noticed an oddity for small random
pools such as 192.168.1.32/28. One would expect the addresses for nat
will be randomly picked from range .32 - .47 in this case. however the
random selection yield significantly more (like 20%) addresses ending by .32
In order to fix it we make random pool to use arc4random_uniform(~mask + 1)
instead of current arc4random().

feedback by claudio@
tested by hrvoje@


Revision tags: OPENBSD_7_1_BASE
# 1.70 16-Feb-2022 sashan

nat-to round-robin without a pool should fallback to POOL_NONE
bug reported by giovanni@

OK giovanni@


# 1.69 16-Dec-2021 sashan

fix zero division found by syzkaller. The sanity checks in pf(4) ioctls
are not powerful enough to detect invalid port ranges (or even invalid
rules). syzkaller does not use pfctl(8), it uses ioctl(2) to pass some
random chunk of memory as a rule to pf(4). Fix adds explicit check
for 0 divider to pf_get_transaddr(). It should make syzkaller happy
without disturbing anyone else.

OK gnezdo@

Reported-by: syzbot+d1f00da48fa717e171f3@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.68 12-Dec-2020 jan

Correct wrong type of variable and remove useless casts.

OK bluhm@


Revision tags: OPENBSD_6_8_BASE
# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.73 04-Jan-2023 dlg

move the pf_state_tree_id type from pfvar.h to pfvar_priv.h.

the pf_state_tree_id type is private to the kernel.

while here, move it from being an RB tree to an RBT tree. this saves
about 12k in pf.o on amd64.

ok sashan@


Revision tags: OPENBSD_7_2_BASE
# 1.72 31-Aug-2022 benno

make kernel build without INET6 again
ok sashan@


# 1.71 03-Aug-2022 sashan

Bug was reported by Chriss Cappucio. It has turned out my earlier change
to pf_lb.c was not complete. We must add a test to determine number of
addresses defined by pool, so we don't treat pool definition
172.16.0.0/16 as a single IP address in pool. If pool is defined as
172.16.0.0/16, then we don't want to fall back to PF_POOL_NONE. Missing
this measure in pf_map_addr() may cause pf_get_sport() to enter infinite
loop when source ports translation become depleted for the first address
found in pool (like 172.16.0.1), because the bug prevents pf_map_addr()
to move to next address in pool (like 172.16.0.2).

while investigating issue I've also noticed an oddity for small random
pools such as 192.168.1.32/28. One would expect the addresses for nat
will be randomly picked from range .32 - .47 in this case. however the
random selection yield significantly more (like 20%) addresses ending by .32
In order to fix it we make random pool to use arc4random_uniform(~mask + 1)
instead of current arc4random().

feedback by claudio@
tested by hrvoje@


Revision tags: OPENBSD_7_1_BASE
# 1.70 16-Feb-2022 sashan

nat-to round-robin without a pool should fallback to POOL_NONE
bug reported by giovanni@

OK giovanni@


# 1.69 16-Dec-2021 sashan

fix zero division found by syzkaller. The sanity checks in pf(4) ioctls
are not powerful enough to detect invalid port ranges (or even invalid
rules). syzkaller does not use pfctl(8), it uses ioctl(2) to pass some
random chunk of memory as a rule to pf(4). Fix adds explicit check
for 0 divider to pf_get_transaddr(). It should make syzkaller happy
without disturbing anyone else.

OK gnezdo@

Reported-by: syzbot+d1f00da48fa717e171f3@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.68 12-Dec-2020 jan

Correct wrong type of variable and remove useless casts.

OK bluhm@


Revision tags: OPENBSD_6_8_BASE
# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.72 31-Aug-2022 benno

make kernel build without INET6 again
ok sashan@


# 1.71 03-Aug-2022 sashan

Bug was reported by Chriss Cappucio. It has turned out my earlier change
to pf_lb.c was not complete. We must add a test to determine number of
addresses defined by pool, so we don't treat pool definition
172.16.0.0/16 as a single IP address in pool. If pool is defined as
172.16.0.0/16, then we don't want to fall back to PF_POOL_NONE. Missing
this measure in pf_map_addr() may cause pf_get_sport() to enter infinite
loop when source ports translation become depleted for the first address
found in pool (like 172.16.0.1), because the bug prevents pf_map_addr()
to move to next address in pool (like 172.16.0.2).

while investigating issue I've also noticed an oddity for small random
pools such as 192.168.1.32/28. One would expect the addresses for nat
will be randomly picked from range .32 - .47 in this case. however the
random selection yield significantly more (like 20%) addresses ending by .32
In order to fix it we make random pool to use arc4random_uniform(~mask + 1)
instead of current arc4random().

feedback by claudio@
tested by hrvoje@


Revision tags: OPENBSD_7_1_BASE
# 1.70 16-Feb-2022 sashan

nat-to round-robin without a pool should fallback to POOL_NONE
bug reported by giovanni@

OK giovanni@


# 1.69 16-Dec-2021 sashan

fix zero division found by syzkaller. The sanity checks in pf(4) ioctls
are not powerful enough to detect invalid port ranges (or even invalid
rules). syzkaller does not use pfctl(8), it uses ioctl(2) to pass some
random chunk of memory as a rule to pf(4). Fix adds explicit check
for 0 divider to pf_get_transaddr(). It should make syzkaller happy
without disturbing anyone else.

OK gnezdo@

Reported-by: syzbot+d1f00da48fa717e171f3@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.68 12-Dec-2020 jan

Correct wrong type of variable and remove useless casts.

OK bluhm@


Revision tags: OPENBSD_6_8_BASE
# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.71 03-Aug-2022 sashan

Bug was reported by Chriss Cappucio. It has turned out my earlier change
to pf_lb.c was not complete. We must add a test to determine number of
addresses defined by pool, so we don't treat pool definition
172.16.0.0/16 as a single IP address in pool. If pool is defined as
172.16.0.0/16, then we don't want to fall back to PF_POOL_NONE. Missing
this measure in pf_map_addr() may cause pf_get_sport() to enter infinite
loop when source ports translation become depleted for the first address
found in pool (like 172.16.0.1), because the bug prevents pf_map_addr()
to move to next address in pool (like 172.16.0.2).

while investigating issue I've also noticed an oddity for small random
pools such as 192.168.1.32/28. One would expect the addresses for nat
will be randomly picked from range .32 - .47 in this case. however the
random selection yield significantly more (like 20%) addresses ending by .32
In order to fix it we make random pool to use arc4random_uniform(~mask + 1)
instead of current arc4random().

feedback by claudio@
tested by hrvoje@


Revision tags: OPENBSD_7_1_BASE
# 1.70 16-Feb-2022 sashan

nat-to round-robin without a pool should fallback to POOL_NONE
bug reported by giovanni@

OK giovanni@


# 1.69 16-Dec-2021 sashan

fix zero division found by syzkaller. The sanity checks in pf(4) ioctls
are not powerful enough to detect invalid port ranges (or even invalid
rules). syzkaller does not use pfctl(8), it uses ioctl(2) to pass some
random chunk of memory as a rule to pf(4). Fix adds explicit check
for 0 divider to pf_get_transaddr(). It should make syzkaller happy
without disturbing anyone else.

OK gnezdo@

Reported-by: syzbot+d1f00da48fa717e171f3@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.68 12-Dec-2020 jan

Correct wrong type of variable and remove useless casts.

OK bluhm@


Revision tags: OPENBSD_6_8_BASE
# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.70 16-Feb-2022 sashan

nat-to round-robin without a pool should fallback to POOL_NONE
bug reported by giovanni@

OK giovanni@


# 1.69 16-Dec-2021 sashan

fix zero division found by syzkaller. The sanity checks in pf(4) ioctls
are not powerful enough to detect invalid port ranges (or even invalid
rules). syzkaller does not use pfctl(8), it uses ioctl(2) to pass some
random chunk of memory as a rule to pf(4). Fix adds explicit check
for 0 divider to pf_get_transaddr(). It should make syzkaller happy
without disturbing anyone else.

OK gnezdo@

Reported-by: syzbot+d1f00da48fa717e171f3@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.68 12-Dec-2020 jan

Correct wrong type of variable and remove useless casts.

OK bluhm@


Revision tags: OPENBSD_6_8_BASE
# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.69 16-Dec-2021 sashan

fix zero division found by syzkaller. The sanity checks in pf(4) ioctls
are not powerful enough to detect invalid port ranges (or even invalid
rules). syzkaller does not use pfctl(8), it uses ioctl(2) to pass some
random chunk of memory as a rule to pf(4). Fix adds explicit check
for 0 divider to pf_get_transaddr(). It should make syzkaller happy
without disturbing anyone else.

OK gnezdo@

Reported-by: syzbot+d1f00da48fa717e171f3@syzkaller.appspotmail.com


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
# 1.68 12-Dec-2020 jan

Correct wrong type of variable and remove useless casts.

OK bluhm@


Revision tags: OPENBSD_6_8_BASE
# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.68 12-Dec-2020 jan

Correct wrong type of variable and remove useless casts.

OK bluhm@


Revision tags: OPENBSD_6_8_BASE
# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.67 29-Jul-2020 yasuoka

Fix previous commit which referred wrong address and returned wrong
value.

ok sashan


# 1.66 28-Jul-2020 yasuoka

Use the table on root always if current table is not active.

ok sashan


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.65 24-Jul-2020 yasuoka

Increase state counter for least-states when the address is selected
by sticky-address. Also fix the problem that the interface which is
specified by the selected table entry is not used properly.

ok jung sashan


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.64 02-Jul-2019 yasuoka

When source address tracking record is used for "route-to", the next
hop interface configured with "route-to" was not used. Keep the
interface within the pf_src_node and use it when the record is used.

OK sashan


Revision tags: OPENBSD_6_5_BASE
# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.63 10-Dec-2018 kn

Remove useless macros

These are just unhelpful case conversion.

OK sashan henning


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


# 1.62 06-Feb-2018 henning

some finger muscle workout:
bzero -> memset and (very few) bcopy -> memcpy/memmove


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@


Revision tags: OPENBSD_6_2_BASE
# 1.61 12-Jul-2017 bluhm

Use a 32 bit variable to detect integer overflow when searching for
an unused nat port. Prevents a possible endless loop if high port
is 65535 or low port is 0.
report and analysis Jingmin Zhou; OK sashan@ visa@


# 1.60 23-Apr-2017 sthen

Some of the LOG_NOTICE messages from PF were seen in normal operations
with certain rulesets and excessively noisy; move them to LOG_INFO (which was
previously unused). ok benno@


Revision tags: OPENBSD_6_1_BASE
# 1.59 08-Feb-2017 jsg

Remove an uneeded NULL test which was after a deref.
ok mpi@ henning@ sashan@


# 1.58 26-Oct-2016 bluhm

Put union pf_headers and struct pf_pdesc into separate header file
pfvar_priv.h. The pf_headers had to be defined in multiple .c files
before. In pfvar.h it would have unknown storage size, this file
is included in too many places. The idea is to have a private pf
header that is only included in the pf part of the kernel. For now
it contains pf_pdesc and pf_headers, it may be extended later.
discussion, input and OK henning@ procter@ sashan@


# 1.57 27-Sep-2016 dlg

roll back turning RB into RBT until i get better at this process.


# 1.56 27-Sep-2016 dlg

move pf from the RB macros to the RBT functions.


Revision tags: OPENBSD_6_0_BASE
# 1.55 19-Jul-2016 henning

remove wrong and misleading comment, ok phessler


# 1.54 24-Jun-2016 bluhm

The function pf_get_sport() did work for out rules only. Make it
aware of the direction of the packet. Now nat-to can be used by
in rules and together with divert-to. Collisions with existing
states are found and produce a "NAT proxy port allocation failed"
message.
OK henning@ mikeb@


# 1.53 15-Jun-2016 mikeb

There's no need to convert values returned by arc4random to the network
byte order. Spotted by Gleb Smirnoff (glebius@FreeBSD.org), thanks!

ok tedu


Revision tags: OPENBSD_5_9_BASE
# 1.52 24-Nov-2015 mpi

No need for <net/if_types.h>

As a bonus this removes a "#if NCARP > 0", say yeah!


# 1.51 15-Oct-2015 bluhm

When using a pf rule with both nat-to and rdr-to, it could happen
that the nated source port was reused as destination port. Do not
initialize nport at the beginning of the function, but where it is
needed.
OK sashan@


# 1.50 13-Oct-2015 sashan

- pf_insert_src_node(): global argument (arg6) is useless, function
always gets pointer to rule.

- pf_remove_src_node(): function should always remove matching src node,
regardless the sn->rule.ptr being NULL or valid rule

- sn->rule.ptr is never NULL, spotted by mpi and Richard Procter _von_ gmail.com

OK mpi@, OK mikeb@


Revision tags: OPENBSD_5_8_BASE
# 1.49 03-Aug-2015 jsg

A recently added sanity check panic in pf_postprocess_addr() was
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.

As the rest of the function assumes a valid pool for now just return.

Problem reported by RD Thrush.

ok jung@ mikeb@


# 1.48 20-Jul-2015 jsg

Add some panics to default paths where code later assumes a non default
path was taken. This both prevents warnings from clang and acts as a
sanity check.

ok mcbride@ henning@


# 1.47 18-Jul-2015 sashan

msg.mpi


# 1.46 18-Jul-2015 sashan

INET/INET6 address family check should be unified in PF

it also adds af_unhandled(), where it is currently missing.

ok mcbride@


# 1.45 17-Jul-2015 jsg

fix the indentation of a block of code, no binary change
ok mikeb@ some time ago


# 1.44 16-Jul-2015 mpi

Expand ancient NTOHL/NTOHS/HTONS/HTONL macros.

ok guenther@, henning@


# 1.43 03-Jun-2015 yasuoka

Fix pf_map_addr() not to cause dividing by 0. This fixes problem when
using table or dynamic interface addresses for source-hash. Also
avoid calling arc4random_uniform() with upper_bound == 0.

ok mikeb


# 1.42 14-Mar-2015 jsg

Remove some includes include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.41 06-Jan-2015 jsg

init a potentially uninitialised var in pf_postprocess_addr
ok mikeb@ henning@


# 1.40 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.39 19-Dec-2014 reyk

Support source-hash and random with tables and dynifs; not just pools.
This finally allows to use source-hash for dynamic loadbalancing, eg.
"rdr-to <hosts> source-hash", instead of just round-robin and least-states.

An older pre-siphash version of this diff was tested by many people.

OK tedu@ benno@


# 1.38 19-Dec-2014 mcbride

Comment is no longer true, remove it.


# 1.37 18-Dec-2014 tedu

use siphash for pf_lb. for ipv6, we stretch it out a bit, but good enough.
ok reyk


# 1.36 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.35 10-Nov-2014 bluhm

Split the logic for the ICMP and ICMP6 case in pf_get_sport(). The
types ICMP_ECHO and ICMP6_ECHO_REQUEST have their special meaning
only if the protocol matches.
Put an #ifdef INET6 around ICMP6_ECHO_REQUEST to make the kernel
without IPv6 compile.
OK henning@


# 1.34 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


# 1.33 14-Aug-2014 blambert

fix logging strings (correct function name via __func__ + a typo)

ok florian@ henning@


Revision tags: OPENBSD_5_6_BASE
# 1.32 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.31 02-Jul-2014 mikeb

better indentation; no functional change


Revision tags: OPENBSD_5_5_BASE
# 1.30 30-Oct-2013 mikeb

translate icmpv6 echo id's the same way we do for icmpv4; ok henning


# 1.29 30-Oct-2013 mikeb

add a comment describing why do we call pf_map_addr again if port
selection process fails; ok henning


# 1.28 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.27 23-Oct-2013 mpi

Remove the number of in_var.h inclusions by moving some functions and
global variables to in.h.

ok mikeb@, deraadt@


# 1.26 17-Oct-2013 bluhm

The header file netinet/in_var.h included netinet6/in6_var.h. This
created a bunch of useless dependencies. Remove this implicit
inclusion and do an explicit #include <netinet6/in6_var.h> when it
is needed.
OK mpi@ henning@


Revision tags: OPENBSD_5_4_BASE
# 1.25 28-Mar-2013 tedu

no need for a lot of code to include proc.h


Revision tags: OPENBSD_5_3_BASE
# 1.24 29-Dec-2012 markus

make sure the entry from tree_src_tracking is still in the pool;
fixes nat with sticky address and ip address change on pppoe(4) for example;
ok henning@, zinke@; mikeb@


# 1.23 29-Dec-2012 markus

reset the counter in case its current value has been removed
from the pool (e.g. ifconfig em0 1.2.3.4 -alias)
ok henning@, mikeb@


# 1.22 29-Dec-2012 markus

pass pf_pool directly to pfr_pool_get(); simplifies the API;
ok henning@, zinke@, mikeb@


Revision tags: OPENBSD_5_2_BASE
# 1.21 09-Jul-2012 zinke

Enable support for the 'weight' keyword in the 'least-states'
load balancing case, this allows Weighted Least States (WLS).
Everything prepared on c2k11 with help from mcbride@.

This finally makes PF ready for the cloud.

ok henning@ mikeb@ pyr@


Revision tags: OPENBSD_5_1_BASE
# 1.20 03-Feb-2012 bluhm

The kernel did not compile without INET6. Put some #ifdefs into
pf to fix that.
- add #ifdef INET6 in obvious places
- af translation is only possible with both INET and INET6
- interleave #endif /* INET6 */ and closing brace correctly
- it is not necessary to #ifdef function prototypes
- do not compile af translate functions at all instead of empty stub,
then the linker will report inconsistencies
- pf_poolmask() actually takes an sa_family_t not an u_int8_t argument
No binary change for GENERIC compiled with -O2 and -UDIAGNOSTIC.
reported by Olivier Cochard-Labbe; ok mikeb@ henning@


# 1.19 13-Oct-2011 claudio

Since the IPv6 madness is not enough introduce NAT64 -- which is actually
"af-to" a generic IP version translator for pf(4).
Not everything perfect yet but lets fix these things in the tree.
Insane amount of work done by sperreault@, mikeb@ and reyk@.
Looked over by mcbride@ henning@ and myself at eurobsdcon.
OK mcbride@ and general put it in from deraadt@


# 1.18 18-Sep-2011 miod

Fix various format string types to as a minimum match the width of the
variables being processed.
ok bluhm@ henning@


Revision tags: OPENBSD_5_0_BASE
# 1.17 29-Jul-2011 mcbride

Make sure we use the right tbl/dyn pointer to check the pfrkt_refcntcost;
improved debugging for error cases inside the weighted round-robin loop.

original diff from claudio, ok henning


# 1.16 27-Jul-2011 mcbride

Add support for weighted round-robin in load balancing pools and tables.
Diff from zinke@ with a some minor cleanup.
ok henning claudio deraadt


# 1.15 03-Jul-2011 zinke

bring in least-states load balancing algorithm

ok mcbride@ henning@


# 1.14 17-May-2011 mikeb

exclude link local address from the dynamic interface address pool
so that rules like "pass out on vr1 inet6 nat-to (vr1)" won't map
to the non routable ipv6 link local address; with suggestions and
ok claudio, henning


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.13 27-Jun-2010 henning

stuff nsaddr/ndaddr/nsport/ndport (addrs/ports after NAT, used a lot while
walking the ruleset and up until state is fully set up) into pf_pdesc instead
of passing around those 4 seperately all the time, also shrinks the argument
count for a few functions that have/partialy had an insane count of arguments.
kinda preparational since we'll need them elsewhere too, soon
ok ryan jsing


Revision tags: OPENBSD_4_7_BASE
# 1.12 04-Feb-2010 sthen

pf_get_sport() picks a random port from the port range specified in a
nat rule. It should check to see if it's in-use (i.e. matches an existing
PF state), if it is, it cycles sequentially through other ports until
it finds a free one. However the check was being done with the state
keys the wrong way round so it was never actually finding the state
to be in-use.

- switch the keys to correct this, avoiding random state collisions
with nat. Fixes PR 6300 and problems reported by robert@ and viq.

- check pf_get_sport() return code in pf_test(); if port allocation
fails the packet should be dropped rather than sent out untranslated.

Help/ok claudio@.


# 1.11 18-Jan-2010 mcbride

Convert pf debug logging to using log()/addlog(), a single standardised
definition of DPFPRINTF(), and log priorities from syslog.h. Old debug
levels will still work for now, but will eventually be phased out.

discussed with henning, ok dlg


# 1.10 12-Jan-2010 mcbride

First pass at removing the 'pf_pool' mechanism for translation and routing
actions. Allow interfaces to be specified in special table entries for
the routing actions. Lists of addresses can now only be done using tables,
which pfctl will generate automatically from the existing syntax.

Functionally, this deprecates the use of multiple tables or dynamic
interfaces in a single nat or rdr rule.

ok henning dlg claudio


# 1.9 14-Dec-2009 henning

fix sticky-address - by pretty much re-implementing it. still following
the original approach using a source tracking node.
the reimplementation i smore flexible than the original one, we now have an
slist of source tracking nodes per state. that is cheap because more than
one entry will be an absolute exception.
ok beck and jsg, also stress tested by Sebastian Benoit <benoit-lists at fb12.de>


# 1.8 03-Nov-2009 claudio

rtables are stacked on rdomains (it is possible to have multiple routing
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@


# 1.7 07-Sep-2009 sthen

Fix static-port, found by jmc@. ok henning@.


# 1.6 01-Sep-2009 henning

the diff theo calls me insanae for:
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, seperate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too


Revision tags: OPENBSD_4_6_BASE
# 1.5 24-Jun-2009 sthen

move the "pf_map_addr: selected address" printf up to -xnoisy.
ok henning@


# 1.4 05-Mar-2009 mcbride

Stricter state checking for ICMP and ICMPv6 packets: include the ICMP type
in one port of the state key, using the type to determine which side should
be the id, and which should be the type. Also:
- Handle ICMP6 messages which are typically sent to multicast addresses but
recieve unicast replies, by doing fallthrough lookups against the correct
multicast address.
- Clear up some mistaken assumptions in the PF code:
- Not all ICMP packets have an icmp_id, so simulate one based on other
data if we can, otherwise set it to 0.
- Don't modify the icmp id field in NAT unless it's echo
- Use the full range of possible id's when NATing icmp6 echoy

ok henning marco
testing matthieu todd


Revision tags: OPENBSD_4_5_BASE
# 1.3 18-Feb-2009 henning

bring back the NAT NOP fix, but this time right.
when we want to pretend pf_get_translation didn't do anything we must
get rid of _both_ state keys and reset all 4 sk pointers to NULL and
not leave one key behind and have all 4 pointers point to it - that must
fail. tested dhill sthen, david agrees, deraadt ok


# 1.2 12-Feb-2009 sthen

revert pf.c r1.629 (which moved to this file) which was causing
"panic: pool_do_get(pfstatekeypl): free list modified" discussed with many.

ok dlg


# 1.1 29-Jan-2009 pyr

Split the address selection from pools away from pf.c and put it in
pf_lb.c. This will ease the process of adding more selection types
without bloatening pf.c even more.

ok and a weird death threat, henning@
raised eyebrow, dlg@