History log of /freebsd-11-stable/sys/netipsec/ipsec_output.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 322966 28-Aug-2017 ae

MFC r322750:
Fix the regression introduced in r275710.

When a security policy should match TCP connection with specific ports,
the SYN+ACK segment send by syncache_respond() is considered as forwarded
packet, because at this moment TCP connection does not have PCB structure,
and ip_output() is called without inpcb pointer. In this case SPIDX filled
for SP lookup will not contain TCP ports and security policy will not
be found. This can lead to unencrypted SYN+ACK on the wire.

This patch restores the old behavior, when ports will not be filled only
for forwarded packets.

Reported by: Dewayne Geraghty <dewayne.geraghty at heuristicsystems.com.au>

MFC r322751:
Remove stale comments.


# 322741 21-Aug-2017 ae

MFC r321779:
Add inpcb pointer to struct ipsec_ctx_data and pass it to the pfil hook
from enc_hhook().

This should solve the problem when pf is used with if_enc(4) interface,
and outbound packet with existing PCB checked by pf, and this leads to
deadlock due to pf does its own PCB lookup and tries to take rlock when
wlock is already held.

Now we pass PCB pointer if it is known to the pfil hook, this helps to
avoid extra PCB lookup and thus rlock acquiring is not needed.
For inbound packets it is safe to pass NULL, because we do not held any
PCB locks yet.

PR: 220217
Sponsored by: Yandex LLC


# 319599 05-Jun-2017 ae

MFC r319118:
Disable IPsec debugging code by default when IPSEC_DEBUG kernel option
is not specified.

Due to the long call chain IPsec code can produce the kernel stack
exhaustion on the i386 architecture. The debugging code usually is not
used, but it requires a lot of stack space to keep buffers for strings
formatting. This patch conditionally defines macros to disable building
of IPsec debugging code.

IPsec currently has two sysctl variables to configure debug output:
* net.key.debug variable is used to enable debug output for PF_KEY
protocol. Such debug messages are produced by KEYDBG() macro and
usually they can be interesting for developers.
* net.inet.ipsec.debug variable is used to enable debug output for
DPRINTF() macro and ipseclog() function. DPRINTF() macro usually
is used for development debugging. ipseclog() function is used for
debugging by administrator.

The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers
declarations when IPSEC_DEBUG is not present in kernel config. This
reduces stack requirement for up to several hundreds of bytes.
The net.inet.ipsec.debug variable still can be used to enable ipseclog()
messages by administrator.

PR: 219476

MFC r319412:
Build kdebug_secreplay() function only when IPSEC_DEBUG is defined.
This should fix the build on sparc.

Approved by: re (kib)


# 319492 02-Jun-2017 ae

MFC r318734:
Fix possible double releasing for SA reference.

There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.

For inbound packets the direct call chain is the following:
IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() ->
-> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
-> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue().

The SA reference is held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did SA reference releasing in
xform_input_cb(). But when the crypto callback called directly, in case of
error (e.g. data authentification failed) the error handling in
ipsec_common_input() also did SA reference releasing.

To fix this, remove error handling from ipsec_common_input() and do it
in xform_input() before crypto_dispatch().

PR: 219356

MFC r318738:
Fix possible double releasing for SA and SP references.

There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.

For outbound packets the direct call chain is the following:
IPSEC_OUTPUT() method -> ipsec[46]_common_output() ->
-> ipsec[46]_perform_request() -> xform_output() ->
-> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
-> xform_output_cb() -> ipsec_process_done() -> ip[6]_output().

The SA and SP references are held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did references releasing in
xform_output_cb(). But when the crypto callback called directly, in case of
error the error handling code in ipsec[46]_perform_request() also did
references releasing.

To fix this, remove error handling from ipsec[46]_perform_request() and do it
in xform_output() before crypto_dispatch().

Approved by: re (kib)


# 315514 18-Mar-2017 ae

MFC r304572 (by bz):
Remove the kernel optoion for IPSEC_FILTERTUNNEL, which was deprecated
more than 7 years ago in favour of a sysctl in r192648.

MFC r305122:
Remove redundant sanity checks from ipsec[46]_common_input_cb().

This check already has been done in the each protocol callback.

MFC r309144,309174,309201 (by fabient):
IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets.

Since the previous algorithm, based on bit shifting, does not scale
with large replay windows, the algorithm used here is based on
RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting.
The replay window will be fast to be updated, but will cost as many bits
in RAM as its size.

The previous implementation did not provide a lock on the replay window,
which may lead to replay issues.

Obtained from: emeric.poupon@stormshield.eu
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D8468

MFC r309143,309146 (by fabient):
In a dual processor system (2*6 cores) during IPSec throughput tests,
we see a lot of contention on the arc4 lock, used to generate the IV
of the ESP output packets.

The idea of this patch is to split this mutex in order to reduce the
contention on this lock.

Update r309143 to prevent false sharing.

Reviewed by: delphij, markm, ache
Approved by: so
Obtained from: emeric.poupon@stormshield.eu
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D8130

MFC r313330:
Merge projects/ipsec into head/.

Small summary
-------------

o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
option IPSEC_SUPPORT added. It enables support for loading
and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
support was removed. Added TCP/UDP checksum handling for
inbound packets that were decapsulated by transport mode SAs.
setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
build as part of ipsec.ko module (or with IPSEC kernel).
It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
methods. The only one header file <netipsec/ipsec_support.h>
should be included to declare all the needed things to work
with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
- now all security associations stored in the single SPI namespace,
and all SAs MUST have unique SPI.
- several hash tables added to speed up lookups in SADB.
- SADB now uses rmlock to protect access, and concurrent threads
can do SA lookups in the same time.
- many PF_KEY message handlers were reworked to reflect changes
in SADB.
- SADB_UPDATE message was extended to support new PF_KEY headers:
SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
avoid locking protection for ipsecrequest. Now we support
only limited number (4) of bundled SAs, but they are supported
for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
check for full history of applied IPsec transforms.
o References counting rules for security policies and security
associations were changed. The proper SA locking added into xform
code.
o xform code was also changed. Now it is possible to unregister xforms.
tdb_xxx structures were changed and renamed to reflect changes in
SADB/SPDB, and changed rules for locking and refcounting.

Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9352

MFC r313331:
Add removed headers into the ObsoleteFiles.inc.

MFC r313561 (by glebius):
Move tcp_fields_to_net() static inline into tcp_var.h, just below its
friend tcp_fields_to_host(). There is third party code that also uses
this inline.

MFC r313697:
Remove IPsec related PCB code from SCTP.

The inpcb structure has inp_sp pointer that is initialized by
ipsec_init_pcbpolicy() function. This pointer keeps strorage for IPsec
security policies associated with a specific socket.
An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket
options to configure these security policies. Then ip[6]_output()
uses inpcb pointer to specify that an outgoing packet is associated
with some socket. And IPSEC_OUTPUT() method can use a security policy
stored in the inp_sp. For inbound packet the protocol-specific input
routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms
to inbound security policy configured in the inpcb.

SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends
packets. Thus IPSEC_OUTPUT() method does not consider such packets as
associated with some socket and can not apply security policies
from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY()
method is called from protocol-specific input routine, it can specify
inpcb pointer and associated with socket inbound policy will be
checked. But there are two problems:
1. Such check is asymmetric, becasue we can not apply security policy
from inpcb for outgoing packet.
2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and
access to inp_sp is protected. But for SCTP this is not correct,
becasue SCTP uses own locks to protect inpcb.

To fix these problems remove IPsec related PCB code from SCTP.
This imply that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options
will be not applicable to SCTP sockets. To be able correctly check
inbound security policies for SCTP, mark its protocol header with
the PR_LASTHDR flag.

Differential Revision: https://reviews.freebsd.org/D9538

MFC r313746:
Add missing check to fix the build with IPSEC_SUPPORT and without MAC.

MFC r313805:
Fix LINT build for powerpc.

Build kernel modules support only when both IPSEC and TCP_SIGNATURE
are not defined.

MFC r313922:
For translated packets do not adjust UDP checksum if it is zero.

In case when decrypted and decapsulated packet is an UDP datagram,
check that its checksum is not zero before doing incremental checksum
adjustment.

MFC r314339:
Document that the size of AH ICV for HMAC-SHA2-NNN should be half of
NNN bits as described in RFC4868.

PR: 215978

MFC r314812:
Introduce the concept of IPsec security policies scope.

Currently are defined three scopes: global, ifnet, and pcb.
Generic security policies that IKE daemon can add via PF_KEY interface
or an administrator creates with setkey(8) utility have GLOBAL scope.
Such policies can be applied by the kernel to outgoing packets and checked
agains inbound packets after IPsec processing.
Security policies created by if_ipsec(4) interfaces have IFNET scope.
Such policies are applied to packets that are passed through if_ipsec(4)
interface.
And security policies created by application using setsockopt()
IP_IPSEC_POLICY option have PCB scope. Such policies are applied to
packets related to specific socket. Currently there is no way to list
PCB policies via setkey(8) utility.

Modify setkey(8) and libipsec(3) to be able distinguish the scope of
security policies in the `setkey -DP` listing. Add two optional flags:
'-t' to list only policies related to virtual *tunneling* interfaces,
i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL
scope. By default policies from all scopes are listed.

To implement this PF_KEY's sadb_x_policy structure was modified.
sadb_x_policy_reserved field is used to pass the policy scope from the
kernel to userland. SADB_SPDDUMP message extended to support filtering
by scope: sadb_msg_satype field is used to specify bit mask of requested
scopes.

For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy
is used to pass if_ipsec's interface if_index to the userland. For GLOBAL
policies sadb_x_policy_priority is used only to manage order of security
policies in the SPDB. For IFNET policies it is not used, so it can be used
to keep if_index.

After this change the output of `setkey -DP` now looks like:
# setkey -DPt
0.0.0.0/0[any] 0.0.0.0/0[any] any
in ipsec
esp/tunnel/87.250.242.144-87.250.242.145/unique:145
spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0
refcnt=1
# setkey -DPg
::/0 ::/0 icmp6 135,0
out none
spid=5 seq=1 pid=872 scope=global
refcnt=1

Obtained from: Yandex LLC
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9805

PR: 212018
Relnotes: yes
Sponsored by: Yandex LLC


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 298995 03-May-2016 pfg

sys/net*: minor spelling fixes.

No functional change.


# 297014 18-Mar-2016 ae

Fix handling of net.inet.ipsec.dfbit=2 variable.
IP_DF macro is in host bytes order, but ip_off field is in network bytes
order. So, use htons() for correct check.


# 291292 25-Nov-2015 ae

Overhaul if_enc(4) and make it loadable in run-time.

Use hhook(9) framework to achieve ability of loading and unloading
if_enc(4) kernel module. INET and INET6 code on initialization registers
two helper hooks points in the kernel. if_enc(4) module uses these helper
hook points and registers its hooks. IPSEC code uses these hhook points
to call helper hooks implemented in if_enc(4).


# 288418 30-Sep-2015 ae

Take extra reference to security policy before calling crypto_dispatch().

Currently we perform crypto requests for IPSEC synchronous for most of
crypto providers (software, aesni) and only VIA padlock calls crypto
callback asynchronous. In synchronous mode it is possible, that security
policy will be removed during the processing crypto request. And crypto
callback will release the last reference to SP. Then upon return into
ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already
freed request. To prevent this we will take extra reference to SP.

PR: 201876
Sponsored by: Yandex LLC


# 286095 30-Jul-2015 eri

Correct IPSec SA statistic keeping

The IPsec SA statistic keeping is used even for decision making on expiry/rekeying SAs.
When there are multiple transformations being done the statistic keeping might be wrong.

This mostly impacts multiple encapsulations on IPsec since the usual scenario it is not noticed due to the code path not taken.

Differential Revision: https://reviews.freebsd.org/D3239
Reviewed by: ae, gnn
Approved by: gnn(mentor)


# 282139 28-Apr-2015 ae

Fix the comment. We will not do SPD lookup again, because
ip[6]_ipsec_output() will find PACKET_TAG_IPSEC_OUT_DONE mbuf tag.

Sponsored by: Yandex LLC


# 282132 28-Apr-2015 ae

Since PFIL can change mbuf pointer, we should update pointers after
calling ipsec_filter().

Sponsored by: Yandex LLC


# 282046 26-Apr-2015 ae

Fix possible use after free due to security policy deletion.

When we are passing mbuf to IPSec processing via ipsec[46]_process_packet(),
we hold one reference to security policy and release it just after return
from this function. But IPSec processing can be deffered and when we release
reference to security policy after ipsec[46]_process_packet(), user can
delete this security policy from SPDB. And when IPSec processing will be
done, xform's callback function will do access to already freed memory.

To fix this move KEY_FREESP() into callback function. Now IPSec code will
release reference to SP after processing will be finished.

Differential Revision: https://reviews.freebsd.org/D2324
No objections from: #network
Sponsored by: Yandex LLC


# 281695 18-Apr-2015 ae

Change ipsec_address() and ipsec_logsastr() functions to take two
additional arguments - buffer and size of this buffer.

ipsec_address() is used to convert sockaddr structure to presentation
format. The IPv6 part of this function returns pointer to the on-stack
buffer and at the moment when it will be used by caller, it becames
invalid. IPv4 version uses 4 static buffers and returns pointer to
new buffer each time when it called. But anyway it is still possible
to get corrupted data when several threads will use this function.

ipsec_logsastr() is used to format string about SA entry. It also
uses static buffer and has the same problem with concurrent threads.

To fix these problems add the buffer pointer and size of this
buffer to arguments. Now each caller will pass buffer and its size
to these functions. Also convert all places where these functions
are used (except disabled code).

And now ipsec_address() uses inet_ntop() function from libkern.

PR: 185996
Differential Revision: https://reviews.freebsd.org/D2321
Reviewed by: gnn
Sponsored by: Yandex LLC


# 281693 18-Apr-2015 ae

Fix handling of scoped IPv6 addresses in IPSec code.

* in ipsec_encap() embed scope zone ids into link-local addresses
in the new IPv6 header, this helps ip6_output() disambiguate the
scope;
* teach key_ismyaddr6() use in6_localip(). in6_localip() is less
strict than key_sockaddrcmp(). It doesn't compare all fileds of
struct sockaddr_in6, but it is faster and it should be safe,
because all SA's data was checked for correctness. Also, since
IPv6 link-local addresses in the &V_in6_ifaddrhead are stored in
kernel-internal form, we need to embed scope zone id from SA into
the address before calling in6_localip.
* in ipsec_common_input() take scope zone id embedded in the address
and use it to initialize sin6_scope_id, then use this sockaddr
structure to lookup SA, because we keep addresses in the SADB without
embedded scope zone id.

Differential Revision: https://reviews.freebsd.org/D2304
Reviewed by: gnn
Sponsored by: Yandex LLC


# 281692 18-Apr-2015 ae

Remove xform_ipip.c and code related to XF_IP4.

The only thing is used from this code is ipip_output() function, that does
IPIP encapsulation. Other parts of XF_IP4 code were removed in r275133.
Also it isn't possible to configure the use of XF_IP4, nor from userland
via setkey(8), nor from the kernel.

Simplify the ipip_output() function and rename it to ipsec_encap().
* move IP_DF handling from ipsec4_process_packet() into ipsec_encap();
* since ipsec_encap() called from ipsec[64]_process_packet(), it
is safe to assume that mbuf is contiguous at least to IP header
for used IP version. Remove all unneeded m_pullup(), m_copydata
and related checks.
* use V_ip_defttl and V_ip6_defhlim for outer headers;
* use V_ip4_ipsec_ecn and V_ip6_ipsec_ecn for outer headers;
* move all diagnostic messages to the ipsec_encap() callers;
* simplify handling of ipsec_encap() results: if it returns non zero
value, print diagnostic message and free mbuf.
* some style(9) fixes.

Differential Revision: https://reviews.freebsd.org/D2303
Reviewed by: glebius
Sponsored by: Yandex LLC


# 275708 11-Dec-2014 ae

Remove flags and tunalready arguments from ipsec4_process_packet()
and make its prototype similar to ipsec6_process_packet.
The flags argument isn't used here, tunalready is always zero.

Obtained from: Yandex LLC
Sponsored by: Yandex LLC


# 275392 02-Dec-2014 ae

Remove route chaching support from ipsec code. It isn't used for some time.
* remove sa_route_union declaration and route_cache member from struct secashead;
* remove key_sa_routechange() call from ICMP and ICMPv6 code;
* simplify ip_ipsec_mtu();
* remove #include <net/route.h>;

Sponsored by: Yandex LLC


# 274467 13-Nov-2014 ae

Count statistics for the specific address family.

MFC after: 1 week
Sponsored by: Yandex LLC


# 274454 12-Nov-2014 ae

ipsec6_process_packet is called before ip6_output fixes ip6_plen.
Update ip6_plen before bpf processing to be able see correct value.

MFC after: 1 week
Sponsored by: Yandex LLC


# 274434 12-Nov-2014 ae

Fix ips_out_nosa errors accounting.

MFC after: 1 week
Sponsored by: Yandex LLC


# 271862 19-Sep-2014 glebius

Mechanically convert to if_inc_counter().


# 268083 01-Jul-2014 zec

The assumption in ipsec4_process_packet() that the payload may be
only IPv4 is wrong, so check the IP version before mangling the
payload header.


# 266822 28-May-2014 bz

Use IPv4 statistics in ipsec4_process_packet() rather than the IPv6
version. This also unbreaks the NOINET6 builds after r266800.


# 266800 28-May-2014 vanhu

Fixed IPv4-in-IPv6 and IPv6-in-IPv4 IPsec tunnels.
For IPv6-in-IPv4, you may need to do the following command
on the tunnel interface if it is configured as IPv4 only:
ifconfig <interface> inet6 -ifdisabled

Code logic inspired from NetBSD.

PR: kern/169438
Submitted by: emeric.poupon@netasq.com
Reviewed by: fabient, ae
Obtained from: NETASQ


# 264520 16-Apr-2014 ae

Remove _IP_VHL* macros and related ifdefs.

MFC after: 1 week


# 257176 26-Oct-2013 glebius

The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 252028 20-Jun-2013 ae

Use corresponding macros to update statistics for AH, ESP, IPIP, IPCOMP,
PFKEY.

MFC after: 2 weeks


# 252026 20-Jun-2013 ae

Use IPSECSTAT_INC() and IPSEC6STAT_INC() macros for ipsec statistics
accounting.

MFC after: 2 weeks


# 249294 09-Apr-2013 ae

Use IP6STAT_INC/IP6STAT_DEC macros to update ip6 stats.

MFC after: 1 week


# 243882 05-Dec-2012 glebius

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


# 241919 22-Oct-2012 glebius

Couple of changes missed from r241913, which converted
IPv4 stack to network byte order.


# 240233 08-Sep-2012 glebius

Merge the projects/pf/head branch, that was worked on for last six months,
into head. The most significant achievements in the new code:

o Fine grained locking, thus much better performance.
o Fixes to many problems in pf, that were specific to FreeBSD port.

New code doesn't have that many ifdefs and much less OpenBSDisms, thus
is more attractive to our developers.

Those interested in details, can browse through SVN log of the
projects/pf/head branch. And for reference, here is exact list of
revisions merged:

r232043, r232044, r232062, r232148, r232149, r232150, r232298, r232330,
r232332, r232340, r232386, r232390, r232391, r232605, r232655, r232656,
r232661, r232662, r232663, r232664, r232673, r232691, r233309, r233782,
r233829, r233830, r233834, r233835, r233836, r233865, r233866, r233868,
r233873, r234056, r234096, r234100, r234108, r234175, r234187, r234223,
r234271, r234272, r234282, r234307, r234309, r234382, r234384, r234456,
r234486, r234606, r234640, r234641, r234642, r234644, r234651, r235505,
r235506, r235535, r235605, r235606, r235826, r235991, r235993, r236168,
r236173, r236179, r236180, r236181, r236186, r236223, r236227, r236230,
r236252, r236254, r236298, r236299, r236300, r236301, r236397, r236398,
r236399, r236499, r236512, r236513, r236525, r236526, r236545, r236548,
r236553, r236554, r236556, r236557, r236561, r236570, r236630, r236672,
r236673, r236679, r236706, r236710, r236718, r237154, r237155, r237169,
r237314, r237363, r237364, r237368, r237369, r237376, r237440, r237442,
r237751, r237783, r237784, r237785, r237788, r237791, r238421, r238522,
r238523, r238524, r238525, r239173, r239186, r239644, r239652, r239661,
r239773, r240125, r240130, r240131, r240136, r240186, r240196, r240212.

I'd like to thank people who participated in early testing:

Tested by: Florian Smeets <flo freebsd.org>
Tested by: Chekaluk Vitaly <artemrts ukr.net>
Tested by: Ben Wilber <ben desync.com>
Tested by: Ian FREISLICH <ianf cloudseed.co.za>


# 238700 22-Jul-2012 bz

Fix a bug introduced in r221129 that leads to a panic wen using bundled
SAs. For now allow same address family bundles. While discovered with
ESP and AH, which does not make a lot of sense, IPcomp could be a possible
problematic candidate.

PR: kern/164400
MFC after: 3 days


# 231852 17-Feb-2012 bz

Merge multi-FIB IPv6 support from projects/multi-fibv6/head/:

Extend the so far IPv4-only support for multiple routing tables (FIBs)
introduced in r178888 to IPv6 providing feature parity.

This includes an extended rtalloc(9) KPI for IPv6, the necessary
adjustments to the network stack, and user land support as in netstat.

Sponsored by: Cisco Systems, Inc.
Reviewed by: melifaro (basically)
MFC after: 10 days


# 223637 28-Jun-2011 bz

Update packet filter (pf) code to OpenBSD 4.5.

You need to update userland (world and ports) tools
to be in sync with the kernel.

Submitted by: mlaier
Submitted by: eri


# 221129 27-Apr-2011 bz

Make IPsec compile without INET adding appropriate #ifdef checks.

Unfold the IPSEC_COMMON_INPUT_CB() macro in xform_{ah,esp,ipcomp}.c
to not need three different versions depending on INET, INET6 or both.

Mark two places preparing for not yet supported functionality with IPv6.

Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days


# 220194 31-Mar-2011 fabient

Fix two SA refcount:
- AH does not release the SA like in ESP/IPCOMP when handling EAGAIN
- ipsec_process_done incorrectly release the SA.

Reviewed by: vanhu
MFC after: 1 week


# 214250 23-Oct-2010 bz

Make the IPsec SADB embedded route cache a union to be able to hold both the
legacy and IPv6 route destination address.
Previously in case of IPv6, there was a memory overwrite due to not enough
space for the IPv6 address.

PR: kern/122565
MFC After: 2 weeks


# 213837 14-Oct-2010 bz

Remove dead code:
assignment to a local variable not used anywhere after that.

MFC after: 3 days


# 213836 14-Oct-2010 bz

Style: make the asterisk go with the variable name, not the type.

MFC after: 3 days


# 196019 01-Aug-2009 rwatson

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# 195699 14-Jul-2009 rwatson

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# 194062 12-Jun-2009 vanhu

Added support for NAT-Traversal (RFC 3948) in IPsec stack.

Thanks to (no special order) Emmanuel Dreyfus (manu@netbsd.org), Larry
Baird (lab@gta.com), gnn, bz, and other FreeBSD devs, Julien Vanherzeele
(julien.vanherzeele@netasq.com, for years of bug reporting), the PFSense
team, and all people who used / tried the NAT-T patch for years and
reported bugs, patches, etc...

X-MFC: never

Reviewed by: bz
Approved by: gnn(mentor)
Obtained from: NETASQ


# 187936 30-Jan-2009 bz

Use NULL rather than 0 when comparing pointers.

MFC after: 2 weeks


# 185571 02-Dec-2008 bz

Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation


# 183550 02-Oct-2008 zec

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# 181803 17-Aug-2008 bz

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


# 181627 12-Aug-2008 vanhu

Increase statistic counters for enc0 interface when enabled
and processing IPSec traffic.

Approved by: gnn (mentor)
MFC after: 1 week


# 179290 24-May-2008 bz

In addition to the ipsec_osdep.h removal a week ago, now also eliminate
IPSEC_SPLASSERT_SOFTNET which has been 'unused' since FreeBSD 5.0.


# 177175 14-Mar-2008 bz

Correct IPsec behaviour with a 'use' level in SP but no SA available.
In that case return an continue processing the packet without IPsec.

PR: 121384
MFC after: 5 days
Reported by: Cyrus Rahman (crahman gmail.com)
Tested by: Cyrus Rahman (crahman gmail.com) [slightly older version]


# 174054 28-Nov-2007 bz

Add sysctls to if_enc(4) to control whether the firewalls or
bpf will see inner and outer headers or just inner or outer
headers for incoming and outgoing IPsec packets.

This is useful in bpf to not have over long lines for debugging
or selcting packets based on the inner headers.
It also properly defines the behavior of what the firewalls see.

Last but not least it gives you if_enc(4) for IPv6 as well.

[ As some auxiliary state was not available in the later
input path we save it in the tdbi. That way tcpdump can give a
consistent view of either of (authentic,confidential) for both
before and after states. ]

Discussed with: thompsa (2007-04-25, basic idea of unifying paths)
Reviewed by: thompsa, gnn


# 171497 19-Jul-2007 bz

Replace hard coded options by their defined PFIL_{IN,OUT} names.

Approved by: re (hrs)


# 171133 01-Jul-2007 gnn

Commit IPv6 support for FAST_IPSEC to the tree.
This commit includes only the kernel files, the rest of the files
will follow in a second commit.

Reviewed by: bz
Approved by: re
Supported by: Secure Computing


# 170123 29-May-2007 bz

In ipsec6_output_tunnel() make sure that the SA contents do not change.

The same would apply to ipsec6_output_trans() but there is a larger patch
around which already corrected that case. Do not interfere with that one.


# 170122 29-May-2007 bz

fix typo: s,applyed,applied,g


# 159965 26-Jun-2006 thompsa

Add a pseudo interface for packet filtering IPSec connections before or after
encryption. There are two functions, a bpf tap which has a basic header with
the SPI number which our current tcpdump knows how to display, and handoff to
pfil(9) for packet filtering.

Obtained from: OpenBSD
Based on: kern/94829
No objections: arch, net
MFC after: 1 month


# 151967 02-Nov-2005 andre

Retire MT_HEADER mbuf type and change its users to use MT_DATA.

Having an additional MT_HEADER mbuf type is superfluous and redundant
as nothing depends on it. It only adds a layer of confusion. The
distinction between header mbuf's and data mbuf's is solely done
through the m->m_flags M_PKTHDR flag.

Non-native code is not changed in this commit. For compatibility
MT_HEADER is mapped to MT_DATA.

Sponsored by: TCP/IP Optimization Fundraise 2005


# 124765 20-Jan-2004 sam

Fix ipip_output() to always set *mp to NULL on failure, even if 'm'
is NULL, otherwise ipsec4_process_packet() may try to m_freem() a
bad pointer.

In ipsec4_process_packet(), don't try to m_freem() 'm' twice; ipip_output()
already did it.

Obtained from: netbsd


# 120585 29-Sep-2003 sam

MFp4: portability work, general cleanup, locking fixes

change 38496
o add ipsec_osdep.h that holds os-specific definitions for portability
o s/KASSERT/IPSEC_ASSERT/ for portability
o s/SPLASSERT/IPSEC_SPLASSERT/ for portability
o remove function names from ASSERT strings since line#+file pinpints
the location
o use __func__ uniformly to reduce string storage
o convert some random #ifdef DIAGNOSTIC code to assertions
o remove some debuggging assertions no longer needed

change 38498
o replace numerous bogus panic's with equally bogus assertions
that at least go away on a production system

change 38502 + 38530
o change explicit mtx operations to #defines to simplify
future changes to a different lock type

change 38531
o hookup ipv4 ctlinput paths to a noop routine; we should be
handling path mtu changes at least
o correct potential null pointer deref in ipsec4_common_input_cb

chnage 38685
o fix locking for bundled SA's and for when key exchange is required

change 38770
o eliminate recursion on the SAHTREE lock

change 38804
o cleanup some types: long -> time_t
o remove refrence to dead #define

change 38805
o correct some types: long -> time_t
o add scan generation # to secpolicy to deal with locking issues

change 38806
o use LIST_FOREACH_SAFE instead of handrolled code
o change key_flush_spd to drop the sptree lock before purging
an entry to avoid lock recursion and to avoid holding the lock
over a long-running operation
o misc cleanups of tangled and twisty code

There is still much to do here but for now things look to be
working again.

Supported by: FreeBSD Foundation


# 119643 01-Sep-2003 sam

Locking and misc cleanups; most of which I've been running for >4 months:

o add locking
o strip irrelevant spl's
o split malloc types to better account for memory use
o remove unused IPSEC_NONBLOCK_ACQUIRE code
o remove dead code

Sponsored by: FreeBSD Foundation


# 117056 30-Jun-2003 sam

correct transfer statistics

Submitted by: Larry Baird <lab@gta.com>
MFC after: 1 day


# 113076 04-Apr-2003 des

ovbcopy -> bcopy


# 112758 28-Mar-2003 sam

add missing copyright notices

Noticed by: Robert Watson


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 108466 30-Dec-2002 sam

Correct mbuf packet header propagation. Previously, packet headers
were sometimes propagated using M_COPY_PKTHDR which actually did
something between a "move" and a "copy" operation. This is replaced
by M_MOVE_PKTHDR (which copies the pkthdr contents and "removes" it
from the source mbuf) and m_dup_pkthdr which copies the packet
header contents including any m_tag chain. This corrects numerous
problems whereby mbuf tags could be lost during packet manipulations.

These changes also introduce arguments to m_tag_copy and m_tag_copy_chain
to specify if the tag copy work should potentially block. This
introduces an incompatibility with openbsd which we may want to revisit.

Note that move/dup of packet headers does not handle target mbufs
that have a cluster bound to them. We may want to support this;
for now we watch for it with an assert.

Finally, M_COPYFLAGS was updated to include M_FIRSTFRAG|M_LASTFRAG.

Supported by: Vernier Networks
Reviewed by: Robert Watson <rwatson@FreeBSD.org>


# 105197 16-Oct-2002 sam

"Fast IPsec": this is an experimental IPsec implementation that is derived
from the KAME IPsec implementation, but with heavy borrowing and influence
of openbsd. A key feature of this implementation is that it uses the kernel
crypto framework to do all crypto work so when h/w crypto support is present
IPsec operation is automatically accelerated. Otherwise the protocol
implementations are rather differet while the SADB and policy management
code is very similar to KAME (for the moment).

Note that this implementation is enabled with a FAST_IPSEC option. With this
you get all protocols; i.e. there is no FAST_IPSEC_ESP option.

FAST_IPSEC and IPSEC are mutually exclusive; you cannot build both into a
single system.

This software is well tested with IPv4 but should be considered very
experimental (i.e. do not deploy in production environments). This software
does NOT currently support IPv6. In fact do not configure FAST_IPSEC and
INET6 in the same system.

Obtained from: KAME + openbsd
Supported by: Vernier Networks