                        Implementation Note

                        KAME Project
                        http://www.kame.net/
        $KAME: IMPLEMENTATION,v 1.216 2001/05/25 07:43:01 jinmei Exp $
        $FreeBSD$

NOTE: This document tries to describe the behaviors and implementation
choices of the latest KAME/*BSD stack.  The description here may not be
applicable to KAME-integrated *BSD releases, as there is a certain amount
of change between them.  Still, some of the content can be useful for
KAME-integrated *BSD releases.

Table of Contents

    1. IPv6
        1.1 Conformance
        1.2 Neighbor Discovery
        1.3 Scope Zone Index
            1.3.1 Kernel internal
            1.3.2 Interaction with API
            1.3.3 Interaction with users (command line)
        1.4 Plug and Play
            1.4.1 Assignment of link-local, and special addresses
            1.4.2 Stateless address autoconfiguration on hosts
            1.4.3 DHCPv6
        1.5 Generic tunnel interface
        1.6 Address Selection
            1.6.1 Source Address Selection
            1.6.2 Destination Address Ordering
        1.7 Jumbo Payload
        1.8 Loop prevention in header processing
        1.9 ICMPv6
        1.10 Applications
        1.11 Kernel Internals
        1.12 IPv4 mapped address and IPv6 wildcard socket
            1.12.1 KAME/BSDI3 and KAME/FreeBSD228
            1.12.2 KAME/FreeBSD[34]x
                1.12.2.1 KAME/FreeBSD[34]x, listening side
                1.12.2.2 KAME/FreeBSD[34]x, initiating side
            1.12.3 KAME/NetBSD
                1.12.3.1 KAME/NetBSD, listening side
                1.12.3.2 KAME/NetBSD, initiating side
            1.12.4 KAME/BSDI4
                1.12.4.1 KAME/BSDI4, listening side
                1.12.4.2 KAME/BSDI4, initiating side
            1.12.5 KAME/OpenBSD
                1.12.5.1 KAME/OpenBSD, listening side
                1.12.5.2 KAME/OpenBSD, initiating side
            1.12.6 More issues
            1.12.7 Interaction with SIIT translator
        1.13 sockaddr_storage
        1.14 Invalid addresses on the wire
        1.15 Node's required addresses
            1.15.1 Host case
            1.15.2 Router case
        1.16 Advanced API
        1.17 DNS resolver
    2. Network Drivers
        2.1 FreeBSD 2.2.x-RELEASE
        2.2 BSD/OS 3.x
        2.3 NetBSD
        2.4 FreeBSD 3.x-RELEASE
        2.5 FreeBSD 4.x-RELEASE
        2.6 OpenBSD 2.x
        2.7 BSD/OS 4.x
    3. Translator
        3.1 FAITH TCP relay translator
        3.2 IPv6-to-IPv4 header translator
    4. IPsec
        4.1 Policy Management
        4.2 Key Management
        4.3 AH and ESP handling
        4.4 IPComp handling
        4.5 Conformance to RFCs and IDs
        4.6 ECN consideration on IPsec tunnels
        4.7 Interoperability
        4.8 Operations with IPsec tunnel mode
            4.8.1 RFC2401 IPsec tunnel mode approach
            4.8.2 draft-touch-ipsec-vpn approach
    5. ALTQ
    6. Mobile IPv6
        6.1 KAME node as correspondent node
        6.2 KAME node as home agent/mobile node
        6.3 Old Mobile IPv6 code
    7. Coding style
    8. Policy on technology with intellectual property right restriction

1. IPv6

1.1 Conformance

The KAME kit conforms, or tries to conform, to the latest set of IPv6
specifications.  For future reference we list some of the relevant documents
below (NOTE: this is not a complete list - a complete one is too hard to
maintain).
For details, please refer to the specific chapters in this document, the
RFCs, the manpages that come with KAME, or the comments in the source code.

Conformance tests have been performed on past and latest KAME STABLE kits
at the TAHI project.  Results can be viewed at http://www.tahi.org/report/KAME/.
We also attended Univ. of New Hampshire IOL tests (http://www.iol.unh.edu/)
in the past, with our past snapshots.

RFC1639: FTP Operation Over Big Address Records (FOOBAR)
    * RFC2428 is preferred over RFC1639.  ftp clients will first try RFC2428,
      then RFC1639 if that fails.
RFC1886: DNS Extensions to support IPv6
RFC1933: (see RFC2893)
RFC1981: Path MTU Discovery for IPv6
RFC2080: RIPng for IPv6
    * KAME-supplied route6d, bgpd and hroute6d support this.
RFC2283: Multiprotocol Extensions for BGP-4
    * so-called "BGP4+".
    * KAME-supplied bgpd supports this.
RFC2292: Advanced Sockets API for IPv6
    * see RFC3542
RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM)
    * RFC2362 defines the packet formats and the protocol of PIM-SM.
RFC2373: IPv6 Addressing Architecture
    * KAME supports node required addresses, and conforms to the scope
      requirement.
RFC2374: An IPv6 Aggregatable Global Unicast Address Format
    * KAME supports 64-bit length of Interface ID.
RFC2375: IPv6 Multicast Address Assignments
    * Userland applications use the well-known addresses assigned in the RFC.
RFC2428: FTP Extensions for IPv6 and NATs
    * RFC2428 is preferred over RFC1639.  ftp clients will first try RFC2428,
      then RFC1639 if that fails.
RFC2460: IPv6 specification
RFC2461: Neighbor discovery for IPv6
    * See 1.2 in this document for details.
RFC2462: IPv6 Stateless Address Autoconfiguration
    * See 1.4 in this document for details.
RFC2463: ICMPv6 for IPv6 specification
    * See 1.9 in this document for details.
RFC2464: Transmission of IPv6 Packets over Ethernet Networks
RFC2465: MIB for IPv6: Textual Conventions and General Group
    * Necessary statistics are gathered by the kernel.  Actual IPv6 MIB
      support is provided as a patchkit for ucd-snmp.
RFC2466: MIB for IPv6: ICMPv6 group
    * Necessary statistics are gathered by the kernel.  Actual IPv6 MIB
      support is provided as a patchkit for ucd-snmp.
RFC2467: Transmission of IPv6 Packets over FDDI Networks
RFC2472: IPv6 over PPP
RFC2492: IPv6 over ATM Networks
    * only PVC is supported.
RFC2497: Transmission of IPv6 Packets over ARCnet Networks
RFC2545: Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing
RFC2553: (see RFC3493)
RFC2671: Extension Mechanisms for DNS (EDNS0)
    * see USAGE for how to use it.
    * not supported on kame/freebsd4 and kame/bsdi4.
RFC2673: Binary Labels in the Domain Name System
    * KAME/bsdi4 supports A6, DNAME and binary label to some extent.
    * KAME apps/bind8 repository has resolver library with partial A6, DNAME
      and binary label support.
RFC2675: IPv6 Jumbograms
    * See 1.7 in this document for details.
RFC2710: Multicast Listener Discovery for IPv6
RFC2711: IPv6 router alert option
RFC2732: Format for Literal IPv6 Addresses in URL's
    * The spec is implemented in programs that handle URLs
      (like freebsd ftpio(3) and fetch(1), or netbsd ftp(1))
RFC2874: DNS Extensions to Support IPv6 Address Aggregation and Renumbering
    * KAME/bsdi4 supports A6, DNAME and binary label to some extent.
    * KAME apps/bind8 repository has resolver library with partial A6, DNAME
      and binary label support.
RFC2893: Transition Mechanisms for IPv6 Hosts and Routers
    * IPv4 compatible address is not supported.
    * automatic tunneling (4.3) is not supported.
    * "gif" interface implements IPv[46]-over-IPv[46] tunnel in a generic way,
      and it covers "configured tunnel" described in the spec.
      See 1.5 in this document for details.
RFC2894: Router renumbering for IPv6
RFC3041: Privacy Extensions for Stateless Address Autoconfiguration in IPv6
RFC3056: Connection of IPv6 Domains via IPv4 Clouds
    * So-called "6to4".
    * "stf" interface implements it.  Be sure to read
      draft-itojun-ipv6-transition-abuse-01.txt
      below before configuring it, as there can be security issues.
RFC3142: An IPv6-to-IPv4 transport relay translator
    * FAITH tcp relay translator (faithd) implements this.  See 3.1 for more
      details.
RFC3152: Delegation of IP6.ARPA
    * libinet6 resolvers contained in the KAME snaps support the use of
      the ip6.arpa domain (with the nibble format) for IPv6 reverse
      lookups.
RFC3484: Default Address Selection for IPv6
    * the selection algorithm for both source and destination addresses
      is implemented based on the RFC, though some rules are still omitted.
RFC3493: Basic Socket Interface Extensions for IPv6
    * IPv4 mapped address (3.7) and special behavior of IPv6 wildcard bind
      socket (3.8) are,
      - supported and turned on by default on KAME/FreeBSD[34]
        and KAME/BSDI4,
      - supported but turned off by default on KAME/NetBSD and KAME/FreeBSD5,
      - not supported on KAME/FreeBSD228, KAME/OpenBSD and KAME/BSDI3.
      see 1.12 in this document for details.
    * The AI_ALL and AI_V4MAPPED flags are not supported.
RFC3542: Advanced Sockets API for IPv6 (revised)
    * For supported library functions/kernel APIs, see sys/netinet6/ADVAPI.
    * Some of the updates in the draft are not implemented yet.  See
      TODO.2292bis for more details.
RFC4007: IPv6 Scoped Address Architecture
    * some part of the documentation (especially about the routing
      model) is not supported yet.
    * zone indices that contain scope types have not been supported yet.

draft-ietf-ipngwg-icmp-name-lookups-09: IPv6 Name Lookups Through ICMP
draft-ietf-ipv6-router-selection-07.txt:
    Default Router Preferences and More-Specific Routes
    * router-side: both router preference and specific routes are supported.
    * host-side: only router preference is supported.
draft-ietf-pim-sm-v2-new-02.txt
    A revised version of RFC2362, which includes the IPv6 specific
    packet format and protocol descriptions.
draft-ietf-dnsext-mdns-00.txt: Multicast DNS
    * kame/mdnsd has a test implementation, which will not be built in
      default compilation.  The draft will experience a major change in the
      near future, so don't rely upon it.
draft-ietf-ipngwg-icmp-v3-02.txt: ICMPv6 for IPv6 specification (revised)
    * See 1.9 in this document for details.
draft-itojun-ipv6-tcp-to-anycast-01.txt:
    Disconnecting TCP connection toward IPv6 anycast address
draft-ietf-ipv6-rfc2462bis-06.txt: IPv6 Stateless Address
    Autoconfiguration (revised)
draft-itojun-ipv6-transition-abuse-01.txt:
    Possible abuse against IPv6 transition technologies (expired)
    * KAME does not implement RFC1933/2893 automatic tunnel.
    * "stf" interface implements some address filters.  Refer to stf(4)
      for details.  Since there's no way to make 6to4 interface 100% secure,
      we do not include "stf" interface into GENERIC.v6 compilation.
    * kame/openbsd completely disables IPv4 mapped address support.
    * kame/netbsd makes IPv4 mapped address support off by default.
    * See section 1.12.6 and 1.14 for more details.
draft-itojun-ipv6-flowlabel-api-01.txt: Socket API for IPv6 flow label field
    * no consideration is made against the use of routing headers and such.

1.2 Neighbor Discovery

Our implementation of Neighbor Discovery is fairly stable.  Currently
Address Resolution, Duplicate Address Detection, and Neighbor
Unreachability Detection are supported.  In the near future we will be
adding an Unsolicited Neighbor Advertisement transmission command as
an administration tool.

Duplicate Address Detection (DAD) will be performed when an IPv6 address
is assigned to a network interface, or when the network interface is enabled
(ifconfig up).  It is documented in RFC2462 5.4.
If DAD fails, the address will be marked "duplicated" and a message will be
generated to syslog (and usually to the console).  The "duplicated" mark
can be checked with ifconfig.  It is the administrator's responsibility to
check for and recover from DAD failures.  We may try to improve failure
recovery in future KAME code.

A successor version of RFC2462 (called rfc2462bis) clarifies the
behavior when DAD fails (i.e., a duplicate is detected): if the
duplicate address is a link-local address formed from an interface
identifier based on the hardware address which is supposed to be
uniquely assigned (e.g., EUI-64 for an Ethernet interface), IPv6
operation on the interface should be disabled.  The KAME
implementation supports this as follows: if this type of duplicate is
detected, the kernel marks "disabled" in the ND specific data
structure for the interface.  Every IPv6 I/O operation in the kernel
checks this mark, and the kernel will drop packets received on or
being sent to the "disabled" interface.  Whether the IPv6 operation is
disabled or not can be confirmed by the ndp(8) command.  See the man
page for more details.

DAD procedure may not be effective on certain network interfaces/drivers.
If a network driver needs a long initialization time (this situation is
common with wireless network interfaces), and the driver mistakenly raises
IFF_RUNNING before the driver becomes ready, the DAD code will try to
transmit DAD probes to the not-really-ready network driver and the packets
will not go out from the interface.  In such cases, network drivers should
be corrected.

Some network drivers loop multicast packets back to themselves,
even if instructed not to do so (especially in promiscuous mode).  In
such cases DAD may fail, because the DAD engine sees an inbound NS packet
(actually from the node itself) and considers it as a sign of a
duplicate.  In this case, drivers should be corrected to honor
IFF_SIMPLEX behavior.  For example, you may need to check the source MAC
address on an inbound packet, and reject it if it is from the node
itself.

The Neighbor Discovery specification (RFC2461) does not talk about neighbor
cache handling in the following cases:
(1) when there was no neighbor cache entry, node received unsolicited
    RS/NS/NA/redirect packet without link-layer address
(2) neighbor cache handling on medium without link-layer address
    (we need a neighbor cache entry for IsRouter bit)
For (1), we implemented a workaround based on discussions on the IETF ipngwg
mailing list.  For more details, see the comments in the source code and the
email thread started from (IPng 7155), dated Feb 6 1999.

The IPv6 on-link determination rule (RFC2461) is quite different from
assumptions in BSD IPv4 network code.
To implement the behavior in RFC2461 section 6.3.6 (3), the kernel needs
to know the default outgoing interface.  To configure the default outgoing
interface, use commands like "ndp -I de0" as root.  Then the kernel will
have a "default" route to the interface with the cloning "C" bit being on.
This default route will cause a neighbor cache entry to be created for
every destination that does not match an explicit route entry.

Note that we intentionally disable configuring the default interface
by default.  This is because we found it sometimes caused inconvenient
situations while it was rarely useful in practical usage.  For example,
consider a destination that has both IPv4 and IPv6 addresses but is
only reachable via IPv4.  Since our getaddrinfo(3) prefers IPv6 by
default, a (TCP) application using the library with PF_UNSPEC first
tries to connect to the IPv6 address.  If we turn on RFC 2461 6.3.6
(3), we have to wait for quite a long period before the first attempt
to make a connection fails.  If we turn it off, the first attempt will
immediately fail with EHOSTUNREACH, and then the application can try
the next, reachable address.

The notion of the default interface is also disabled when the node is
acting as a router.  The reason is that routers tend to control all
routes stored in the kernel and the default route automatically
installed would rather confuse the routers.  Note that the spec misuses
the words "host" and "node" in several places in Section 5.2 of RFC
2461.  We basically read the word "node" in this section as "host,"
and thus believe the implementation policy does not break the
specification.
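
The application-side behavior assumed in the getaddrinfo(3) example above
(try each returned address in turn, moving on when one fails) can be
sketched as follows.  This is a minimal userland sketch, not code from the
KAME tree: connect_first_reachable() is a hypothetical helper name, and the
order in which addresses are tried is simply whatever getaddrinfo(3)
returns on the system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of KAME): try every address returned by
 * getaddrinfo(3) until one connect(2) succeeds.
 * Returns a connected socket, or -1 if no address could be reached.
 */
static int
connect_first_reachable(const char *host, unsigned int port)
{
	struct addrinfo hints, *res, *ai;
	char serv[16];
	int s = -1;

	assert(host != NULL);
	snprintf(serv, sizeof(serv), "%u", port);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = PF_UNSPEC;	/* ask for IPv6 and IPv4 candidates */
	hints.ai_socktype = SOCK_STREAM;
	if (getaddrinfo(host, serv, &hints, &res) != 0)
		return (-1);

	for (ai = res; ai != NULL; ai = ai->ai_next) {
		s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (s < 0)
			continue;
		if (connect(s, ai->ai_addr, ai->ai_addrlen) == 0)
			break;		/* connected, stop here */
		/* e.g. EHOSTUNREACH: drop this candidate, try the next */
		close(s);
		s = -1;
	}
	freeaddrinfo(res);
	return (s);
}
```

With the default interface disabled, an unreachable IPv6 candidate makes
connect(2) fail quickly, so a loop like this reaches the IPv4 address
without a long wait.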

To avoid possible DoS attacks and infinite loops, the KAME stack will accept
only 10 options on an ND packet.  Therefore, if you have 20 prefix options
attached to an RA, only the first 10 prefixes will be recognized.
If this troubles you, please contact the KAME team and/or modify
nd6_maxndopt in sys/netinet6/nd6.c.  If there is high demand we may
provide a sysctl knob for the variable.

Proxy Neighbor Advertisement support is implemented in the kernel.
For instance, you can configure it by using the following command:
        # ndp -s fe80::1234%ne0 0:1:2:3:4:5 proxy
where ne0 is the interface which attaches to the same link as the
proxy target.
There are certain limitations, though:
- It does not send unsolicited multicast NA on configuration.  This is MAY
  behavior in RFC2461.
- It does not add random delay before transmission of solicited NA.  This is
  SHOULD behavior in RFC2461.
- We cannot configure proxy NDP for an off-link address.  The target address
  for proxying must be a link-local address, or must be in prefixes
  configured to the node which does proxy NDP.
- RFC2461 is unclear about whether it is legal for a host to perform proxy ND.
  We do not prohibit hosts from doing proxy ND, but there will be very limited
  use in it.

Starting mid March 2000, we support Neighbor Unreachability Detection
(NUD) on p2p interfaces, including tunnel interfaces (gif).  NUD is
turned on by default.  Before March 2000 the KAME stack did not
perform NUD on p2p interfaces.  If the change raises any
interoperability issues, you can turn NUD off/on on a per-interface
basis.
Use "ndp -i interface -nud" to turn it off.  Consult ndp(8)
for details.

RFC2461 specifies an upper-layer reachability confirmation hint.  Whenever
an upper-layer reachability confirmation hint comes, the ND process can use
it to optimize the neighbor discovery process - the ND process can omit the
real ND exchange and keep the neighbor cache state in REACHABLE.
We currently have two sources for hints: (1) setsockopt(IPV6_REACHCONF)
defined by the RFC3542 API, and (2) hints from tcp(6)_input.

It is questionable if they are really trustworthy.  For example, a
rogue userland program can use IPV6_REACHCONF to confuse the ND
process.  The neighbor cache is a system-wide information pool, and it is
bad to allow a single process to affect others.  Also, tcp(6)_input
can be hosed by hijack attempts.  It is wrong to allow hijack attempts
to affect the ND process.

Starting June 2000, the ND code has a protection mechanism against
incorrect upper-layer reachability confirmation.  The ND code counts
subsequent upper-layer hints.  If the number of hints reaches the
maximum, the ND code will ignore further upper-layer hints and run
the real ND process to confirm reachability to the peer.  sysctl
net.inet6.icmp6.nd6_maxnudhint defines the maximum number of subsequent
upper-layer hints to be accepted.
(From April 2000 to June 2000, we rejected setsockopt(IPV6_REACHCONF) from
non-root processes.  After a local discussion, it looks like hints are not
that trustworthy even if they are from privileged processes.)

If inbound ND packets carry invalid values, the KAME kernel will
drop these packets and increment statistics variables.
See the icmp6 section of "netstat -sn".  For a detailed debugging session,
you can turn on syslog output from the kernel on errors, by turning on the
sysctl MIB net.inet6.icmp6.nd6_debug.  nd6_debug can be turned on at
bootstrap time, by defining the ND6_DEBUG kernel compilation option (so you
can debug behavior during bootstrap).  nd6_debug configuration should
only be used for test/debug purposes - for a production environment,
nd6_debug must be set to 0.  If you leave it at 1, malicious parties
can inject broken packets and fill up the /var/log partition.

1.3 Scope Zone Index

IPv6 uses scoped addresses.  It is therefore very important to
specify the scope zone index (link index for a link-local address, or
site index for a site-local address) with an IPv6 address.  Without a
zone index, a scoped IPv6 address is ambiguous to the kernel, and
the kernel would not be able to determine the outbound zone for a
packet to the scoped address.  KAME code tries to address the issue in
several ways.

The entire architecture of scoped addresses is documented in RFC4007.
One non-trivial point of the architecture is that the link scope is
(theoretically) larger than the interface scope.  That is, two
different interfaces can belong to the same single link.  However, in
normal operation, we can assume that there is a 1-to-1 relationship
between links and interfaces.  In other words, we can usually put
links and interfaces in the same scope type.  The current KAME
implementation assumes the 1-to-1 relationship.  In particular, we use
interface names such as "ne1" as unique link identifiers.
This is much more human-readable and intuitive than numeric identifiers,
but please keep in mind the theoretical difference between links
and interfaces.

Site-local addresses are very vaguely defined in the specs, and both
the specification and the KAME code need tons of improvements to
enable their actual use.  For example, it is still very unclear how we
define a site, or how we resolve host names in a site.  There is work
underway to define the behavior of routers at site borders, but we have
almost no code for site boundary node support (neither forwarding nor
routing) and we bet almost no one has.  We recommend, at this moment,
that you use global addresses for experiments - there are way too many
pitfalls if you use site-local addresses.

1.3.1 Kernel internal

In the kernel, the link index for a link-local scope address is
embedded into the 2nd 16bit-word (the 3rd and 4th bytes) in the IPv6
address.
For example, you may see something like:
        fe80:1::200:f8ff:fe01:6317
in the routing table and the interface address structure (struct
in6_ifaddr).  The address above is a link-local unicast address which
belongs to a network link whose link identifier is 1 (note that it
equals the interface index by the assumption of our
implementation).  The embedded index enables us to identify IPv6
link-local addresses over multiple links effectively and with only a
little code change.

The use of the internal format must be limited inside the kernel.
In particular, addresses sent by an application should not contain the
embedded index (except via some very special APIs such as routing
sockets).  Instead, the index should be specified in the sin6_scope_id
field of a sockaddr_in6 structure.  Obviously, packets sent to or
received from the wire must not contain the embedded index either, since
the index is meaningful only within the sending/receiving node.

In order to deal with the differences, several kernel routines are
provided.  These are available by including <netinet6/scope6_var.h>.
Typically, the following functions will be most generally used:

- int sa6_embedscope(struct sockaddr_in6 *sa6, int defaultok);
  Embed sa6->sin6_scope_id into sa6->sin6_addr.  If sin6_scope_id is
  0, defaultok is non-0, and the default zone ID (see RFC4007) is
  configured, the default ID will be used instead of the value of the
  sin6_scope_id field.  On success, sa6->sin6_scope_id will be reset
  to 0.

  This function returns 0 on success, or a non-0 error code otherwise.

- int sa6_recoverscope(struct sockaddr_in6 *sa6);
  Extract embedded zone ID in sa6->sin6_addr and set
  sa6->sin6_scope_id to that ID.  The embedded ID will be cleared with
  0.

  This function returns 0 on success, or a non-0 error code otherwise.

- int in6_clearscope(struct in6_addr *in6);
  Reset the embedded zone ID in 'in6' to 0.  This function never fails, and
  returns 0 if the original address is intact or non 0 if the address is
  modified.
  The return value doesn't matter in most cases; currently, the
  only point where we care about the return value is ip6_input(), for
  checking whether the source or destination address of the incoming packet
  is in the embedded form.

- int in6_setscope(struct in6_addr *in6, struct ifnet *ifp,
                   u_int32_t *zoneidp);
  Embed zone ID determined by the address scope type for 'in6' and the
  interface 'ifp' into 'in6'.  If zoneidp is non NULL, *zoneidp will
  also have the zone ID.

  This function returns 0 on success, or a non-0 error code otherwise.

The typical usage of these functions is as follows:

sa6_embedscope() will be used at the socket or transport layer to
convert a sockaddr_in6 structure passed by an application into the
kernel-internal form.  In this usage, the second argument is often the
'ip6_use_defzone' global variable.

sa6_recoverscope() will also be used at the socket or transport layer
to convert an in6_addr structure with the embedded zone ID into a
sockaddr_in6 structure with the corresponding ID in the sin6_scope_id
field (and without the embedded ID in sin6_addr).

in6_clearscope() will be used just before sending a packet to the wire
to remove the embedded ID.  In general, this must be done at the last
stage of an output path, since otherwise the address would lose the ID
and could be ambiguous with regard to scope.

in6_setscope() will be used when the kernel receives a packet from the
wire to construct the kernel internal form for each address field in
the packet (typical examples are the source and destination addresses
of the packet).  In the typical usage, the third argument 'zoneidp'
will be NULL.  A non-NULL value will be used when the validity of the
zone ID must be checked, e.g., when forwarding a packet to another
link (see ip6_forward() for this usage).

An application, when sending a packet, is basically assumed to specify
the appropriate scope zone of the destination address by the
sin6_scope_id field (this might be done transparently from the
application with getaddrinfo() and the extended textual format - see
below), or at least the default scope zone(s) must be configured as a
last resort.  In some cases, however, an application could specify an
ambiguous address with regard to scope, expecting it to be disambiguated
in the kernel by some other means.  A typical usage is to specify the
outgoing interface through another API, which can disambiguate the
unspecified scope zone.  Such a usage is not recommended, but the
kernel implements some trick to deal with even this case.

A rough sketch of the trick can be summarized as the following
sequence.

        sa6_embedscope(dst, ip6_use_defzone);
        in6_selectsrc(dst, ..., &ifp, ...);
        in6_setscope(&dst->sin6_addr, ifp, NULL);

sa6_embedscope() first tries to convert sin6_scope_id (or the default
zone ID) into the kernel-internal form.
This can fail with an ambiguous destination, but the kernel still tries to
get the outgoing interface (ifp) in the attempt of determining the source
address of the outgoing packet using in6_selectsrc().  If the interface is
detected, and the scope zone was originally ambiguous, in6_setscope()
can finally determine the appropriate ID with the address itself and
the interface, and construct the kernel-internal form.  See, for
example, the comments in udp6_output() for a more concrete example.

In any case, kernel routines except ones in netinet6/scope6.c MUST NOT
directly refer to the embedded form.  They MUST use the above
interface functions.  In particular, kernel routines MUST NOT have the
following code fragment:

        /* This is a bad practice.  Don't do this */
        if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
                sin6->sin6_addr.s6_addr16[1] = htons(ifp->if_index);

This is bad for several reasons.  First, address ambiguity is not
specific to link-local addresses (any non-global multicast addresses
are inherently ambiguous, and this is particularly true for
interface-local addresses).  Secondly, this is vulnerable to future
changes of the embedded form (the embedded position may change, or the
zone ID may not actually be the interface index).  Only scope6.c
routines should know the details.

The above code fragment should thus actually be as follows:

        /* This is correct. */
        in6_setscope(&sin6->sin6_addr, ifp, NULL);
        (and catch errors if possible and necessary)

1.3.2 Interaction with API

There are several candidate APIs to deal with scoped addresses
without ambiguity.

The IPV6_PKTINFO ancillary data type or socket option defined in the
advanced API (RFC2292 or RFC3542) can specify
the outgoing interface of a packet.  Similarly, the IPV6_PKTINFO or
IPV6_RECVPKTINFO socket options tell the kernel to pass the incoming
interface to user applications.

These options are enough to disambiguate scoped addresses of an
incoming packet, because we can uniquely identify the corresponding
zone of the scoped address(es) by the incoming interface.  However,
they are too strong for outgoing packets.  For example, consider a
multi-sited node and suppose that more than one interface of the node
belongs to the same site.  When we want to send a packet to the site,
we can only specify one of the interfaces for the outgoing packet with
these options; we cannot just say "send the packet to (one of the
interfaces of) the site."

Another kind of candidate is to use the sin6_scope_id member in the
sockaddr_in6 structure, defined in RFC2553.  The KAME kernel
interprets the sin6_scope_id field properly in order to disambiguate scoped
addresses.  For example, if an application passes a sockaddr_in6
structure that has a non-zero sin6_scope_id value to the sendto(2)
system call, the kernel should send the packet to the appropriate zone
according to the sin6_scope_id field.
Similarly, when the source or
the destination address of an incoming packet is a scoped one, the
kernel should detect the correct zone identifier based on the address
and the receiving interface, fill the identifier in the sin6_scope_id
field of a sockaddr_in6 structure, and then pass the packet to an
application via the recvfrom(2) system call, etc.

However, the semantics of the sin6_scope_id is still vague and on the
way to standardization. Additionally, not so many operating systems
support the behavior above at this moment.

In summary,
- If your target system is limited to KAME based ones (i.e. BSD
  variants and KAME snaps), use the sin6_scope_id field assuming the
  kernel behavior described above.
- Otherwise, (i.e. if your program should be portable on other systems
  than BSDs)
  + Use the advanced API to disambiguate scoped addresses of incoming
    packets.
  + To disambiguate scoped addresses of outgoing packets,
    * if it is okay to just specify the outgoing interface, use the
      advanced API. This would be the case, for example, when you
      should only consider link-local addresses and your system
      assumes 1-to-1 relationship between links and interfaces.
    * otherwise, sorry but you lose. Please rush the IETF IPv6
      community into standardizing the semantics of the sin6_scope_id
      field.

Routing daemons and configuration programs, like route6d and ifconfig,
will need to manipulate the "embedded" zone index. These programs use
routing sockets and ioctls (like SIOCGIFADDR_IN6) and the kernel API
will return IPv6 addresses with the 2nd 16bit-word filled in.
The APIs are for manipulating kernel internal structure. Programs that
use these APIs have to be prepared for differences in kernels
anyway.

getaddrinfo(3) and getnameinfo(3) support an extended numeric IPv6
syntax, as documented in RFC4007. You can specify the outgoing link
by using the name of the outgoing interface as the link, like
"fe80::1%ne0" (again, note that we assume there is a 1-to-1
relationship between links and interfaces). This way you will be able
to specify a link-local scoped address without much trouble.

Other APIs like inet_pton(3) and inet_ntop(3) are inherently
unfriendly with scoped addresses, since they are unable to annotate
addresses with a zone identifier.

1.3.3 Interaction with users (command line)

Most user applications now support the extended numeric IPv6
syntax. In this case, you can specify the outgoing link by using the
name of the outgoing interface like "fe80::1%ne0" (sorry for the
duplicated notice, but please recall again that we assume a 1-to-1
relationship between links and interfaces). This is even the case for
some management tools such as route(8) or ndp(8). For example, to
install the IPv6 default route by hand, you can type
        # route add -inet6 default fe80::9876:5432:1234:abcd%ne0
(Although we suggest that you run dynamic routing instead of static
routes, in order to avoid configuration mistakes.)

Some applications have command line options for specifying an
appropriate zone of a scoped address (like "ping6 -I ne0 ff02::1" to
specify the outgoing interface). However, you can't always expect such
options.
Additionally, specifying the outgoing "interface" is in
theory an overspecification as a way to specify the outgoing "link"
(see above). Thus, we recommend that you use the extended format
described above. This should apply to the case where the outgoing
interface is specified.

In any case, when you specify a scoped address on the command line,
NEVER write the embedded form (such as ff02:1::1 or fe80:2::fedc),
which should only be used inside the kernel (see Section 1.3.1), and
is not supposed to work.

1.4 Plug and Play

The KAME kit implements most of the IPv6 stateless address
autoconfiguration in the kernel.
Neighbor Discovery functions are implemented in the kernel as a whole.
Router Advertisement (RA) input for hosts is implemented in the
kernel. Router Solicitation (RS) output for endhosts, RS input
for routers, and RA output for routers are implemented in the
userland.

1.4.1 Assignment of link-local, and special addresses

An IPv6 link-local address is generated from the IEEE802 address
(ethernet MAC address). Each interface is assigned an IPv6 link-local
address automatically, when the interface becomes up (IFF_UP). Also, a
direct route for the link-local address is added to the routing table.
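
As a rough illustration of the first sentence above, the following
sketch derives a link-local address from a 48-bit IEEE802 address
using the modified EUI-64 rule commonly used on ethernet (insert ff:fe
in the middle and invert the universal/local bit). It is not taken
from the KAME sources: the function name mac_to_linklocal() is
hypothetical, and the result is the user-visible form of the address
(no embedded zone index).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not a KAME function): build fe80::/64 plus a
 * modified EUI-64 interface identifier from a 48-bit MAC address.
 */
static void
mac_to_linklocal(const uint8_t mac[6], struct in6_addr *out)
{
	uint8_t *p;

	assert(mac != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	p = out->s6_addr;

	/* fe80::/64 link-local prefix */
	p[0] = 0xfe;
	p[1] = 0x80;

	/* first half of the MAC, with the universal/local bit inverted */
	p[8] = mac[0] ^ 0x02;
	p[9] = mac[1];
	p[10] = mac[2];

	/* ff:fe filler, then the second half of the MAC */
	p[11] = 0xff;
	p[12] = 0xfe;
	p[13] = mac[3];
	p[14] = mac[4];
	p[15] = mac[5];
}
```

For example, the MAC address 00:11:22:33:44:55 yields
fe80::211:22ff:fe33:4455 under this rule.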

Here is an output of netstat command:

Internet6:
Destination                   Gateway                   Flags      Netif Expire
fe80::%ed0/64                 link#1                    UC           ed0
fe80::%ep0/64                 link#2                    UC           ep0

Interfaces that have no IEEE802 address (pseudo interfaces like tunnel
interfaces, or ppp interfaces) will borrow an IEEE802 address from other
interfaces, such as ethernet interfaces, whenever possible.
If there is no IEEE802 hardware attached, a last-resort pseudorandom
value, which is derived from MD5(hostname), will be used as the source
of the link-local address.
If it is not suitable for your usage, you will need to configure the
link-local address manually.

If an interface is not capable of handling IPv6 (such as lack of multicast
support), a link-local address will not be assigned to that interface.
See section 2 for details.

Each interface joins the solicited-node multicast address and the
link-local all-nodes multicast address (e.g. ff02::1:ff01:6317
and ff02::1, respectively, on the link the interface is attached to).
In addition to a link-local address, the loopback address (::1) will be
assigned to the loopback interface. Also, ::1/128 and ff01::/32 are
automatically added to the routing table, and the loopback interface
joins the node-local multicast group ff01::1.

1.4.2 Stateless address autoconfiguration on hosts

In the IPv6 specification, nodes are separated into two categories:
routers and hosts. Routers forward packets addressed to others; hosts do
not forward the packets. net.inet6.ip6.forwarding defines whether this
node is a router or a host (router if it is 1, host if it is 0).

It is NOT recommended to change net.inet6.ip6.forwarding while the node
is in operation. The IPv6 specification defines behavior for "host" and
"router" quite differently, and switching from one to another can cause
serious troubles. It is recommended to configure the variable at
bootstrap time only.

The first step in stateless address configuration is Duplicate Address
Detection (DAD). See 1.2 for more detail on DAD.

When a host hears a Router Advertisement from the router, the host may
autoconfigure itself by stateless address autoconfiguration. This
behavior can be controlled by the net.inet6.ip6.accept_rtadv sysctl
variable and a per-interface flag managed in the kernel. The latter,
which we call "if_accept_rtadv" here, can be changed by the ndp(8)
command (see the manpage for more details). When the sysctl variable
is set to 1, and the flag is set, the host autoconfigures itself. By
autoconfiguration, network address prefixes for the receiving
interface (usually global address prefix) are added. The default
route is also configured.

Routers periodically generate Router Advertisement packets. To
request an adjacent router to generate an RA packet, a host can transmit
a Router Solicitation. To generate an RS packet at any time, use the
"rtsol" command. The "rtsold" daemon is also available. "rtsold"
generates Router Solicitations whenever necessary, and it works well
for nomadic usage (notebooks/laptops). If one wishes to ignore Router
Advertisements, use sysctl to set net.inet6.ip6.accept_rtadv to 0.
Additionally, the ndp(8) command can be used to control the behavior
on a per-interface basis.

To generate Router Advertisement from a router, use the "rtadvd" daemon.

Note that the IPv6 specification assumes the following items and that
nonconforming cases are left unspecified:
- Only hosts will listen to router advertisements
- Hosts have a single network interface (except loopback)
It is therefore unwise to enable net.inet6.ip6.accept_rtadv on routers,
or multi-interface hosts. A misconfigured node can behave strangely
(KAME code allows nonconforming configuration, for those who would like
to do some experiments).

To summarize the sysctl knob:
        accept_rtadv    forwarding      role of the node
        ---             ---             ---
        0               0               host (to be manually configured)
        0               1               router
        1               0               autoconfigured host
                                        (spec assumes that hosts have a single
                                        interface only, autoconfigured hosts
                                        with multiple interfaces are
                                        out-of-scope)
        1               1               invalid, or experimental
                                        (out-of-scope of spec)

The if_accept_rtadv flag is consulted only when accept_rtadv is 1 (the
latter two cases). The flag does not have any effect when the sysctl
variable is 0.

See 1.2 in the document for relationship between DAD and autoconfiguration.

1.4.3 DHCPv6

We supply a tiny DHCPv6 server/client in kame/dhcp6. However, the
implementation is premature (for example, this does NOT implement
address lease/release), and it is not in default compilation tree on
some platforms.
If you want to do some experiment, compile it on your
own.

DHCPv6 and autoconfiguration also need more work. "Managed" and "Other"
bits in RA have no special effect on the stateful autoconfiguration
procedure in the DHCPv6 client program ("Managed" bit actually prevents
stateless autoconfiguration, but no special action will be taken for
DHCPv6 client).

1.5 Generic tunnel interface

GIF (Generic InterFace) is a pseudo interface for configured tunnel.
Details are described in gif(4) manpage.
Currently
        v6 in v6
        v6 in v4
        v4 in v6
        v4 in v4
are available. Use "gifconfig" to assign physical (outer) source
and destination address to gif interfaces.
Configuration that uses the same address family for inner and outer IP
header (v4 in v4, or v6 in v6) is dangerous. It is very easy to
configure interfaces and routing tables to perform infinite level
of tunneling. Please be warned.

gif can be configured to be ECN-friendly. See 4.6 for ECN-friendliness
of tunnels, and gif(4) manpage for how to configure.

If you would like to configure an IPv4-in-IPv6 tunnel with gif interface,
read gif(4) carefully. You may need to remove the IPv6 link-local address
automatically assigned to the gif interface.

1.6 Address Selection

1.6.1 Source Address Selection

The KAME kernel chooses the source address for an outgoing packet
sent from a user application as follows:

1. if the source address is explicitly specified via an IPV6_PKTINFO
   ancillary data item or the socket option of that name, just use it.
   Note that this item/option overrides the bound address of the
   corresponding (datagram) socket.

2. if the corresponding socket is bound, use the bound address.

3. otherwise, the kernel first tries to find the outgoing interface of
   the packet. If it fails, the source address selection also fails.
   If the kernel can find an interface, choose the most appropriate
   address based on the algorithm described in RFC3484.

   The policy table used in this algorithm is stored in the kernel.
   To install or view the policy, use the ip6addrctl(8) command. The
   kernel does not have a pre-installed policy. It is expected that the
   default policy described in RFC3484 should be installed at
   bootstrap time using this command.

   RFC3484 allows an implementation to add implementation-specific
   rules with higher precedence than the rule "Use longest matching
   prefix." KAME's implementation has the following additional rules
   (which apply in the order listed):

   - prefer addresses on alive interfaces, that is, interfaces with
     the UP flag being on. This rule is particularly useful for
     routers, since some routing daemons stop advertising prefixes
     (addresses) on interfaces that have become down.

   - prefer addresses on "preferred" interfaces. "Preferred"
     interfaces can be specified by the ndp(8) command. By default,
     no interface is preferred, that is, this rule does not apply.
     Again, this rule is particularly useful for routers, since there
     is a convention, among router administrators, of assigning
     "stable" addresses on a particular interface (typically a
     loopback interface).

   In any case, addresses that break the scope zone of the
   destination, or addresses whose zone does not contain the outgoing
   interface are never chosen.

When the procedure above fails, the kernel usually returns
EADDRNOTAVAIL to the application.

In some cases, the specification explicitly requires the
implementation to choose a particular source address. The source
address for a Neighbor Advertisement (NA) message is an example.
Under the spec (RFC2461 7.2.2) NA's source should be the target
address of the corresponding NS. In this case we follow the
spec rather than the above rule.

If you would like to prohibit the use of deprecated addresses for some
reason, configure net.inet6.ip6.use_deprecated to 0. The issue
related to deprecated addresses is described in RFC2462 5.5.4 (NOTE:
there is some debate underway in IETF ipngwg on how to use
"deprecated" addresses).

As documented in the source address selection document, temporary
addresses for privacy extension are less preferred than public addresses
by default. However, for administrators who are particularly concerned
about privacy, there is a system-wide sysctl(3) variable
"net.inet6.ip6.prefer_tempaddr". When the variable is set to
non-zero, the kernel will rather prefer temporary addresses. The
default value of this variable is 0.
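
The tie-breaking rules described in this section can be sketched as
the comparison below. It is a simplified illustration, not the
kernel's in6_selectsrc() code: struct src_cand and both function names
are hypothetical, only the last few rules are modeled, and we assume
that the two KAME-specific rules are evaluated after the
public/temporary preference and immediately before the longest
matching prefix rule.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical candidate description; not a KAME structure. */
struct src_cand {
	struct in6_addr addr;
	int if_up;		/* 1 if the interface has the UP flag */
	int if_preferred;	/* 1 if marked "preferred" via ndp(8) */
	int temporary;		/* 1 for a privacy (temporary) address */
};

/* Number of leading bits that a and b have in common (0..128). */
static int
matchlen(const struct in6_addr *a, const struct in6_addr *b)
{
	int i, len = 0;
	unsigned char x;

	for (i = 0; i < 16; i++) {
		x = a->s6_addr[i] ^ b->s6_addr[i];
		if (x == 0) {
			len += 8;
			continue;
		}
		/* count the equal leading bits of the first differing byte */
		while ((x & 0x80) == 0) {
			len++;
			x <<= 1;
		}
		break;
	}
	return (len);
}

/* Return 1 if a is a better source than b for dst, 0 otherwise. */
static int
better_source(const struct src_cand *a, const struct src_cand *b,
    const struct in6_addr *dst, int prefer_tempaddr)
{
	assert(a != NULL && b != NULL && dst != NULL);

	/* public vs. temporary; flipped by prefer_tempaddr */
	if (a->temporary != b->temporary)
		return ((a->temporary != 0) == (prefer_tempaddr != 0));
	/* KAME addition: prefer addresses on alive (UP) interfaces */
	if (a->if_up != b->if_up)
		return (a->if_up);
	/* KAME addition: prefer addresses on "preferred" interfaces */
	if (a->if_preferred != b->if_preferred)
		return (a->if_preferred);
	/* finally, the longest matching prefix wins */
	return (matchlen(&a->addr, dst) > matchlen(&b->addr, dst));
}
```

With prefer_tempaddr set to 0, a public address always beats a
temporary one in this sketch, regardless of prefix length; setting it
to non-zero reverses that single comparison and leaves the remaining
rules unchanged.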

1.6.2 Destination Address Ordering

KAME's getaddrinfo(3) supports the destination address ordering
algorithm described in RFC3484. Getaddrinfo(3) needs to know the
source address for each destination address and policy entries
(described in the previous section) for the source and destination
addresses. To get the source address, the library function opens a
UDP socket and tries to connect(2) to the destination. To get the
policy entry, the function issues sysctl(3).

1.7 Jumbo Payload

KAME supports the Jumbo Payload hop-by-hop option used to send IPv6
packets with payloads longer than 65,535 octets. But since currently
KAME does not support any physical interface whose MTU is more than
65,535, such payloads can be seen only on the loopback interface (i.e.
lo0).

If you want to try jumbo payloads, you first have to reconfigure the
kernel so that the MTU of the loopback interface is more than 65,535
bytes; add the following to the kernel configuration file:
        options         "LARGE_LOMTU"           #To test jumbo payload
and recompile the new kernel.

Then you can test jumbo payloads by the ping6 command with -b and -s
options. The -b option must be specified to enlarge the size of the
socket buffer and the -s option specifies the length of the packet,
which should be more than 65,535. For example, type as follows:
        % ping6 -b 70000 -s 68000 ::1

The IPv6 specification requires that the Jumbo Payload option must not
be used in a packet that carries a fragment header.
If this condition
is broken, an ICMPv6 Parameter Problem message must be sent to the
sender. The KAME kernel follows the specification, but you cannot
usually see an ICMPv6 error caused by this requirement.

When the KAME kernel receives an IPv6 packet, it checks the frame length
of the packet and compares it to the length specified in the payload
length field of the IPv6 header or in the value of the Jumbo Payload
option, if any. If the former is shorter than the latter, the KAME kernel
discards the packet and increments the statistics. You can see the
statistics as output of netstat command with `-s -p ip6' option:
        % netstat -s -p ip6
        ip6:
                (snip)
                1 with data size < data length

So, the KAME kernel does not send an ICMPv6 error unless the erroneous
packet is an actual Jumbo Payload, that is, its packet size is more
than 65,535 bytes. As described above, the KAME kernel currently does not
support physical interfaces with such a huge MTU, so it rarely returns an
ICMPv6 error.

TCP/UDP over jumbogram is not supported at this moment. This is because
we have no medium (other than loopback) to test this. Contact us if you
need this.

IPsec does not work on jumbograms. This is due to some specification twists
in supporting AH with jumbograms (AH header size influences payload length,
and this makes it really hard to authenticate an inbound packet with the
jumbo payload option as well as AH).

There are fundamental issues in *BSD support for jumbograms. We would like to
address those, but we need more time to finalize the task.
To name a few:
- mbuf pkthdr.len field is typed as "int" in 4.4BSD, so it cannot hold
  a jumbogram with len > 2G on 32bit architecture CPUs. If we would like to
  support jumbograms properly, the field must be expanded to hold 4G +
  IPv6 header + link-layer header. Therefore, it must be expanded to at least
  int64_t (u_int32_t is NOT enough).
- We mistakenly use "int" to hold packet length in many places. We need
  to convert them into a larger numeric type. It needs great care, as we may
  experience overflow during packet length computation.
- We mistakenly check the ip6_plen field of the IPv6 header for packet payload
  length in various places. We should be checking mbuf pkthdr.len instead.
  ip6_input() will perform sanity check on jumbo payload option on input,
  and we can safely use mbuf pkthdr.len afterwards.
- TCP code needs careful updates in a bunch of places, of course.

1.8 Loop prevention in header processing

The IPv6 specification allows an arbitrary number of extension headers to
be placed onto packets. If we implement IPv6 packet processing
code in the way BSD IPv4 code is implemented, the kernel stack may
overflow due to a long function call chain. KAME sys/netinet6 code
is carefully designed to avoid kernel stack overflow. Because of
this, KAME sys/netinet6 code defines its own protocol switch
structure, as "struct ip6protosw" (see netinet6/ip6protosw.h).

In addition to this, we restrict the number of extension headers
(including the IPv6 header) in each incoming packet, in order to
prevent a DoS attack that tries to send packets with a massive number
of extension headers.
The upper limit can be configured by the sysctl
value net.inet6.ip6.hdrnestlimit. In particular, if the value is 0,
the node will allow an arbitrary number of headers. As of writing this
document, the default value is 50.

IPv4 part (sys/netinet) remains untouched for compatibility.
Because of this, if you receive an IPsec-over-IPv4 packet with a massive
number of IPsec headers, the kernel stack may blow up. IPsec-over-IPv6
is okay.

1.9 ICMPv6

After RFC2463 was published, IETF ipngwg decided to disallow ICMPv6 error
packets against ICMPv6 redirects, to prevent an ICMPv6 storm on a network
medium. KAME already implements this in the kernel.

RFC2463 requires rate limitation for ICMPv6 error packets generated by a
node, to avoid possible DoS attacks. The KAME kernel implements two
rate-limitation mechanisms, tunable via sysctl:
- Minimum time interval between ICMPv6 error packets
        KAME kernel will generate no more than one ICMPv6 error packet,
        during configured time interval. net.inet6.icmp6.errratelimit
        controls the interval (default: disabled).
- Maximum ICMPv6 error packet-per-second
        KAME kernel will generate no more than the configured number of
        packets in one second. net.inet6.icmp6.errppslimit controls the
        maximum packet-per-second value (default: 200pps)
Basically, we need to pick values that are suitable for the bandwidth
of link layer devices directly attached to the node. In some cases the
default values may not fit well. We are still unsure if the default value
is sane or not. Comments are welcome.

1.10 Applications

For userland programming, we support IPv6 socket API as specified in
RFC2553/3493, RFC3542 and upcoming internet drafts.

TCP/UDP over IPv6 is available and quite stable. You can enjoy "telnet",
"ftp", "rlogin", "rsh", "ssh", etc. These applications are protocol
independent. That is, they automatically choose IPv4 or IPv6
according to DNS.

1.11 Kernel Internals

 (*) TCP/UDP part is handled differently between operating system platforms.
     See 1.12 for details.

The current KAME has escaped from the IPv4 netinet logic. While
ip_forward() calls ip_output(), ip6_forward() directly calls
if_output() since routers must not divide IPv6 packets into fragments.

An ICMPv6 error should contain as much of the original packet as
possible, up to 1280 bytes. UDP6/IP6 port unreach, for instance, should
contain all extension headers and the *unchanged* UDP6 and IP6 headers.
So, all IP6 functions except TCP6 never convert network byte
order into host byte order, to save the original packet.

tcp6_input(), udp6_input() and icmp6_input() can't assume that the IP6
header immediately precedes the transport headers, due to extension
headers. So, in6_cksum() was implemented to handle packets whose IP6
header and transport header are not contiguous. Neither a TCP/IP6 nor a
UDP/IP6 header structure exists for checksum calculation.

To process IP6 header, extension headers and transport headers easily,
KAME requires network drivers to store packets in one internal mbuf or
one or more external mbufs.
A typical old driver prepares two
internal mbufs for 100 - 208 bytes of data; KAME's reference
implementation, however, stores it in one external mbuf.

"netstat -s -p ip6" tells you whether or not your driver conforms to
KAME's requirement. In the following example, "cce0" violates the
requirement. (For more information, refer to Section 2.)

        Mbuf statistics:
                317 one mbuf
                two or more mbuf::
                        lo0 = 8
                        cce0 = 10
                3282 one ext mbuf
                0 two or more ext mbuf

Each input function calls IP6_EXTHDR_CHECK in the beginning to check
if the region between IP6 and its header is
contiguous. IP6_EXTHDR_CHECK calls m_pullup() only if the mbuf has
the M_LOOP flag, that is, the packet comes from the loopback
interface. m_pullup() is never called for packets coming from physical
network interfaces.

TCP6 reassembly makes use of the IP6 header to store reassembly
information. IP6 is not supposed to be just before TCP6, so the
ip6tcpreass structure has a pointer to the TCP6 header. Of course, it
also has a pointer back to the mbuf to avoid m_pullup().

Like TCP6, both IP and IP6 reassemble functions never call m_pullup().

xxx_ctlinput() calls in_mrejoin() on PRC_IFNEWADDR. We think this is
one of the 4.4BSD implementation flaws. Since 4.4BSD keeps ia_multiaddrs
in in_ifaddr{}, it can't use the multicast feature if the interface has
no unicast address.
So, if an application joins to an interface and then 107062588Sitojunall unicast addresses are removed from the interface, the application 107162588Sitojuncan't send/receive any multicast packets. Moreover, if a new unicast 107262588Sitojunaddress is assigned to the interface, in_mrejoin() must be called. 107362588SitojunKAME's interfaces, however, have ALWAYS one link-local unicast 107462588Sitojunaddress. These extensions have thus not been implemented in KAME. 107562588Sitojun 107657522Sshin1.12 IPv4 mapped address and IPv6 wildcard socket 107757522Sshin 1078122115SumeRFC2553/3493 describes IPv4 mapped address (3.7) and special behavior 107957522Sshinof IPv6 wildcard bind socket (3.8). The spec allows you to: 108057522Sshin- Accept IPv4 connections by AF_INET6 wildcard bind socket. 108157522Sshin- Transmit IPv4 packet over AF_INET6 socket by using special form of 108257522Sshin the address like ::ffff:10.1.1.1. 108357522Sshinbut the spec itself is very complicated and does not specify how the 108457522Sshinsocket layer should behave. 108557522SshinHere we call the former one "listening side" and the latter one "initiating 108657522Sshinside", for reference purposes. 108757522Sshin 108857522SshinAlmost all KAME implementations treat tcp/udp port number space separately 108962588Sitojunbetween IPv4 and IPv6. You can perform wildcard bind on both of the address 109057522Sshinfamilies, on the same port. 109157522Sshin 109262588SitojunThere are some OS-platform differences in KAME code, as we use tcp/udp 109362588Sitojuncode from different origin. The following table summarizes the behavior. 109457522Sshin 109557522Sshin listening side initiating side 109662588Sitojun (AF_INET6 wildcard (connection to ::ffff:10.1.1.1) 109757522Sshin socket gets IPv4 conn.) 
                  ---                       ---
KAME/BSDI3        not supported             not supported
KAME/FreeBSD228   not supported             not supported
KAME/FreeBSD3x    configurable              supported
                  default: enabled
KAME/FreeBSD4x    configurable              supported
                  default: enabled
KAME/NetBSD       configurable              supported
                  default: disabled
KAME/BSDI4        enabled                   supported
KAME/OpenBSD      not supported             not supported

The following sections give more details, and explain how you can
configure the behavior.

Comments on listening side:

It looks like RFC2553/3493 says too little about the wildcard bind issue,
specifically about (1) the port space issue, (2) failure modes, (3) the
relationship between AF_INET/INET6 wildcard binds, such as ordering
constraints, and (4) behavior when a conflicting socket is opened/closed.
There can be several separate interpretations of this RFC which conform to
it but behave differently.
So, to implement a portable application you should assume nothing
about the behavior in the kernel. Using getaddrinfo() is the safest way.
Port number space and wildcard bind issues were discussed in detail
on the ipv6imp mailing list in mid March 1999, and it looks like there is
no concrete consensus (that is, it is up to implementers). You may want to
check the mailing list archives.
We supply a tool called "bindtest" that explores the behavior of
kernel bind(2). The tool will not be compiled by default.

If a server application would like to accept IPv4 and IPv6 connections,
it should use AF_INET and AF_INET6 sockets (you'll need two sockets).
Use getaddrinfo() with AI_PASSIVE in ai_flags, and call socket(2) and bind(2)
for all the addresses returned.
By opening multiple sockets, you can accept connections on the socket with
the proper address family. IPv4 connections will be accepted by the AF_INET
socket, and IPv6 connections will be accepted by the AF_INET6 socket (NOTE:
the KAME/BSDI4 kernel sometimes violates this - we will fix it).

If you try to support IPv6 traffic only and would like to reject IPv4
traffic, always check the peer address when a connection is made toward
an AF_INET6 listening socket. If the address is an IPv4 mapped address, you
may want to reject the connection. You can check the condition by using the
IN6_IS_ADDR_V4MAPPED() macro. This is one of the reasons the author of
the section (itojun) dislikes special behavior of AF_INET6 wildcard bind.

Comments on initiating side:

Advice to application implementers: to implement a portable IPv6 application
(which works on multiple IPv6 kernels), we believe that the following
is the key to success:
- NEVER hardcode AF_INET nor AF_INET6.
- Use getaddrinfo() and getnameinfo() throughout the system.
  Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*().
- If you would like to connect to a destination, use getaddrinfo() and try
  all the destinations returned, like telnet does.
- Some IPv6 stacks are shipped with a buggy getaddrinfo(). Ship a minimal
  working version with your application and use that as a last resort.
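
The advice above (no hardcoded address family, getaddrinfo(), and trying
every destination returned) can be sketched as follows. This is a minimal
illustration under stated assumptions, not code from the KAME tree: the
function name connect_any() is hypothetical, only TCP is handled, and error
reporting is reduced to a -1 return value.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (name chosen for this example only): resolve
 * host/service with getaddrinfo() and try every returned address in
 * order. Returns a connected socket, or -1 if nothing worked.
 */
int
connect_any(const char *host, const char *service)
{
	struct addrinfo hints, *res, *ai;
	int s = -1;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* never hardcode AF_INET/AF_INET6 */
	hints.ai_socktype = SOCK_STREAM;

	if (getaddrinfo(host, service, &hints, &res) != 0)
		return -1;

	for (ai = res; ai != NULL; ai = ai->ai_next) {
		s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (s < 0)
			continue;	/* family not supported here; try next */
		if (connect(s, ai->ai_addr, ai->ai_addrlen) == 0)
			break;		/* connected */
		close(s);
		s = -1;
	}
	freeaddrinfo(res);
	return s;
}
```

A caller would write something like s = connect_any(hostname, "telnet") and
treat a negative result as "no destination reachable". Because the address
family comes from each addrinfo entry, the same code works whether the
resolver returns IPv4 addresses, IPv6 addresses, or both.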

If you would like to use an AF_INET6 socket for both IPv4 and IPv6 outgoing
connections, you will need a tweaked implementation in DNS support libraries,
as documented in RFC2553/3493 6.1. KAME libinet6 includes the tweak in
getipnodebyname(). Note that getipnodebyname() itself is not recommended as
it does not handle scoped IPv6 addresses at all. For IPv6 name resolution
getaddrinfo() is the preferred API. getaddrinfo() does not implement the
tweak.

When writing applications that make outgoing connections, the story becomes
much simpler if you treat AF_INET and AF_INET6 as totally separate address
families. {set,get}sockopt issues become simpler, and DNS issues become
simpler. We do not recommend that you rely upon IPv4 mapped addresses.

1.12.1 KAME/BSDI3 and KAME/FreeBSD228

The platforms do not support IPv4 mapped address at all (both listening side
and initiating side). AF_INET6 and AF_INET sockets are totally separated.

Port number space is totally separate between AF_INET and AF_INET6 sockets.

It should be noted that KAME/BSDI3 and KAME/FreeBSD228 are not conformant
to RFC2553/3493 section 3.7 and 3.8. It is due to code sharing reasons.

1.12.2 KAME/FreeBSD[34]x

KAME/FreeBSD3x and KAME/FreeBSD4x use shared tcp4/6 code (from
sys/netinet/tcp*) and shared udp4/6 code (from sys/netinet/udp*).
They use a unified inpcb/in6pcb structure.

1.12.2.1 KAME/FreeBSD[34]x, listening side

The platform can be configured to support IPv4 mapped address/special
AF_INET6 wildcard bind (enabled by default). There is no kernel compilation
option to disable it. You can enable/disable the behavior with sysctl
(per-node), or setsockopt (per-socket).

Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
conditions are satisfied:
- there's no AF_INET socket that matches the IPv4 connection
- the AF_INET6 socket is configured to accept IPv4 traffic, i.e.
  getsockopt(IPV6_V6ONLY) returns 0.

(XXX need checking)

1.12.2.2 KAME/FreeBSD[34]x, initiating side

KAME/FreeBSD3x supports outgoing connection to IPv4 mapped address
(::ffff:10.1.1.1), if the node is configured to accept IPv4 connections
by AF_INET6 socket.

(XXX need checking)

1.12.3 KAME/NetBSD

KAME/NetBSD uses shared tcp4/6 code (from sys/netinet/tcp*) and shared
udp4/6 code (from sys/netinet/udp*). The implementation is made differently
from KAME/FreeBSD[34]x. KAME/NetBSD uses separate inpcb/in6pcb structures,
while KAME/FreeBSD[34]x uses merged inpcb structure.

It should be noted that the default configuration of KAME/NetBSD is not
conformant to RFC2553/3493 section 3.8. It is intentionally turned off by
default for security reasons.

The platform can be configured to support IPv4 mapped address/special AF_INET6
wildcard bind (disabled by default).
Kernel behavior can be summarized as
follows:
- default: special support code will be compiled in, but is disabled by
  default. It can be controlled by sysctl (net.inet6.ip6.v6only),
  or setsockopt(IPV6_V6ONLY).
- add "INET6_BINDV6ONLY": No special support code for AF_INET6 wildcard socket
  will be compiled in. AF_INET6 sockets and AF_INET sockets are totally
  separate. The behavior is similar to what is described in 1.12.1.

The sysctl setting affects per-socket configuration at in6pcb creation time
only. In other words, per-socket configuration will be copied from the sysctl
configuration at in6pcb creation time. To change per-socket behavior, you
must perform setsockopt or reopen the socket. A change in sysctl configuration
will not change the behavior of sockets that are already opened.

1.12.3.1 KAME/NetBSD, listening side

Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
conditions are satisfied:
- there's no AF_INET socket that matches the IPv4 connection
- the AF_INET6 socket is configured to accept IPv4 traffic, i.e.
  getsockopt(IPV6_V6ONLY) returns 0.

You cannot bind(2) with IPv4 mapped address. This is a workaround for port
number duplicate and other twists.

1.12.3.2 KAME/NetBSD, initiating side

When getsockopt(IPV6_V6ONLY) is 0 for a socket, you can send outgoing
traffic to an IPv4 destination over an AF_INET6 socket, using an IPv4 mapped
address destination (::ffff:10.1.1.1).
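
As a rough illustration of the per-socket control referred to in this
section, the sketch below reads and sets IPV6_V6ONLY with the standard
getsockopt(2)/setsockopt(2) calls at level IPPROTO_IPV6. The helper names
(get_v6only, set_v6only, open_v6only_socket) are hypothetical and exist only
for this example; whether a given kernel lets you clear the option is
platform dependent, as the surrounding sections describe.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: current IPV6_V6ONLY value (0 or 1), or -1 on error. */
int
get_v6only(int s)
{
	int on = 0;
	socklen_t len = sizeof(on);

	if (getsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, &len) < 0)
		return -1;
	return on != 0;
}

/* Hypothetical helper: set IPV6_V6ONLY; returns 0 on success, -1 on error. */
int
set_v6only(int s, int on)
{
	return setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
}

/*
 * Hypothetical helper: open an AF_INET6 socket that is explicitly
 * IPv6-only, regardless of the node-wide default.
 */
int
open_v6only_socket(int type)
{
	int s = socket(AF_INET6, type, 0);

	if (s < 0)
		return -1;
	if (set_v6only(s, 1) < 0) {
		close(s);
		return -1;
	}
	return s;
}
```

Setting the option explicitly right after socket(2) avoids depending on the
sysctl default, which differs between the platforms listed in 1.12.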

When getsockopt(IPV6_V6ONLY) is 1 for a socket, you cannot use IPv4 mapped
address for outgoing traffic.

1.12.4 KAME/BSDI4

KAME/BSDI4 uses NRL-based TCP/UDP stack and inpcb source code,
which was derived from NRL IPv6/IPsec stack. We guess it supports IPv4 mapped
address and special AF_INET6 wildcard bind. The implementation is, again,
different from other KAME/*BSDs.

1.12.4.1 KAME/BSDI4, listening side

NRL inpcb layer supports special behavior of AF_INET6 wildcard socket.
There is no way to disable the behavior.

Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
condition is satisfied:
- there's no AF_INET socket that matches the IPv4 connection

1.12.4.2 KAME/BSDI4, initiating side

KAME/BSDI4 supports connection initiation to IPv4 mapped address
(like ::ffff:10.1.1.1).

1.12.5 KAME/OpenBSD

KAME/OpenBSD uses NRL-based TCP/UDP stack and inpcb source code,
which was derived from NRL IPv6/IPsec stack.

It should be noted that KAME/OpenBSD is not conformant to RFC2553/3493 section
3.7 and 3.8. It is intentionally omitted for security reasons.

1.12.5.1 KAME/OpenBSD, listening side

KAME/OpenBSD disables special behavior on AF_INET6 wildcard bind for
security reasons (if IPv4 traffic toward AF_INET6 wildcard bind is allowed,
access control will become much harder).
KAME/BSDI4 uses NRL-based TCP/UDP
stack as well, however, the behavior is different due to OpenBSD's security
policy.

As a result the behavior of KAME/OpenBSD is similar to KAME/BSDI3 and
KAME/FreeBSD228 (see 1.12.1 for more detail).

1.12.5.2 KAME/OpenBSD, initiating side

KAME/OpenBSD does not support connection initiation to IPv4 mapped address
(like ::ffff:10.1.1.1).

1.12.6 More issues

IPv4 mapped address support adds a big requirement to EVERY userland codebase.
Every userland code should check if an AF_INET6 sockaddr contains IPv4
mapped address or not. This adds many twists:

- Access control code becomes harder to write.
  For example, if you would like to reject packets from 10.0.0.0/8,
  you need to reject packets to AF_INET socket from 10.0.0.0/8,
  and to AF_INET6 socket from ::ffff:10.0.0.0/104.
- If a protocol on top of IPv4 is defined differently with IPv6, we need to be
  really careful when we determine which protocol to use.
  For example, with FTP protocol, we can not simply use sa_family to determine
  FTP command sets.
  The following example is incorrect:
        if (sa_family == AF_INET)
                use EPSV/EPRT or PASV/PORT;     /*IPv4*/
        else if (sa_family == AF_INET6)
                use EPSV/EPRT or LPSV/LPRT;     /*IPv6*/
        else
                error;
  The correct code, with consideration to IPv4 mapped address, would be:
        if (sa_family == AF_INET)
                use EPSV/EPRT or PASV/PORT;     /*IPv4*/
        else if (sa_family == AF_INET6 && IPv4 mapped address)
                use EPSV/EPRT or PASV/PORT;     /*IPv4 command set on AF_INET6*/
        else if (sa_family == AF_INET6 && !IPv4 mapped address)
                use EPSV/EPRT or LPSV/LPRT;     /*IPv6*/
        else
                error;
  It is too much to ask everybody to be careful like this.
  The problem is, we are not sure if the above code fragment is perfect for
  all situations.
- By enabling kernel support for IPv4 mapped address (outgoing direction),
  servers on the kernel can be hosed by IPv6 native packet that has IPv4
  mapped address in IPv6 header source, and can generate unwanted IPv4 packets.
  draft-itojun-ipv6-transition-abuse-01.txt,
  draft-cmetz-v6ops-v4mapped-api-harmful-00.txt, and
  draft-itojun-v6ops-v4mapped-harmful-01.txt
  have more on this scenario.

Due to the above twists, some KAME userland programs have restrictions on
the use of IPv4 mapped addresses:
- rshd/rlogind do not accept connections from IPv4 mapped address.
  This is to avoid malicious use of IPv4 mapped address in IPv6 native
  packet, to bypass source-address based authentication.
- ftp/ftpd assume that you are on dual stack network.
  IPv4 mapped address
  will be decoded in userland, and will be passed to AF_INET sockets
  (in other words, ftp/ftpd do not support SIIT environment).

1.12.7 Interaction with SIIT translator

SIIT translator is specified in RFC2765. A KAME node cannot become a SIIT
translator box, nor a SIIT end node (a node in a SIIT cloud).

To become a SIIT translator box, we would need additional code.
We do not have the code in our tree at this moment.

There are multiple reasons that we are unable to become a SIIT end node.
(1) SIIT translators require end nodes in the SIIT cloud to be IPv6-only.
Since we are unable to compile an INET-less kernel, we are unable to become
a SIIT end node. (2) As presented in 1.12.6, some of our userland code assumes
a dual stack network. (3) The KAME stack filters out IPv6 packets with IPv4
mapped address in the header, to secure the non-SIIT case (which is much more
common). Effectively a KAME node will reject any packets via a SIIT translator
box. See section 1.14 for more detail about the last item.

There are documentation issues too - the SIIT document requires very strange
things. For example, the SIIT document asks an IPv6-only (meaning no IPv4
code) node to be able to construct IPv4 IPsec headers. If a node knows how to
construct IPv4 IPsec headers, that is not an IPv6-only node, it is a dual-stack
node. The requirements imposed in the SIIT document contradict other
parts of the document itself.

1.13 sockaddr_storage

When RFC2553 was about to be finalized, there was discussion on how struct
sockaddr_storage members are named.
One proposal was to prepend "__" to the
members (like "__ss_len") as they should not be touched. The other proposal
was not to prepend it (like "ss_len") as we need to touch those members
directly. There was no clear consensus on it.

As a result, RFC2553 defines struct sockaddr_storage as follows:
        struct sockaddr_storage {
                u_char  __ss_len;       /* address length */
                u_char  __ss_family;    /* address family */
                /* and bunch of padding */
        };
On the contrary, XNET draft defines as follows:
        struct sockaddr_storage {
                u_char  ss_len;         /* address length */
                u_char  ss_family;      /* address family */
                /* and bunch of padding */
        };

In December 1999, it was agreed that RFC2553bis (RFC3493) should pick the
latter (XNET) definition.

KAME kit prior to December 1999 used RFC2553 definition. KAME kit after
December 1999 (including December) will conform to XNET definition,
based on RFC3493 discussion.

If you look at multiple IPv6 implementations, you will be able to see
both definitions. As a userland programmer, the most portable way of
dealing with it is to:
(1) ensure ss_family and/or ss_len are available on the platform, by using
    GNU autoconf,
(2) have -Dss_family=__ss_family to unify all occurrences (including header
    file) into __ss_family, or
(3) never touch __ss_family.
    Cast to sockaddr * and use sa_family, like:
        struct sockaddr_storage ss;
        family = ((struct sockaddr *)&ss)->sa_family;

1.14 Invalid addresses on the wire

Some IPv6 transition technologies embed an IPv4 address into an IPv6 address.
These specifications themselves are fine, however, there can be a certain
set of attacks enabled by these specifications. Recent specification
documents cover those issues, however, there are already-published RFCs
that do not have protection against those (like using source address of
::ffff:127.0.0.1 to bypass "reject packet from remote" filter).

To name a few, these address ranges can be used to hose an IPv6 implementation,
or bypass security controls:
- IPv4 mapped address that embeds unspecified/multicast/loopback/broadcast
  IPv4 address (if they are in IPv6 native packet header, they are malicious)
        ::ffff:0.0.0.0/104      ::ffff:127.0.0.0/104
        ::ffff:224.0.0.0/100    ::ffff:255.0.0.0/104
- 6to4 (RFC3056) prefix generated from unspecified/multicast/loopback/
  broadcast/private IPv4 address
        2002:0000::/24  2002:7f00::/24  2002:e000::/24
        2002:ff00::/24  2002:0a00::/24  2002:ac10::/28
        2002:c0a8::/32
- IPv4 compatible address that embeds unspecified/multicast/loopback/broadcast
  IPv4 address (if they are in IPv6 native packet header, they are malicious).
  Note that, since KAME does not support RFC1933/2893 auto tunnels, KAME nodes
  are not vulnerable to these packets.
        ::0.0.0.0/104   ::127.0.0.0/104   ::224.0.0.0/100   ::255.0.0.0/104

Also, since KAME does not support RFC1933/2893 auto tunnels, seeing IPv4
compatible addresses is very rare. You should take caution if you see those
on the wire.

If we see IPv6 packets with IPv4 mapped address (::ffff:0.0.0.0/96) in the
header in dual-stack environment (not in SIIT environment), they indicate
that someone is trying to impersonate an IPv4 peer. The packet should be
dropped.

IPv6 specifications do not talk very much about IPv6 unspecified address (::)
in the IPv6 source address field. Clarification is in progress.
Here are a couple of comments:
- IPv6 unspecified address can be used in IPv6 source address field, if and
  only if we have no legal source address for the node. The legal situations
  include, but may not be limited to, (1) MLD while no IPv6 address is assigned
  to the node and (2) DAD.
- If an IPv6 TCP packet has the IPv6 unspecified address, it is an attack
  attempt. The form can be used as a trigger for TCP DoS attack. KAME code
  already filters them out.
- The following examples are seemingly illegal. It seems that there's general
  consensus among ipngwg for those. (1) Mobile IPv6 home address option,
  (2) offlink packets (so routers should not forward them).
  KAME implements (2) already.

KAME code is carefully written to avoid such incidents. More specifically,
KAME kernel will reject packets with certain source/destination address in IPv6
base header, or IPv6 routing header. Also, KAME default configuration file
is written carefully, to avoid those attacks.
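
To make the address ranges in this section more concrete, here is a small
userland sketch that classifies an IPv6 address as "should not appear in a
native IPv6 header". It is not the KAME kernel filter; the function names
(suspicious_on_wire, suspicious_text and the static helpers) are hypothetical,
and two simplifications are assumptions of this example: every IPv4 mapped
address is flagged (the dual-stack, non-SIIT case), and embedded multicast is
matched as 224.0.0.0/4, which is wider than the 2002:e000::/24 entry listed
above.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Nonzero for 0/8, 127/8, 224/4 and 255/8 (argument in host byte order). */
static int
v4_is_special(uint32_t a)
{
	uint32_t top = a >> 24;

	return top == 0 || top == 127 || (top & 0xf0) == 0xe0 || top == 255;
}

/* Nonzero for 10/8, 172.16/12 and 192.168/16 (host byte order). */
static int
v4_is_private(uint32_t a)
{
	return (a >> 24) == 10 || (a >> 20) == 0xac1 || (a >> 16) == 0xc0a8;
}

/* Read the IPv4 address embedded at byte offset off, in host byte order. */
static uint32_t
embedded_v4(const struct in6_addr *a, size_t off)
{
	uint32_t v4;

	memcpy(&v4, &a->s6_addr[off], sizeof(v4));
	return ntohl(v4);
}

/* Hypothetical classifier: 1 if the address looks abusive, 0 otherwise. */
int
suspicious_on_wire(const struct in6_addr *a)
{
	if (IN6_IS_ADDR_V4MAPPED(a))
		return 1;		/* assumption: dual-stack, no SIIT */
	if (a->s6_addr[0] == 0x20 && a->s6_addr[1] == 0x02) {
		uint32_t v4 = embedded_v4(a, 2);	/* 6to4: bits 16..47 */

		return v4_is_special(v4) || v4_is_private(v4);
	}
	if (IN6_IS_ADDR_V4COMPAT(a))
		return v4_is_special(embedded_v4(a, 12));
	return 0;
}

/* Text front end for experiments: -1 if s is not a valid IPv6 literal. */
int
suspicious_text(const char *s)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, s, &a) != 1)
		return -1;
	return suspicious_on_wire(&a);
}
```

For instance, suspicious_text("::ffff:127.0.0.1") and
suspicious_text("2002:c0a8:101::1") both report 1, while an ordinary global
address such as 3ffe:501::1 reports 0.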

draft-itojun-ipv6-transition-abuse-01.txt,
draft-cmetz-v6ops-v4mapped-api-harmful-00.txt and
draft-itojun-v6ops-v4mapped-harmful-01.txt have more on this issue.

1.15 Node's required addresses

RFC2373 section 2.8 talks about required addresses for an IPv6
node. This section describes how the KAME stack manages those required
addresses.

1.15.1 Host case

The following items are automatically assigned to the node (or the node will
automatically join the group), at bootstrap time:
- Loopback address
- All-nodes multicast addresses (ff01::1)

The following items will be automatically handled when the interface becomes
IFF_UP:
- Its link-local address for each interface
- Solicited-node multicast address for link-local addresses
- Link-local allnodes multicast address (ff02::1)

The following items need to be configured manually by ifconfig(8) or prefix(8).
Alternatively, these can be autoconfigured by using stateless address
autoconfiguration.
- Assigned unicast/anycast addresses
- Solicited-Node multicast address for assigned unicast address

Users can join groups by using appropriate system calls like setsockopt(2).

1.15.2 Router case

In addition to the above, routers need to handle the following items.

The following items need to be configured manually by using ifconfig(8).
o The subnet-router anycast addresses for the interfaces it is configured
  to act as a router on (prefix::/64)
o All other anycast addresses with which the router has been configured

The router will join the following multicast group when rtadvd(8) is available
for the interface.
o All-Routers Multicast Addresses (ff02::2)

Routing daemons will join appropriate multicast groups, as necessary,
like ff02::9 for RIPng.

Users can join groups by using appropriate system calls like setsockopt(2).

1.16 Advanced API

Current KAME kernel implements RFC3542 API. It also implements RFC2292 API,
for backward compatibility purposes with *BSD-integrated codebase.
KAME tree ships with RFC3542 headers.
*BSD-integrated codebase implements either RFC2292, or RFC3542, API.
See the "COVERAGE" document for detailed implementation status.

Here are a couple of issues to mention:
- *BSD-integrated binaries, compiled for RFC2292, will work on KAME kernel.
  For example, OpenBSD 2.7 /sbin/rtsol will work on KAME/openbsd kernel.
- KAME binaries, compiled using RFC3542, will not work on *BSD-integrated
  kernel. For example, KAME /usr/local/v6/sbin/rtsol will not work on
  OpenBSD 2.7 kernel.
- RFC3542 API is not compatible with RFC2292 API. RFC3542 #define symbols
  conflict with RFC2292 symbols. Therefore, if you compile programs that
  assume RFC2292 API, the compilation itself goes fine, however, the compiled
  binary will not work correctly. The problem is not a KAME issue, but an API
  issue. For example, Solaris 8 implements RFC3542 API.
  If you compile
  RFC2292-based code on Solaris 8, the binary can behave strangely.

There are a few incompatible behaviors in the RFC2292 binary backward
compatibility support in KAME tree. To enumerate:
- Type 0 routing header lacks support for strict/loose bitmap.
  Even if we see packets with "strict" bit set, those bits will not be made
  visible to the userland.
  Background: RFC2292 document is based on RFC1883 IPv6, and it uses
  strict/loose bitmap. RFC3542 document is based on RFC2460 IPv6, and it has
  no strict/loose bitmap (it was removed from RFC2460). KAME tree obeys
  RFC2460 IPv6, and lacks support for strict/loose bitmap.

The RFC3542 document leaves some particular cases unspecified. The
KAME implementation treats them as follows:
- The IPV6_DONTFRAG and IPV6_RECVPATHMTU socket options for TCP
  sockets are ignored. That is, the setsockopt() call will succeed
  but the specified value will have no effect.

1.17 DNS resolver

KAME ships with a modified DNS resolver, in libinet6.a.
libinet6.a has a couple of extensions against the libc DNS resolver:
- Can take "options insecure1" and "options insecure2" in /etc/resolv.conf,
  which toggle the RES_INSECURE[12] option flag bits.
- EDNS0 receive buffer size notification support. It can be enabled by
  "options edns0" in /etc/resolv.conf. See USAGE for details.
- IPv6 transport support (queries/responses over IPv6). Most BSD official
  releases now have it already.
- Partial A6 chain chasing/DNAME/bit string label support (KAME/BSDI4).


2. Network Drivers

KAME requires two items to be added into the standard drivers:

(1) (freebsd[234] and bsdi[34] only) mbuf clustering requirement.
    In this stable release, we changed MINCLSIZE into MHLEN+1 for all the
    operating systems in order to make all the drivers behave as we expect.

(2) multicast. If "ifmcstat" yields no multicast group for a
    interface, that interface has to be patched.

To avoid trouble, we suggest that you comment out the device drivers
for unsupported/unnecessary cards from the kernel configuration file.
If you accidentally enable unsupported drivers, some of the userland
tools may not work correctly (routing daemons are a typical example).

In the following sections, "official support" means that KAME developers
are using that ethernet card/driver frequently.

(NOTE: In the past we required all pcmcia drivers to have a call to
in6_ifattach(). We have no such requirement any more)

2.1 FreeBSD 2.2.x-RELEASE

Here is a list of FreeBSD 2.2.x-RELEASE drivers and their conditions:

        driver  mbuf(1)         multicast(2)    official support?
        ---     ---             ---             ---
        (Ethernet)
        ar      looks ok        -               -
        cnw     ok              ok              yes (*)
        ed      ok              ok              yes
        ep      ok              ok              yes
        fe      ok              ok              yes
        sn      looks ok        -               -   (*)
        vx      looks ok        -               -
        wlp     ok              ok              -   (*)
        xl      ok              ok              yes
        zp      ok              ok              -
        (FDDI)
        fpa     looks ok        ?               -
        (ATM)
        en      ok              ok              yes
        (Serial)
        lp      ?               -               not work
        sl      ?               -               not work
        sr      looks ok        ok              -   (**)

You may want to add an invocation of "rtsol" in "/etc/pccard_ether",
if you are using notebook computers and PCMCIA ethernet card.

(*) These drivers are distributed with PAO (http://www.jp.freebsd.org/PAO/).

(**) There was a report that, if you bring the sr driver up, down and
then up again, the kernel may hang. We have disabled frame-relay support in
the sr driver and after that it appears to work fine. If you need
frame-relay support to come back, please contact KAME developers.

2.2 BSD/OS 3.x

The following lists BSD/OS 3.x device drivers and their conditions:

        driver  mbuf(1)         multicast(2)    official support?
        ---     ---             ---             ---
        (Ethernet)
        cnw     ok              ok              yes
        de      ok              ok              -
        df      ok              ok              -
        eb      ok              ok              -
        ef      ok              ok              yes
        exp     ok              ok              -
        mz      ok              ok              yes
        ne      ok              ok              yes
        we      ok              ok              -
        (FDDI)
        fpa     ok              ok              -
        (ATM)
        en      maybe           ok              -
        (Serial)
        ntwo    ok              ok              yes
        sl      ?               -               not work
        appp    ?               -               not work

You may want to use "@insert" directive in /etc/pccard.conf to invoke
"rtsol" command right after dynamic insertion of PCMCIA ethernet cards.

2.3 NetBSD

The following table lists the network drivers we have tried so far.

        driver            mbuf(1)         multicast(2)    official support?
        ---               ---             ---             ---
        (Ethernet)
        awi pcmcia/i386   ok              ok              -
        bah zbus/amiga    NG(*)
        cnw pcmcia/i386   ok              ok              yes
        ep  pcmcia/i386   ok              ok              -
        fxp pci/i386      ok(*2)          ok              -
        tlp pci/i386      ok              ok              -
        le  sbus/sparc    ok              ok              yes
        ne  pci/i386      ok              ok              yes
        ne  pcmcia/i386   ok              ok              yes
        rtk pci/i386      ok              ok              -
        wi  pcmcia/i386   ok              ok              yes
        (ATM)
        en  pci/i386      ok              ok              -

(*) This may need some fix, but I'm not sure what arcnet interfaces assume...

2.4 FreeBSD 3.x-RELEASE

Here is a list of FreeBSD 3.x-RELEASE drivers and their conditions:

        driver  mbuf(1)         multicast(2)    official support?
        ---     ---             ---             ---
        (Ethernet)
        cnw     ok              ok              -(*)
        ed      ?               ok              -
        ep      ok              ok              -
        fe      ok              ok              yes
        fxp     ?(**)
        lnc     ?               ok              -
        sn      ?               ?               -(*)
        wi      ok              ok              yes
        xl      ?               ok              -

(*) These drivers are distributed with PAO as PAO3
    (http://www.jp.freebsd.org/PAO/).
(**) There were trouble reports with multicast filter initialization.

More drivers will simply work on KAME FreeBSD 3.x-RELEASE but have not
been checked yet.

2.5 FreeBSD 4.x-RELEASE

Here is a list of FreeBSD 4.x-RELEASE drivers and their conditions:

        driver          multicast
        ---             ---
        (Ethernet)
        lnc/vmware      ok

2.6 OpenBSD 2.x

Here is a list of OpenBSD 2.x drivers and their conditions:

        driver                  mbuf(1)     multicast(2)    official support?
        ---                     ---         ---             ---
        (Ethernet)
        de      pci/i386        ok          ok              yes
        fxp     pci/i386        ?(*)
        le      sbus/sparc      ok          ok              yes
        ne      pci/i386        ok          ok              yes
        ne      pcmcia/i386     ok          ok              yes
        wi      pcmcia/i386     ok          ok              yes

(*) There seems to be a problem in the driver with multicast filter
configuration. This happens with certain revisions of the chipset on the card.
It should be fixed by now by a workaround in sys/net/if.c, but we are still
not sure.

2.7 BSD/OS 4.x

The following lists BSD/OS 4.x device drivers and their conditions:

        driver  mbuf(1)         multicast(2)    official support?
        ---     ---             ---             ---
        (Ethernet)
        de      ok              ok              yes
        exp     (*)

You may want to use the "@insert" directive in /etc/pccard.conf to invoke
the "rtsol" command right after dynamic insertion of PCMCIA ethernet cards.

(*) The exp driver has a serious conflict with the KAME initialization
sequence. A workaround is committed into sys/i386/pci/if_exp.c, and it should
be okay by now.


3. Translator

We categorize IPv4/IPv6 translators into 4 types.

Translator A --- It is used in the early stage of transition to make
it possible to establish a connection from an IPv6 host in an IPv6
island to an IPv4 host in the IPv4 ocean.

Translator B --- It is used in the early stage of transition to make
it possible to establish a connection from an IPv4 host in the IPv4
ocean to an IPv6 host in an IPv6 island.

Translator C --- It is used in the late stage of transition to make it
possible to establish a connection from an IPv4 host in an IPv4 island
to an IPv6 host in the IPv6 ocean.

Translator D --- It is used in the late stage of transition to make it
possible to establish a connection from an IPv6 host in the IPv6 ocean
to an IPv4 host in an IPv4 island.

KAME provides a TCP relay translator for category A. This is called
"FAITH". We also provide an IP header translator for category A.

3.1 FAITH TCP relay translator

The FAITH system uses a TCP relay daemon called "faithd", helped by the
KAME kernel. FAITH reserves an IPv6 address prefix, and relays TCP
connections toward that prefix to an IPv4 destination.

For example, if the reserved IPv6 prefix is 3ffe:0501:0200:ffff::, and
the IPv6 destination for a TCP connection is
3ffe:0501:0200:ffff::163.221.202.12, the connection will be relayed
toward the IPv4 destination 163.221.202.12.
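
The address mapping in the example above can be sketched in C as follows.
This is only an illustration, not code taken from faithd: the function
names are hypothetical, and the sketch assumes that the reserved prefix is
compared over its first 64 bits and that the IPv4 destination sits in the
low 32 bits of the IPv6 destination.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of faithd): if dst6 falls under the
 * reserved prefix, copy its low 32 bits into *dst4 and return 1;
 * otherwise return 0.  The 64-bit prefix length is an assumption.
 */
int
faith_map_dst(const struct in6_addr *prefix, const struct in6_addr *dst6,
    struct in_addr *dst4)
{
	assert(prefix != NULL && dst6 != NULL && dst4 != NULL);
	if (memcmp(prefix->s6_addr, dst6->s6_addr, 8) != 0)
		return (0);
	/* both sides are in network byte order, so a byte copy is enough */
	memcpy(&dst4->s_addr, &dst6->s6_addr[12], sizeof(dst4->s_addr));
	return (1);
}

/* Text-to-text wrapper for experiments; returns 1 on success, 0 otherwise. */
int
faith_map_str(const char *prefix, const char *dst6, char *buf, size_t len)
{
	struct in6_addr p, d;
	struct in_addr v4;

	if (inet_pton(AF_INET6, prefix, &p) != 1 ||
	    inet_pton(AF_INET6, dst6, &d) != 1)
		return (0);
	if (!faith_map_dst(&p, &d, &v4))
		return (0);
	return (inet_ntop(AF_INET, &v4, buf, (socklen_t)len) != NULL);
}
```

With the prefix and destination from the example, faith_map_str() fills
the buffer with the dotted-quad form of the embedded IPv4 address. A
destination outside the reserved prefix is rejected.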

        destination IPv4 node (163.221.202.12)
          ^
          | IPv4 TCP toward 163.221.202.12
        FAITH-relay dual stack node
          ^
          | IPv6 TCP toward 3ffe:0501:0200:ffff::163.221.202.12
        source IPv6 node

faithd must be invoked on the FAITH-relay dual stack node.

For more details, consult kame/kame/faithd/README and RFC3142.

3.2 IPv6-to-IPv4 header translator

(to be written)


4. IPsec

IPsec is implemented as the following three components.

(1) Policy Management
(2) Key Management
(3) AH, ESP and IPComp handling in kernel

Note that KAME/OpenBSD does NOT include support for KAME IPsec code,
as the OpenBSD team has their home-brew IPsec stack and they have no plan
to replace it. IPv6 support for IPsec is, therefore, lacking on KAME/OpenBSD.

http://www.netbsd.org/Documentation/network/ipsec/ has more information
including usage examples.

4.1 Policy Management

The kernel implements experimental policy management code. There are two ways
to manage security policy. One is to configure per-socket policy using
setsockopt(2). In this case, policy configuration is described in
ipsec_set_policy(3). The other is to configure kernel packet filter-based
policy using the PF_KEY interface, via setkey(8).

Policy entries are matched in order, so the order of entries makes a
difference in behavior.
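
Why the order of entries matters can be illustrated with a small
first-match lookup in C. This is a simplified sketch, not the kernel SPD
code: the structure, the action names and sp_lookup() are all
hypothetical, and only an IPv4 destination prefix (in host byte order) is
matched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical actions; SP_NOMATCH means that no entry matched. */
enum sp_action { SP_NOMATCH, SP_BYPASS, SP_IPSEC };

/* Hypothetical, simplified policy entry (IPv4, host byte order). */
struct sp_entry {
	uint32_t dst;		/* destination prefix */
	uint32_t mask;		/* destination netmask */
	enum sp_action action;
};

/* Walk the entries in order and return the action of the first match. */
enum sp_action
sp_lookup(const struct sp_entry *spd, size_t n, uint32_t dst)
{
	size_t i;

	assert(spd != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if ((dst & spd[i].mask) == (spd[i].dst & spd[i].mask))
			return (spd[i].action);
	}
	return (SP_NOMATCH);
}
```

If an entry for 10.0.0.0/8 precedes an entry for 10.1.0.0/16, a packet to
10.1.2.3 gets the action of the /8 entry. Swapping the two entries makes
the more specific /16 entry win.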

4.2 Key Management

The key management code implemented in this kit (sys/netkey) is a
home-brew PFKEY v2 implementation. This conforms to RFC2367.

The home-brew IKE daemon, "racoon", is included in the kit
(kame/kame/racoon, or usr.sbin/racoon).
Basically you'll need to run racoon as a daemon, then set up a policy
to require keys (like ping -P 'out ipsec esp/transport//use').
The kernel will contact the racoon daemon as necessary to exchange keys.

In the IKE spec, there's ambiguity about the interpretation of a "tunnel"
proposal. For example, if we would like to propose the use of the following
packet:
        IP AH ESP IP payload
some implementations propose it as "AH transport and ESP tunnel", since
this is more logical from the packet construction point of view. Some
implementations propose it as "AH tunnel and ESP tunnel".
Racoon follows the latter route (previously it followed the former, and
the latter interpretation seems to be popular/consensus).
This raises a real interoperability issue. We hope this will be resolved
quickly.

racoon does not implement byte lifetime for either phase 1 or phase 2
(RFC2409 page 35, Life Type = kilobytes).

4.3 AH and ESP handling

The IPsec module is implemented as "hooks" to the standard IPv4/IPv6
processing. When sending a packet, ip{,6}_output() checks if ESP/AH
processing is required by checking if a matching SPD (Security
Policy Database) entry is found. If ESP/AH is needed,
{esp,ah}{4,6}_output() will be called and the mbuf will be updated
accordingly.
When a packet is received, {esp,ah}4_input() will be
called based on protocol number, i.e. (*inetsw[proto])().
{esp,ah}4_input() will decrypt/check authenticity of the packet,
and strip off the daisy-chained header and padding for ESP/AH. It is
safe to strip off the ESP/AH header on packet reception, since we
will never use the received packet in "as is" form.

By using ESP/AH, the TCP4/6 effective data segment size will be affected by
extra daisy-chained headers inserted by ESP/AH. Our code takes care of
the case.

Basic crypto functions can be found in directory "sys/crypto". ESP/AH
transforms are listed in {esp,ah}_core.c with wrapper functions. If you
wish to add some algorithm, add a wrapper function in {esp,ah}_core.c, and
add your crypto algorithm code into sys/crypto.

Tunnel mode works basically fine, but comes with the following restrictions:
- You cannot run a routing daemon across an IPsec tunnel, since we do not
  model IPsec tunnels as pseudo interfaces.
- Authentication model for AH tunnel must be revisited. We'll need to
  improve the policy management engine, eventually.
- Path MTU discovery does not work across IPv6 IPsec tunnel gateway due to
  insufficient code.

The AH specification does not talk much about the "multiple AH on a packet"
case. We incrementally compute AH checksum, from inside to outside. Also, we
treat inner AH to be immutable.
For example, if we are to create the following packet:
        IP AH1 AH2 AH3 payload
we do it incrementally. As a result, we get crypto checksums like below:
        AH3 has checksum against "IP AH3' payload".
          where AH3' = AH3 with checksum field filled with 0.
        AH2 has checksum against "IP AH2' AH3 payload".
        AH1 has checksum against "IP AH1' AH2 AH3 payload".
Also note that AH3 has the smallest sequence number, and AH1 has the largest
sequence number.

To avoid traffic analysis on shorter packets, ESP output logic supports
random length padding. By setting net.inet.ipsec.esp_randpad (or
net.inet6.ipsec6.esp_randpad) to a positive value N, you can ask the kernel
to randomly pad packets shorter than N bytes, to a random length smaller than
or equal to N. Note that N does not include ESP authentication data length.
Also note that the random padding is not included in TCP segment
size computation. A negative value will turn off the functionality.
Recommended values for N are around 128 or 256. If you use too big a number
as N, you may experience inefficiency due to fragmented packets.

4.4 IPComp handling

IPComp stands for IP payload compression protocol. This is aimed at
payload compression, not header compression like PPP VJ compression.
This may be useful when you are using a slow serial link (say, cell phone)
with a powerful CPU (well, recent notebook PCs are really powerful...).
The protocol design of IPComp is very similar to IPsec, though it was
defined separately from IPsec itself.

Here are some points to be noted:
- IPComp is treated as part of the IPsec protocol suite, and the SPI and
  CPI spaces are unified. The spec says that there's no relationship
  between the two, so they are assumed to be separate in specs.
- IPComp association (IPCA) is kept in SAD.
- It is possible to use well-known CPI (CPI=2 for DEFLATE for example),
  for outbound/inbound packet, but for indexing purposes one element from
  SPI/CPI space will be occupied anyway.
- pfkey is modified to support IPComp. However, there's no official
  SA type number assignment yet. Portability with other IPComp
  stacks is questionable (anyway, who else implements IPComp on UN*X?).
- The spec says that IPComp output processing must be performed before AH/ESP
  output processing, to achieve better compression ratio and "stir" data
  stream before encryption. The most meaningful processing order is:
  (1) compress payload by IPComp, (2) encrypt payload by ESP, then (3) attach
  authentication data by AH.
  However, with manual SPD setting, you are able to violate the ordering
  (KAME code is too generic, maybe). Also, it is just okay to use IPComp
  alone, without AH/ESP.
- Though the packet size can be significantly decreased by using IPComp, no
  special consideration is made about path MTU (the spec says nothing about
  MTU consideration). IPComp is designed for serial links, not ethernet-like
  medium, it seems.
- You can change compression ratio on outbound packet, by changing
  deflate_policy in sys/netinet6/ipcomp_core.c. You can also change outbound
  history buffer size by changing deflate_window_out in the same source code.
  (should it be sysctl accessible, or per-SAD configurable?)
- Tunnel mode IPComp is not working right. A KAME box can generate tunneled
  IPComp packets, but it cannot accept tunneled IPComp packets.
- You can negotiate IPComp association with racoon IKE daemon.
- KAME code does not attach Adler32 checksum to compressed data.
  See the ipsec wg mailing list discussion in Jan 2000 for details.

4.5 Conformance to RFCs and IDs

The IPsec code in the kernel conforms (or, tries to conform) to the
following standards:
    "old IPsec" specification documented in rfc182[5-9].txt
    "new IPsec" specification documented in:
        rfc240[1-6].txt rfc241[01].txt rfc2451.txt rfc3602.txt
    IPComp:
        RFC2393: IP Payload Compression Protocol (IPComp)
IKE specifications (rfc240[7-9].txt) are implemented in userland
as "racoon" IKE daemon.

Currently supported algorithms are:
    old IPsec AH
        null crypto checksum (no document, just for debugging)
        keyed MD5 with 128bit crypto checksum (rfc1828.txt)
        keyed SHA1 with 128bit crypto checksum (no document)
        HMAC MD5 with 128bit crypto checksum (rfc2085.txt)
        HMAC SHA1 with 128bit crypto checksum (no document)
        HMAC RIPEMD160 with 128bit crypto checksum (no document)
    old IPsec ESP
        null encryption (no document, similar to rfc2410.txt)
        DES-CBC mode (rfc1829.txt)
    new IPsec AH
        null crypto checksum (no document, just for debugging)
        keyed MD5 with 96bit crypto checksum (no document)
        keyed SHA1 with 96bit crypto checksum (no document)
        HMAC MD5 with 96bit crypto checksum (rfc2403.txt)
        HMAC SHA1 with 96bit crypto checksum (rfc2404.txt)
        HMAC SHA2-256 with 96bit crypto checksum
            (draft-ietf-ipsec-ciph-sha-256-00.txt)
        HMAC SHA2-384 with 96bit crypto checksum (no
document)
        HMAC SHA2-512 with 96bit crypto checksum (no document)
        HMAC RIPEMD160 with 96bit crypto checksum (RFC2857)
        AES XCBC MAC with 96bit crypto checksum (RFC3566)
    new IPsec ESP
        null encryption (rfc2410.txt)
        DES-CBC with derived IV
            (draft-ietf-ipsec-ciph-des-derived-01.txt, draft expired)
        DES-CBC with explicit IV (rfc2405.txt)
        3DES-CBC with explicit IV (rfc2451.txt)
        BLOWFISH CBC (rfc2451.txt)
        CAST128 CBC (rfc2451.txt)
        RIJNDAEL/AES CBC (rfc3602.txt)
        AES counter mode (rfc3686.txt)

        each of the above can be combined with new IPsec AH schemes for
        ESP authentication.
    IPComp
        RFC2394: IP Payload Compression Using DEFLATE

The following algorithms are NOT supported:
    old IPsec AH
        HMAC MD5 with 128bit crypto checksum + 64bit replay prevention
            (rfc2085.txt)
        keyed SHA1 with 160bit crypto checksum + 32bit padding (rfc1852.txt)

The key/policy management API is based on the following document, with a fair
amount of extensions:
    RFC2367: PF_KEY key management API

4.6 ECN consideration on IPsec tunnels

KAME IPsec implements the ECN-friendly IPsec tunnel described in
draft-ietf-ipsec-ecn-02.txt.
The normal IPsec tunnel is described in RFC2401. On encapsulation, the
IPv4 TOS field (or, IPv6 traffic class field) will be copied from the inner
IP header to the outer IP header. On decapsulation the outer IP header
will be simply dropped.
The decapsulation rule is not compatible
with ECN, since the ECN bit on the outer IP TOS/traffic class field will be
lost.
To make the IPsec tunnel ECN-friendly, we should modify the encapsulation
and decapsulation procedures. This is described in
draft-ietf-ipsec-ecn-02.txt, chapter 3.3.

The KAME IPsec tunnel implementation can give you three behaviors, by setting
net.inet.ipsec.ecn (or net.inet6.ipsec6.ecn) to some value:
- RFC2401: no consideration for ECN (sysctl value -1)
- ECN forbidden (sysctl value 0)
- ECN allowed (sysctl value 1)
Note that the behavior is configurable in a per-node manner, not a per-SA
manner (draft-ietf-ipsec-ecn-02 wants per-SA configuration, but it looks too
much for me).

The behavior is summarized as follows (see source code for more detail):

                encapsulate                     decapsulate
                ---                             ---
RFC2401         copy all TOS bits               drop TOS bits on outer
                from inner to outer.            (use inner TOS bits as is)

ECN forbidden   copy TOS bits except for ECN    drop TOS bits on outer
                (masked with 0xfc) from inner   (use inner TOS bits as is)
                to outer. set ECN bits to 0.

ECN allowed     copy TOS bits except for ECN    use inner TOS bits with some
                CE (masked with 0xfe) from      change. if outer ECN CE bit
                inner to outer.                 is 1, enable ECN CE bit on
                set ECN CE bit to 0.            the inner.

General strategy for configuration is as follows:
- if both IPsec tunnel endpoints are capable of ECN-friendly behavior,
  you'd better configure both ends to "ECN allowed" (sysctl value 1).
- if the other end is very strict about the TOS bits, use "RFC2401"
  (sysctl value -1).
- in other cases, use "ECN forbidden" (sysctl value 0).
The default behavior is "ECN forbidden" (sysctl value 0).

For more information, please refer to:
        draft-ietf-ipsec-ecn-02.txt
        RFC2481 (Explicit Congestion Notification)
        KAME sys/netinet6/{ah,esp}_input.c

(Thanks go to Kenjiro Cho <kjc@csl.sony.co.jp> for detailed analysis)

4.7 Interoperability

IPsec, IPComp (in kernel) and IKE (in userland as "racoon") have been tested
at several interoperability test events, and they are known to interoperate
well with many other implementations. Also, KAME IPsec has quite wide
coverage for IPsec crypto algorithms documented in RFCs (we do not cover
algorithms with intellectual property issues, though).

Here are (some of) the platforms we have tested IPsec/IKE interoperability
with in the past, in no particular order. Note that both ends (KAME and
others) may have modified their implementation, so use the following
list just for reference purposes.
        6WIND, ACC, Allied-telesis, Altiga, Ashley-laurent (vpcom.com),
        BlueSteel, CISCO IOS, Checkpoint FW-1, Compaq Tru64 UNIX
        X5.1B-BL4, Cryptek, Data Fellows (F-Secure), Ericsson,
        F-Secure VPN+ 5.40, Fitec, Fitel, FreeS/WAN, HITACHI, HiFn,
        IBM AIX 5.1, III, IIJ (fujie stack), Intel Canada, Intel
        Packet Protect, MEW NetCocoon, MGCS, Microsoft WinNT/2000/XP,
        NAI PGPnet, NEC IX5000, NIST (linux IPsec + plutoplus),
        NetLock, Netoctave, Netopia, Netscreen, Nokia EPOC, Nortel
        GatewayController/CallServer 2000 (not released yet),
        NxNetworks, OpenBSD isakmpd on OpenBSD, Oullim information
        technologies SECUREWORKS VPN gateway 3.0, Pivotal, RSA,
        Radguard, RapidStream, RedCreek, Routerware, SSH, SecGo
        CryptoIP v3, Secure Computing, Soliton, Sun Solaris 8,
        TIS/NAI Gauntlet, Toshiba, Trilogy AdmitOne 2.6, Trustworks
        TrustedClient v3.2, USAGI linux, VPNet, Yamaha RT series,
        ZyXEL

Here are (some of) the platforms we have tested IPComp/IKE interoperability
with in the past, in no particular order.
        Compaq, IRE, SSH, NetLock, FreeS/WAN, F-Secure VPN+ 5.40

VPNC (vpnc.org) provides IPsec conformance tests, using KAME and OpenBSD
IPsec/IKE implementations. Their test results are available at
http://www.vpnc.org/conformance.html, and they may give you a better idea
of which implementations interoperate with the KAME IPsec/IKE implementation.

4.8 Operations with IPsec tunnel mode

First of all, the IPsec tunnel is a very hairy thing. It seems to do neat
things like VPN configuration or secure remote access; however, it comes with
lots of architectural twists.

RFC2401 defines IPsec tunnel mode, within the context of IPsec. RFC2401
defines tunnel mode packet encapsulation/decapsulation on its own, and
does not refer to other tunnelling specifications. Since RFC2401 advocates
filter-based SPD database matches, it would be natural for us to implement
IPsec tunnel mode as filters - not as pseudo interfaces.

There are some people who are trying to separate IPsec "tunnel mode" from
IPsec itself. They would like to implement IPsec transport mode only,
and combine it with tunneling pseudo devices. The prime example is found
in draft-touch-ipsec-vpn-01.txt. However, if you really define pseudo
interfaces separately from IPsec, IKE daemons would need to negotiate
transport mode SAs, instead of tunnel mode SAs. Therefore, we cannot
really mix the RFC2401-based interpretation and the
draft-touch-ipsec-vpn-01.txt interpretation.

The KAME stack can be configured in two ways. You may need
to recompile your kernel to switch the behavior.
- RFC2401 IPsec tunnel mode approach (4.8.1)
- draft-touch-ipsec-vpn approach (4.8.2)
  Works in all kernel configurations, but racoon(8) may not interoperate.

There are pros and cons on these approaches:

RFC2401 IPsec tunnel mode (filter-like) approach
        PRO: SPD lookup fits nicely with packet filters (if you integrate them)
        CON: cannot run routing daemons across IPsec tunnels
        CON: it is very hard to control source address selection on originating
                cases
        ???: IPv6 scope zone is kept the same
draft-touch-ipsec-vpn (transport mode + pseudo-interface) approach
        PRO: run routing daemons across IPsec tunnels
        PRO: source address selection can be done normally, by looking at
                IPsec tunnel pseudo devices
        CON: on outbound, possibility of infinite loops if routing setup
                is wrong
        CON: due to differences in encap/decap logic from RFC2401, it may not
                interoperate with very picky RFC2401 implementations
                (those who check TOS bits, for example)
        CON: cannot negotiate IKE with other IPsec tunnel-mode devices
                (the other end has to implement
        ???: IPv6 scope zone is likely to be different from the real ethernet
                interface

The recommendation is different depending on the situation you have:
- use draft-touch-ipsec-vpn if you have control over the other end.
  this one is the best in terms of simplicity.
- if the other end is a normal IPsec device with an RFC2401 implementation,
  you need to use RFC2401, otherwise you won't be able to run IKE.
- use the RFC2401 approach if you just want to forward packets back and forth
  and there's no plan to use the IPsec gateway itself as an originating device.

4.8.1 RFC2401 IPsec tunnel mode approach

To configure your device as an RFC2401 IPsec tunnel mode endpoint, you will
use the "tunnel" keyword in setkey(8) "spdadd" directives. Let us assume the
following topology (A and B could be a network, like prefix/length):

        ((((((((((((The internet))))))))))))
          |                       |
          |C (global)             |D
        your device             peer's device
          |A (private)            |B
        ==+===== VPN net        ==+===== VPN net

The policy configuration directive is like this. You will need manual
SAs, or an IKE daemon, for actual encryption:

        # setkey -c <<EOF
        spdadd A B any -P out ipsec esp/tunnel/C-D/use;
        spdadd B A any -P in ipsec esp/tunnel/D-C/use;
        EOF

The inbound/outbound traffic is monitored/captured by the SPD engine, which
works just like packet filters.

With this, the forwarding case should work flawlessly. However, troubles arise
when you have one of the following requirements:
- When you originate traffic from your VPN gateway device to the VPN net on
  the other end (like B), you want your source address to be A (private side)
  so that the traffic would be protected by the policy.
  With this approach, however, the source address selection logic follows
  the normal routing table, and C (global side) will be picked for any
  outgoing traffic, even if the destination is B. The resulting packet will
  be like this:
        IP[C -> B] payload
  and will not match the policy (= sent in clear).
- When you want to run routing protocols on top of the IPsec tunnel, it is
  not possible.
  As there is no pseudo device that identifies the IPsec tunnel,
  you cannot identify where the routing information came from. As a result,
  you can't run routing daemons.

4.8.2 draft-touch-ipsec-vpn approach

With this approach, you will configure gif(4) tunnel interfaces, as well as
IPsec transport mode SAs.

        # gifconfig gif0 C D
        # ifconfig gif0 A B
        # setkey -c <<EOF
        spdadd C D any -P out ipsec esp/transport//use;
        spdadd D C any -P in ipsec esp/transport//use;
        EOF

Since we have a pseudo-interface "gif0", and it affects the routes and
the source address selection logic, we can have source address A, for
packets originated by the VPN gateway to B (and the VPN cloud).
We can also exchange routing information over the tunnel (gif0), as the tunnel
is represented as a pseudo interface (dynamic routes point to the
pseudo interface).

There is a big drawback, however; with this, you can use IKE if and only if
the other end is using the draft-touch-ipsec-vpn approach too. Since racoon(8)
grabs phase 2 IKE proposals from the kernel SPD database, you will be
negotiating IPsec transport-mode SAs with the other end, not tunnel-mode SAs.
Also, since the encapsulation mechanism is different from RFC2401, you may not
be able to interoperate with a picky RFC2401 implementation - if the other
end checks certain outer IP header fields (like TOS), you will not be able to
interoperate.


5. ALTQ

KAME kit includes ALTQ, which supports FreeBSD3, FreeBSD4, FreeBSD5 and
NetBSD.
OpenBSD has ALTQ merged into pf and its ALTQ code is not
compatible with other platforms, so KAME's ALTQ is not used for
OpenBSD. For BSD/OS, ALTQ does not work.
ALTQ in KAME supports IPv6.
(actually, ALTQ is developed on KAME repository since ALTQ 2.1 - Jan 2000)

ALTQ occupies a single character device number. For FreeBSD, it is officially
allocated. For OpenBSD and NetBSD, we use a number which is not
currently allocated (it will eventually get an official number).
The character device is enabled for the i386 architecture only. To enable and
compile an ALTQ-ready kernel for other architectures, take the following
steps:
- assume that your architecture is FOOBAA.
- modify sys/arch/FOOBAA/FOOBAA/conf.c (or somewhere that defines cdevsw),
  to include a line for ALTQ. look at sys/arch/i386/i386/conf.c for
  example. The major number must be the same as in the i386 case.
- copy a kernel configuration file (like ALTQ.v6 or GENERIC.v6) from i386,
  and modify it accordingly.
- build a kernel.
- before building userland, change netbsd/{lib,usr.sbin,usr.bin}/Makefile
  (or openbsd/foobaa) so that it will visit altq-related sub directories.


6. Mobile IPv6

6.1 KAME node as correspondent node

The default installation recognizes the home address option (in the
destination options header). No sub-options are supported. Interaction with
IPsec, and/or the 2292bis API, needs further study.

6.2 KAME node as home agent/mobile node

KAME kit includes Ericsson mobile-ip6 code. The integration has just started
(in Feb 2000), and we will need some more time to integrate it better.

See kame/mip6config/{QUICKSTART,README_MIP6.txt} for more details.

The Ericsson code implements revision 09 of the mobile-ip6 draft. There
are other implementations available:
        NEC: http://www.6bone.nec.co.jp/mipv6/internal-dist/ (-13 draft)
        SFC: http://neo.sfc.wide.ad.jp/~mip6/ (-13 draft)

7. Coding style

The KAME developers basically do not fuss about coding
style. However, there is still some agreement on the style, in order
to make the distributed development smooth.

- follow *BSD KNF where possible. note: there are multiple KNF standards.
- the tab character should be 8 columns wide (tabstops are at 8, 16, 24, ...
  column). With vi, use ":set ts=8 sw=8".
  With GNU Emacs 20 and later, the easiest way is to use the "bsd" style of
  cc-mode with the variable "c-basic-offset" being 8;
      (add-hook 'c-mode-common-hook
                (function
                 (lambda ()
                   (c-set-style "bsd")
                   (setq c-basic-offset 8) ; XXX for Emacs 20 only
                   )))
  The "bsd" style in GNU Emacs 21 sets the variable to 8 by default,
  so the line marked by "XXX" is not necessary if you only use GNU
  Emacs 21.
- each line should be within 80 characters.
- keep a single open/close bracket in a comment such as in the following
  line:
        putchar('(');   /* ) */
  without this, some vi users would have a hard time matching a pair of
  brackets. Although this type of bracket seems clumsy and is even
  harmful for some other types of vi users and Emacs users, the
  agreement among the KAME developers is to allow it.
- add the following line to the head of every KAME-derived file:
        /*      (dollar)KAME(dollar)    */
  where "(dollar)" is the dollar character ($), and around "$" are tabs.
  (this is for C.  For other languages, you should use the language's own
  comment syntax.)
  Once committed to the CVS repository, this line will contain its
  version number (see, for example, the top of this file).  This
  makes it easy to report a bug.
- when creating a new file with the WIDE copyright, type
  "make copyright.c" at the top level, and use copyright.c as a template.
  The KAME RCS tag will be included automatically.
- when editing a third-party package, keep its own coding style as
  much as possible, even if the style does not follow the items above.
- it is recommended to always wrap an expression containing
  bitwise operators in parentheses, especially when the expression is
  combined with relational operators, in order to avoid an unintentional
  mismatch of operators.  Thus, we should write
        if ((a & b) == 0)       /* (A) */
  or
        if (a & (b == 0))       /* (B) */
  instead of
        if (a & b == 0)         /* (C) */
  even if the programmer's intention was (C), which is equivalent to
  (B) according to the grammar of the language C.
  Thus, we should write code to test whether a bit-flag is set for a
  given variable as follows:
        if ((flag & FLAG_A) == 0)       /* (D) the FLAG_A is NOT set */
        if ((flag & FLAG_A) != 0)       /* (E) the FLAG_A is set */
  Some developers in the KAME project rather prefer the following style:
        if (!(flag & FLAG_A))           /* (F) the FLAG_A is NOT set */
        if ((flag & FLAG_A))            /* (G) the FLAG_A is set */
  because it would be more intuitive in terms of the relationship
  between the negation operator (!) and the semantics of the
  condition.  The KAME developers have discussed the style, and have
  agreed that all the styles from (D) to (G) are valid.  So, when you
  see styles like (D) and (E) in the KAME code and feel a bit strange,
  please just keep them.  They are intentional.
- When inserting a separate block just to define some intra-block
  variables, add a level of indentation as if the block were in a
  control statement such as if-else, for, or while.  For example,
        foo ()
        {
                int a;

                {
                        int internal_a;
                        ...
                }
        }
  should be used, instead of
        foo ()
        {
                int a;

            {
                int internal_a;
                ...
            }
        }
- Do not use printf() or log() in the packet input path of the kernel code.
  They can make the system vulnerable to packet flooding attacks (which
  result in /var overflow).
- (not a style issue)
  To disable a module that is mistakenly imported (by CVS), just
  remove the source tree in the repository.
  Note, however, that the removal might annoy other developers who have
  already checked the module out, so you should announce the removal as
  soon as possible.  Also, be 100% sure not to remove other modules.

When you want to contribute something to the KAME project, and if *you
do not mind* the agreement, it would be helpful for the project if you
kept these rules.  Note, however, that we never intend to force
you to adopt our rules.  We would rather respect your own style,
especially when you have a policy about the style.


8. Policy on technology with intellectual property right restriction

There are quite a few IETF documents (and the like) which have
intellectual property right (IPR) restrictions.  KAME's stance is stated
below.

    The goal of KAME is to provide a freely redistributable, BSD-licensed
    implementation of Internet protocol technologies.
    For this purpose, we implement protocols that (1) do not need a license
    contract with the IPR holder, and (2) are royalty-free.
    The reason for (1) is that, even if KAME contracts with the IPR holder
    in question, the users of the KAME stack (usually implementers of some
    other codebase) would need to make a license contract with the IPR
    holder.  That would damage the "freely redistributable" status of the
    KAME codebase.

    By doing so KAME is (implicitly) trying to advocate
    no-license-contract, royalty-free release of IPRs.

Note however that, as documented in README, we do not guarantee that KAME
code is free of IPR infringement; you MUST check it if you are to integrate
KAME into your product (or whatever):
    READ CAREFULLY: Several countries have legal enforcement for
    export/import/use of cryptographic software.  Check it before playing
    with the kit.  We do not intend to be your legalese clearing house
    (NO WARRANTY).  If you intend to include KAME stack into your product,
    you'll need to check if the licenses on each file fit your situations,
    and/or possible intellectual property right issues.

                                                <end of IMPLEMENTATION>