Network Working Group                                         P. Hoffman
Request for Comments: 3491                                    IMC & VPNC
Category: Standards Track                                    M. Blanchet
                                                                Viagenie
                                                              March 2003


                   Nameprep: A Stringprep Profile for
                  Internationalized Domain Names (IDN)

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

Abstract

   This document describes how to prepare internationalized domain name
   (IDN) labels in order to increase the likelihood that name input and
   name comparison work in ways that make sense for typical users
   throughout the world.  This profile of the stringprep protocol is
   used as part of a suite of on-the-wire protocols for
   internationalizing the Domain Name System (DNS).

1. Introduction

   This document specifies processing rules that will allow users to
   enter internationalized domain names (IDNs) into applications and
   have the highest chance of getting the content of the strings
   correct.  It is a profile of stringprep [STRINGPREP].  These
   processing rules are only intended for internationalized domain
   names, not for arbitrary text.

   This profile defines the following, as required by [STRINGPREP].

   -  The intended applicability of the profile: internationalized
      domain names processed by IDNA.

   -  The character repertoire that is the input and output to
      stringprep:  Unicode 3.2, specified in section 2.

Hoffman & Blanchet          Standards Track                     [Page 1]

RFC 3491                      IDN Nameprep                    March 2003

   -  The mappings used: specified in section 3.

   -  The Unicode normalization used: specified in section 4.

   -  The characters that are prohibited as output: specified in section
      5.

   -  Bidirectional character handling: specified in section 6.

1.1 Interaction of protocol parts

   Nameprep is used by the IDNA [IDNA] protocol for preparing domain
   names; it is not designed for any other purpose.  It is explicitly
   not designed for processing arbitrary free text and SHOULD NOT be
   used for that purpose.  Nameprep is a profile of Stringprep
   [STRINGPREP].  Implementations of Nameprep MUST fully implement
   Stringprep.

   Nameprep is used to process domain name labels, not domain names.
   IDNA calls nameprep for each label in a domain name, not for the
   whole domain name.

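   As an informal illustration of the per-label rule above, the sketch
   below applies Nameprep to each label in turn.  It assumes Python and
   the nameprep() function of the standard encodings.idna module.  The
   helper name prepare_labels is hypothetical, and splitting on "."
   alone is a simplification, since label separation is governed by
   [IDNA] and not by this profile.

```python
from typing import List
from encodings.idna import nameprep


def prepare_labels(domain: str) -> List[str]:
    """Apply Nameprep to each label, never to the whole name.

    Hypothetical helper, not part of this specification.
    """
    # Assumption: labels are separated by "." only; [IDNA] defines the
    # actual label separation rules.
    return [nameprep(label) for label in domain.split(".")]


if __name__ == "__main__":
    print(prepare_labels("Example.COM"))
```

   Passing the whole name to nameprep() in one call would not match the
   rule above, which is why the helper splits before preparing.
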
1.2 Terminology

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as described in BCP 14, RFC
   2119 [RFC2119].

2. Character Repertoire

   This profile uses Unicode 3.2, as defined in [STRINGPREP] Appendix A.

3. Mapping

   This profile specifies mapping using the following tables from
   [STRINGPREP]:

   Table B.1
   Table B.2

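   The mapping step can be sketched informally as follows.  The sketch
   assumes Python's standard stringprep module, whose in_table_b1() and
   map_table_b2() functions correspond to Tables B.1 and B.2; the name
   map_step is hypothetical.  Characters in Table B.1 are mapped to
   nothing, and all other characters are passed through the Table B.2
   case-folding map.

```python
import stringprep


def map_step(label: str) -> str:
    """Apply the Table B.1 and B.2 mappings (hypothetical helper)."""
    mapped = []
    for ch in label:
        if stringprep.in_table_b1(ch):
            # Table B.1: commonly mapped to nothing, so drop it.
            continue
        # Table B.2: case folding intended for use with NFKC; one input
        # character may map to several output characters.
        mapped.append(stringprep.map_table_b2(ch))
    return "".join(mapped)


if __name__ == "__main__":
    # U+00AD SOFT HYPHEN is listed in Table B.1.
    print(map_step("A\u00adB"))
```
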
4. Normalization

   This profile specifies using Unicode normalization form KC, as
   described in [STRINGPREP].

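   A minimal sketch of the normalization step, assuming Python: the
   standard unicodedata module exposes a Unicode 3.2 database as
   ucd_3_2_0, which matches the repertoire named in section 2.  The
   name normalize_step is hypothetical.

```python
from unicodedata import ucd_3_2_0


def normalize_step(label: str) -> str:
    """Normalize to form KC using Unicode 3.2 data (hypothetical helper)."""
    return ucd_3_2_0.normalize("NFKC", label)


if __name__ == "__main__":
    # U+FF41 FULLWIDTH LATIN SMALL LETTER A has a compatibility
    # decomposition to "a".
    print(normalize_step("\uff41"))
```

   Using the version-pinned database rather than the current one is a
   deliberate choice in this sketch, since later Unicode versions may
   treat code points differently from Unicode 3.2.
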
5. Prohibited Output

   This profile prohibits, as output, the characters listed in the
   following tables from [STRINGPREP]:

   Table C.1.2
   Table C.2.2
   Table C.3
   Table C.4
   Table C.5
   Table C.6
   Table C.7
   Table C.8
   Table C.9

   IMPORTANT NOTE: This profile MUST be used with the IDNA protocol.
   The IDNA protocol has additional prohibitions that are checked
   outside of this profile.

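   The prohibition check can be sketched informally as below, assuming
   Python's standard stringprep module, which provides one membership
   function per table.  The names PROHIBITED_TABLES and find_prohibited
   are hypothetical.  The sketch covers only the tables listed in this
   section and none of the additional checks made by IDNA.

```python
import stringprep
from typing import List

# Membership tests for the tables listed in section 5, in the same order.
PROHIBITED_TABLES = (
    stringprep.in_table_c12,
    stringprep.in_table_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def find_prohibited(label: str) -> List[str]:
    """Return the characters that fall in any listed table.

    Hypothetical helper; an empty list means no prohibited output.
    """
    return [ch for ch in label
            if any(in_table(ch) for in_table in PROHIBITED_TABLES)]


if __name__ == "__main__":
    # U+00A0 NO-BREAK SPACE is a non-ASCII space (Table C.1.2).
    print(find_prohibited("a\u00a0b"))
```

   The ASCII space is not reported by this sketch, because Table C.1.1
   is not among the tables listed above.
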
6. Bidirectional characters

   This profile specifies checking bidirectional strings as described in
   [STRINGPREP] section 6.

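   As a rough sketch of that check, and assuming the usual reading of
   [STRINGPREP] section 6: a string containing any RandALCat character
   (Table D.1) must not contain any LCat character (Table D.2), and
   must both begin and end with a RandALCat character.  The sketch
   assumes Python's standard stringprep module; the name bidi_ok is
   hypothetical.  The Table C.8 prohibition, which is also part of
   section 6, is left to the prohibited output step.

```python
import stringprep


def bidi_ok(label: str) -> bool:
    """Check the bidirectional rules on a label (hypothetical helper)."""
    if not any(stringprep.in_table_d1(ch) for ch in label):
        # No RandALCat characters, so the remaining rules do not apply.
        return True
    if any(stringprep.in_table_d2(ch) for ch in label):
        # RandALCat and LCat characters must not be mixed.
        return False
    # The first and last characters must both be RandALCat.
    return (stringprep.in_table_d1(label[0])
            and stringprep.in_table_d1(label[-1]))


if __name__ == "__main__":
    # U+05D0 HEBREW LETTER ALEF followed by a Latin letter mixes
    # RandALCat with LCat.
    print(bidi_ok("\u05d0a"))
```
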
7. Unassigned Code Points in Internationalized Domain Names

   If the processing in [IDNA] specifies that a list of unassigned code
   points be used, the system uses table A.1 from [STRINGPREP] as its
   list of unassigned code points.

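   A short sketch of a Table A.1 lookup, assuming Python's standard
   stringprep module and its in_table_a1() function.  The name
   has_unassigned is hypothetical, and whether the result is acted upon
   is decided by [IDNA], not by this sketch.

```python
import stringprep


def has_unassigned(label: str) -> bool:
    """Report whether any code point is in Table A.1 (hypothetical helper)."""
    return any(stringprep.in_table_a1(ch) for ch in label)


if __name__ == "__main__":
    # U+0221 is unassigned in Unicode 3.2 and is listed in Table A.1.
    print(has_unassigned("\u0221"))
```
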
8. References

8.1 Normative References

   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
                Requirement Levels", BCP 14, RFC 2119, March 1997.

   [STRINGPREP] Hoffman, P. and M. Blanchet, "Preparation of
                Internationalized Strings ("stringprep")", RFC 3454,
                December 2002.

   [IDNA]       Faltstrom, P., Hoffman, P. and A. Costello,
                "Internationalizing Domain Names in Applications
                (IDNA)", RFC 3490, March 2003.

8.2 Informative references

   [STD13]      Mockapetris, P., "Domain names - concepts and
                facilities", STD 13, RFC 1034, and "Domain names -
                implementation and specification", STD 13, RFC 1035,
                November 1987.

9. Security Considerations

   The Unicode and ISO/IEC 10646 repertoires have many characters that
   look similar.  In many cases, users of security protocols might do
   visual matching, such as when comparing the names of trusted third
   parties.  Because it is impossible to map similar-looking characters
   without a great deal of context such as knowing the fonts used,
   stringprep does nothing to map similar-looking characters together
   nor to prohibit some characters because they look like others.

   Security on the Internet partly relies on the DNS.  Thus, any change
   to the characteristics of the DNS can change the security of much of
   the Internet.

   Domain names are used by users to connect to Internet servers.  The
   security of the Internet would be compromised if a user entering a
   single internationalized name could be connected to different servers
   based on different interpretations of the internationalized domain
   name.

   Current applications might assume that the characters allowed in
   domain names will always be the same as they are in [STD13].  This
   document vastly increases the number of characters available in
   domain names.  Every program that uses "special" characters in
   conjunction with domain names may be vulnerable to attack based on
   the new characters allowed by this specification.

10. IANA Considerations

   This is a profile of stringprep.  It has been registered by the IANA
   in the stringprep profile registry
   (www.iana.org/assignments/stringprep-profiles).

      Name of this profile:
         Nameprep

      RFC in which the profile is defined:
         This document.

      Indicator whether or not this is the newest version of the
      profile:
         This is the first version of Nameprep.

11. Acknowledgements

   Many people from the IETF IDN Working Group and the Unicode Technical
   Committee contributed ideas that went into this document.

   The IDN Nameprep design team made many useful changes to the
   document.  That team and its advisors include:

      Asmus Freytag
      Cathy Wissink
      Francois Yergeau
      James Seng
      Marc Blanchet
      Mark Davis
      Martin Duerst
      Patrik Faltstrom
      Paul Hoffman

   Additional significant improvements were proposed by:

      Jonathan Rosenne
      Kent Karlsson
      Scott Hollenbeck
      Dave Crocker
      Erik Nordmark
      Matitiahu Allouche

12. Authors' Addresses

   Paul Hoffman
   Internet Mail Consortium and VPN Consortium
   127 Segre Place
   Santa Cruz, CA  95060 USA

   EMail: paul.hoffman@imc.org and paul.hoffman@vpnc.org


   Marc Blanchet
   Viagenie inc.
   2875 boul. Laurier, bur. 300
   Ste-Foy, Quebec, Canada, G1V 2M2

   EMail: Marc.Blanchet@viagenie.qc.ca

13. Full Copyright Statement

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.