<html lang="en">
<head>
<title>Initial processing - The C Preprocessor</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="The C Preprocessor">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Overview.html#Overview" title="Overview">
<link rel="prev" href="Character-sets.html#Character-sets" title="Character sets">
<link rel="next" href="Tokenization.html#Tokenization" title="Tokenization">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
Copyright (C) 1987, 1989, 1991, 1992, 1993, 1994, 1995, 1996,
1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007,
2008, 2009, 2010, 2011
Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation. A copy of
the license is included in the
section entitled ``GNU Free Documentation License''.

This manual contains no Invariant Sections. The Front-Cover Texts are
(a) (see below), and the Back-Cover Texts are (b) (see below).

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software. Copies published by the Free Software Foundation raise
     funds for GNU development.
-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp { font-size:smaller }
  span.sc { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
<link rel="stylesheet" type="text/css" href="../cs.css">
</head>
<body>
<div class="node">
<a name="Initial-processing"></a>
<p>
Next: <a rel="next" accesskey="n" href="Tokenization.html#Tokenization">Tokenization</a>,
Previous: <a rel="previous" accesskey="p" href="Character-sets.html#Character-sets">Character sets</a>,
Up: <a rel="up" accesskey="u" href="Overview.html#Overview">Overview</a>
<hr>
</div>

<h3 class="section">1.2 Initial processing</h3>

<p>The preprocessor performs a series of textual transformations on its
input. These happen before all other processing. Conceptually, they
happen in a rigid order, and the entire file is run through each
transformation before the next one begins. CPP actually does them
all at once, for performance reasons. These transformations correspond
roughly to the first three “phases of translation” described in the C
standard.

     <ol type=1 start=1>
<li><a name="index-line-endings-1"></a>The input file is read into memory and broken into lines.

     <p>Different systems use different conventions to indicate the end of a
line. GCC accepts the ASCII control sequences <kbd>LF</kbd>, <kbd>CR LF<!-- /@w --></kbd> and <kbd>CR</kbd> as end-of-line markers. These are the canonical
sequences used by Unix, DOS and VMS, and the classic Mac OS (before
OS X) respectively.
You may therefore safely copy source code written
on any of those systems to a different one and use it without
conversion. (GCC may lose track of the current line number if a file
doesn't consistently use one convention, as sometimes happens when it
is edited on computers with different conventions that share a network
file system.)

     <p>If the last line of any input file lacks an end-of-line marker, the end
of the file is considered to implicitly supply one. The C standard says
that this condition provokes undefined behavior, so GCC will emit a
warning message.

     <li><a name="index-trigraphs-2"></a><a name="trigraphs"></a>If trigraphs are enabled, they are replaced by their
corresponding single characters. By default GCC ignores trigraphs,
but if you request a strictly conforming mode with the <samp><span class="option">-std</span></samp>
option, or you specify the <samp><span class="option">-trigraphs</span></samp> option, then it
converts them.

     <p>These are nine three-character sequences, all starting with ‘<samp><span class="samp">??</span></samp>’,
that are defined by ISO C to stand for single characters. They permit
obsolete systems that lack some of C's punctuation to use C. For
example, ‘<samp><span class="samp">??/</span></samp>’ stands for ‘<samp><span class="samp">\</span></samp>’, so <tt>'??/n'</tt> is a character
constant for a newline.

     <p>Trigraphs are not popular and many compilers implement them
incorrectly. Portable code should not rely on trigraphs being either
converted or ignored. With <samp><span class="option">-Wtrigraphs</span></samp> GCC will warn you
when a trigraph would change the meaning of your program if it were
converted. See <a href="Wtrigraphs.html#Wtrigraphs">Wtrigraphs</a>.

     <p>In a string constant, you can prevent a sequence of question marks
from being confused with a trigraph by inserting a backslash between
the question marks, or by separating the string literal at the
trigraph and making use of string literal concatenation. <tt>"(??\?)"</tt>
is the string ‘<samp><span class="samp">(???)</span></samp>’, not ‘<samp><span class="samp">(?]</span></samp>’. Traditional C compilers
do not recognize these idioms.

     <p>The nine trigraphs and their replacements are

     <pre class="smallexample">     Trigraph:       ??(  ??)  ??&lt;  ??&gt;  ??=  ??/  ??'  ??!  ??-
     Replacement:      [    ]    {    }    #    \    ^    |    ~
</pre>
     <li><a name="index-continued-lines-3"></a><a name="index-backslash_002dnewline-4"></a>Continued lines are merged into one long line.

     <p>A continued line is a line which ends with a backslash, ‘<samp><span class="samp">\</span></samp>’. The
backslash is removed and the following line is joined with the current
one. No space is inserted, so you may split a line anywhere, even in
the middle of a word. (It is generally more readable to split lines
only at white space.)

     <p>The trailing backslash on a continued line is commonly referred to as a
<dfn>backslash-newline</dfn>.

     <p>If there is white space between a backslash and the end of a line, that
is still a continued line. However, as this is usually the result of an
editing mistake, and many compilers will not accept it as a continued
line, GCC will warn you about it.

     <li><a name="index-comments-5"></a><a name="index-line-comments-6"></a><a name="index-block-comments-7"></a>All comments are replaced with single spaces.

     <p>There are two kinds of comments. <dfn>Block comments</dfn> begin with
‘<samp><span class="samp">/*</span></samp>’ and continue until the next ‘<samp><span class="samp">*/</span></samp>’.
Block comments do not
nest:

     <pre class="smallexample">     /* <span class="roman">this is</span> /* <span class="roman">one comment</span> */ <span class="roman">text outside comment</span>
</pre>
     <p><dfn>Line comments</dfn> begin with ‘<samp><span class="samp">//</span></samp>’ and continue to the end of the
current line. Line comments do not nest either, but it does not matter,
because they would end in the same place anyway.

     <pre class="smallexample">     // <span class="roman">this is</span> // <span class="roman">one comment</span>
     <span class="roman">text outside comment</span>
</pre>
     </ol>

   <p>It is safe to put line comments inside block comments, or vice versa.

<pre class="smallexample">     /* <span class="roman">block comment</span>
        // <span class="roman">contains line comment</span>
        <span class="roman">yet more comment</span>
      */ <span class="roman">outside comment</span>

     // <span class="roman">line comment</span> /* <span class="roman">contains block comment</span> */
</pre>
   <p>But beware of commenting out one end of a block comment with a line
comment.

<pre class="smallexample">     // <span class="roman">l.c.</span> /* <span class="roman">block comment begins</span>
        <span class="roman">oops! this isn't a comment anymore</span> */
</pre>
   <p>Comments are not recognized within string literals.
<tt>"/* blah */"<!-- /@w --></tt> is the string constant ‘<samp><span class="samp">/* blah */<!-- /@w --></span></samp>’, not
an empty string.

   <p>Line comments are not in the 1989 edition of the C standard, but they
are recognized by GCC as an extension. In C++ and in the 1999 edition
of the C standard, they are an official part of the language.

   <p>Since these transformations happen before all other processing, you can
split a line mechanically with backslash-newline anywhere. You can
comment out the end of a line.
You can continue a line comment onto the
next line with backslash-newline. You can even split ‘<samp><span class="samp">/*</span></samp>’,
‘<samp><span class="samp">*/</span></samp>’, and ‘<samp><span class="samp">//</span></samp>’ onto multiple lines with backslash-newline.
For example:

<pre class="smallexample">     /\
     *
     */ # /*
     */ defi\
     ne FO\
     O 10\
     20
</pre>
   <p class="noindent">is equivalent to <code>#define FOO 1020<!-- /@w --></code>. All these tricks are
extremely confusing and should not be used in code intended to be
readable.

   <p>There is no way to prevent a backslash at the end of a line from being
interpreted as a backslash-newline. This cannot affect any correct
program, however.

   </body></html>