1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2                      "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6  <title>Kaleidoscope: Extending the Language: Control Flow</title>
7  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8  <meta name="author" content="Chris Lattner">
9  <link rel="stylesheet" href="/_static/llvm.css" type="text/css">
10</head>
11
12<body>
13
14<h1>Kaleidoscope: Extending the Language: Control Flow</h1>
15
16<ul>
17<li><a href="index.html">Up to Tutorial Index</a></li>
18<li>Chapter 5
19  <ol>
20    <li><a href="#intro">Chapter 5 Introduction</a></li>
21    <li><a href="#ifthen">If/Then/Else</a>
22    <ol>
23      <li><a href="#iflexer">Lexer Extensions</a></li>
24      <li><a href="#ifast">AST Extensions</a></li>
25      <li><a href="#ifparser">Parser Extensions</a></li>
26      <li><a href="#ifir">LLVM IR</a></li>
27      <li><a href="#ifcodegen">Code Generation</a></li>
28    </ol>
29    </li>
30    <li><a href="#for">'for' Loop Expression</a>
31    <ol>
32      <li><a href="#forlexer">Lexer Extensions</a></li>
33      <li><a href="#forast">AST Extensions</a></li>
34      <li><a href="#forparser">Parser Extensions</a></li>
35      <li><a href="#forir">LLVM IR</a></li>
36      <li><a href="#forcodegen">Code Generation</a></li>
37    </ol>
38    </li>
39    <li><a href="#code">Full Code Listing</a></li>
40  </ol>
41</li>
42<li><a href="LangImpl6.html">Chapter 6</a>: Extending the Language: 
43User-defined Operators</li>
44</ul>
45
46<div class="doc_author">
47  <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
48</div>
49
50<!-- *********************************************************************** -->
51<h2><a name="intro">Chapter 5 Introduction</a></h2>
52<!-- *********************************************************************** -->
53
54<div>
55
56<p>Welcome to Chapter 5 of the "<a href="index.html">Implementing a language
57with LLVM</a>" tutorial.  Parts 1-4 described the implementation of the simple
58Kaleidoscope language and included support for generating LLVM IR, followed by
59optimizations and a JIT compiler.  Unfortunately, as presented, Kaleidoscope is
60mostly useless: it has no control flow other than call and return.  This means
61that you can't have conditional branches in the code, significantly limiting its
62power.  In this episode of "build that compiler", we'll extend Kaleidoscope to
63have an if/then/else expression plus a simple 'for' loop.</p>
64
65</div>
66
67<!-- *********************************************************************** -->
68<h2><a name="ifthen">If/Then/Else</a></h2>
69<!-- *********************************************************************** -->
70
71<div>
72
73<p>
74Extending Kaleidoscope to support if/then/else is quite straightforward.  It
75basically requires adding support for this "new" concept to the lexer,
76parser, AST, and LLVM code emitter.  This example is nice, because it shows how
77easy it is to "grow" a language over time, incrementally extending it as new
78ideas are discovered.</p>
79
80<p>Before we get going on "how" we add this extension, lets talk about "what" we
81want.  The basic idea is that we want to be able to write this sort of thing:
82</p>
83
84<div class="doc_code">
85<pre>
86def fib(x)
87  if x &lt; 3 then
88    1
89  else
90    fib(x-1)+fib(x-2);
91</pre>
92</div>
93
94<p>In Kaleidoscope, every construct is an expression: there are no statements.
95As such, the if/then/else expression needs to return a value like any other.
96Since we're using a mostly functional form, we'll have it evaluate its
97conditional, then return the 'then' or 'else' value based on how the condition
98was resolved.  This is very similar to the C "?:" expression.</p>
99
100<p>The semantics of the if/then/else expression is that it evaluates the
101condition to a boolean equality value: 0.0 is considered to be false and
102everything else is considered to be true.
103If the condition is true, the first subexpression is evaluated and returned, if
104the condition is false, the second subexpression is evaluated and returned.
105Since Kaleidoscope allows side-effects, this behavior is important to nail down.
106</p>
107
108<p>Now that we know what we "want", lets break this down into its constituent
109pieces.</p>
110
111<!-- ======================================================================= -->
112<h4><a name="iflexer">Lexer Extensions for If/Then/Else</a></h4>
113<!-- ======================================================================= -->
114
115
116<div>
117
118<p>The lexer extensions are straightforward.  First we add new enum values
119for the relevant tokens:</p>
120
121<div class="doc_code">
122<pre>
123  // control
124  tok_if = -6, tok_then = -7, tok_else = -8,
125</pre>
126</div>
127
128<p>Once we have that, we recognize the new keywords in the lexer. This is pretty simple
129stuff:</p>
130
131<div class="doc_code">
132<pre>
133    ...
134    if (IdentifierStr == "def") return tok_def;
135    if (IdentifierStr == "extern") return tok_extern;
136    <b>if (IdentifierStr == "if") return tok_if;
137    if (IdentifierStr == "then") return tok_then;
138    if (IdentifierStr == "else") return tok_else;</b>
139    return tok_identifier;
140</pre>
141</div>
142
143</div>
144
145<!-- ======================================================================= -->
146<h4><a name="ifast">AST Extensions for If/Then/Else</a></h4>
147<!-- ======================================================================= -->
148
149<div>
150
151<p>To represent the new expression we add a new AST node for it:</p>
152
153<div class="doc_code">
154<pre>
155/// IfExprAST - Expression class for if/then/else.
156class IfExprAST : public ExprAST {
157  ExprAST *Cond, *Then, *Else;
158public:
159  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
160    : Cond(cond), Then(then), Else(_else) {}
161  virtual Value *Codegen();
162};
163</pre>
164</div>
165
166<p>The AST node just has pointers to the various subexpressions.</p>
167
168</div>
169
170<!-- ======================================================================= -->
171<h4><a name="ifparser">Parser Extensions for If/Then/Else</a></h4>
172<!-- ======================================================================= -->
173
174<div>
175
176<p>Now that we have the relevant tokens coming from the lexer and we have the
177AST node to build, our parsing logic is relatively straightforward.  First we
178define a new parsing function:</p>
179
180<div class="doc_code">
181<pre>
182/// ifexpr ::= 'if' expression 'then' expression 'else' expression
183static ExprAST *ParseIfExpr() {
184  getNextToken();  // eat the if.
185  
186  // condition.
187  ExprAST *Cond = ParseExpression();
188  if (!Cond) return 0;
189  
190  if (CurTok != tok_then)
191    return Error("expected then");
192  getNextToken();  // eat the then
193  
194  ExprAST *Then = ParseExpression();
195  if (Then == 0) return 0;
196  
197  if (CurTok != tok_else)
198    return Error("expected else");
199  
200  getNextToken();
201  
202  ExprAST *Else = ParseExpression();
203  if (!Else) return 0;
204  
205  return new IfExprAST(Cond, Then, Else);
206}
207</pre>
208</div>
209
210<p>Next we hook it up as a primary expression:</p>
211
212<div class="doc_code">
213<pre>
214static ExprAST *ParsePrimary() {
215  switch (CurTok) {
216  default: return Error("unknown token when expecting an expression");
217  case tok_identifier: return ParseIdentifierExpr();
218  case tok_number:     return ParseNumberExpr();
219  case '(':            return ParseParenExpr();
220  <b>case tok_if:         return ParseIfExpr();</b>
221  }
222}
223</pre>
224</div>
225
226</div>
227
228<!-- ======================================================================= -->
229<h4><a name="ifir">LLVM IR for If/Then/Else</a></h4>
230<!-- ======================================================================= -->
231
232<div>
233
234<p>Now that we have it parsing and building the AST, the final piece is adding
235LLVM code generation support.  This is the most interesting part of the
236if/then/else example, because this is where it starts to introduce new concepts.
237All of the code above has been thoroughly described in previous chapters.
238</p>
239
240<p>To motivate the code we want to produce, lets take a look at a simple
241example.  Consider:</p>
242
243<div class="doc_code">
244<pre>
245extern foo();
246extern bar();
247def baz(x) if x then foo() else bar();
248</pre>
249</div>
250
251<p>If you disable optimizations, the code you'll (soon) get from Kaleidoscope
252looks like this:</p>
253
254<div class="doc_code">
255<pre>
256declare double @foo()
257
258declare double @bar()
259
260define double @baz(double %x) {
261entry:
262  %ifcond = fcmp one double %x, 0.000000e+00
263  br i1 %ifcond, label %then, label %else
264
265then:		; preds = %entry
266  %calltmp = call double @foo()
267  br label %ifcont
268
269else:		; preds = %entry
270  %calltmp1 = call double @bar()
271  br label %ifcont
272
273ifcont:		; preds = %else, %then
274  %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
275  ret double %iftmp
276}
277</pre>
278</div>
279
280<p>To visualize the control flow graph, you can use a nifty feature of the LLVM
281'<a href="http://llvm.org/cmds/opt.html">opt</a>' tool.  If you put this LLVM IR
282into "t.ll" and run "<tt>llvm-as &lt; t.ll | opt -analyze -view-cfg</tt>", <a
283href="/ProgrammersManual.html#ViewGraph">a window will pop up</a> and you'll
284see this graph:</p>
285
286<div style="text-align: center"><img src="LangImpl5-cfg.png" alt="Example CFG" width="423" 
287height="315"></div>
288
289<p>Another way to get this is to call "<tt>F-&gt;viewCFG()</tt>" or
290"<tt>F-&gt;viewCFGOnly()</tt>" (where F is a "<tt>Function*</tt>") either by
291inserting actual calls into the code and recompiling or by calling these in the
292debugger.  LLVM has many nice features for visualizing various graphs.</p>
293
294<p>Getting back to the generated code, it is fairly simple: the entry block 
295evaluates the conditional expression ("x" in our case here) and compares the
296result to 0.0 with the "<tt><a href="/LangRef.html#i_fcmp">fcmp</a> one</tt>"
297instruction ('one' is "Ordered and Not Equal").  Based on the result of this
298expression, the code jumps to either the "then" or "else" blocks, which contain
299the expressions for the true/false cases.</p>
300
301<p>Once the then/else blocks are finished executing, they both branch back to the
302'ifcont' block to execute the code that happens after the if/then/else.  In this
303case the only thing left to do is to return to the caller of the function.  The
304question then becomes: how does the code know which expression to return?</p>
305
306<p>The answer to this question involves an important SSA operation: the
307<a href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Phi
308operation</a>.  If you're not familiar with SSA, <a 
309href="http://en.wikipedia.org/wiki/Static_single_assignment_form">the wikipedia
310article</a> is a good introduction and there are various other introductions to
311it available on your favorite search engine.  The short version is that
312"execution" of the Phi operation requires "remembering" which block control came
313from.  The Phi operation takes on the value corresponding to the input control
314block.  In this case, if control comes in from the "then" block, it gets the
315value of "calltmp".  If control comes from the "else" block, it gets the value
316of "calltmp1".</p>
317
318<p>At this point, you are probably starting to think "Oh no! This means my
319simple and elegant front-end will have to start generating SSA form in order to
320use LLVM!".  Fortunately, this is not the case, and we strongly advise
321<em>not</em> implementing an SSA construction algorithm in your front-end
322unless there is an amazingly good reason to do so.  In practice, there are two
323sorts of values that float around in code written for your average imperative
324programming language that might need Phi nodes:</p>
325
326<ol>
327<li>Code that involves user variables: <tt>x = 1; x = x + 1; </tt></li>
328<li>Values that are implicit in the structure of your AST, such as the Phi node
329in this case.</li>
330</ol>
331
332<p>In <a href="LangImpl7.html">Chapter 7</a> of this tutorial ("mutable 
333variables"), we'll talk about #1
334in depth.  For now, just believe me that you don't need SSA construction to
335handle this case.  For #2, you have the choice of using the techniques that we will 
336describe for #1, or you can insert Phi nodes directly, if convenient.  In this 
337case, it is really really easy to generate the Phi node, so we choose to do it
338directly.</p>
339
340<p>Okay, enough of the motivation and overview, lets generate code!</p>
341
342</div>
343
344<!-- ======================================================================= -->
345<h4><a name="ifcodegen">Code Generation for If/Then/Else</a></h4>
346<!-- ======================================================================= -->
347
348<div>
349
350<p>In order to generate code for this, we implement the <tt>Codegen</tt> method
351for <tt>IfExprAST</tt>:</p>
352
353<div class="doc_code">
354<pre>
355Value *IfExprAST::Codegen() {
356  Value *CondV = Cond-&gt;Codegen();
357  if (CondV == 0) return 0;
358  
359  // Convert condition to a bool by comparing equal to 0.0.
360  CondV = Builder.CreateFCmpONE(CondV, 
361                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
362                                "ifcond");
363</pre>
364</div>
365
366<p>This code is straightforward and similar to what we saw before.  We emit the
367expression for the condition, then compare that value to zero to get a truth
368value as a 1-bit (bool) value.</p>
369
370<div class="doc_code">
371<pre>
372  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
373  
374  // Create blocks for the then and else cases.  Insert the 'then' block at the
375  // end of the function.
376  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
377  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
378  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
379
380  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
381</pre>
382</div>
383
384<p>This code creates the basic blocks that are related to the if/then/else
385statement, and correspond directly to the blocks in the example above.  The
386first line gets the current Function object that is being built.  It
387gets this by asking the builder for the current BasicBlock, and asking that
388block for its "parent" (the function it is currently embedded into).</p>
389
390<p>Once it has that, it creates three blocks.  Note that it passes "TheFunction"
391into the constructor for the "then" block.  This causes the constructor to
392automatically insert the new block into the end of the specified function.  The
393other two blocks are created, but aren't yet inserted into the function.</p>
394
395<p>Once the blocks are created, we can emit the conditional branch that chooses
396between them.  Note that creating new blocks does not implicitly affect the
397IRBuilder, so it is still inserting into the block that the condition
398went into.  Also note that it is creating a branch to the "then" block and the
399"else" block, even though the "else" block isn't inserted into the function yet.
400This is all ok: it is the standard way that LLVM supports forward 
401references.</p>
402
403<div class="doc_code">
404<pre>
405  // Emit then value.
406  Builder.SetInsertPoint(ThenBB);
407  
408  Value *ThenV = Then-&gt;Codegen();
409  if (ThenV == 0) return 0;
410  
411  Builder.CreateBr(MergeBB);
412  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
413  ThenBB = Builder.GetInsertBlock();
414</pre>
415</div>
416
417<p>After the conditional branch is inserted, we move the builder to start
418inserting into the "then" block.  Strictly speaking, this call moves the
419insertion point to be at the end of the specified block.  However, since the
420"then" block is empty, it also starts out by inserting at the beginning of the
421block.  :)</p>
422
423<p>Once the insertion point is set, we recursively codegen the "then" expression
424from the AST.  To finish off the "then" block, we create an unconditional branch
425to the merge block.  One interesting (and very important) aspect of the LLVM IR
426is that it <a href="/LangRef.html#functionstructure">requires all basic blocks
427to be "terminated"</a> with a <a href="/LangRef.html#terminators">control flow
428instruction</a> such as return or branch.  This means that all control flow,
429<em>including fall throughs</em> must be made explicit in the LLVM IR.  If you
430violate this rule, the verifier will emit an error.</p>
431
432<p>The final line here is quite subtle, but is very important.  The basic issue
433is that when we create the Phi node in the merge block, we need to set up the
434block/value pairs that indicate how the Phi will work.  Importantly, the Phi
435node expects to have an entry for each predecessor of the block in the CFG.  Why
436then, are we getting the current block when we just set it to ThenBB 5 lines
437above?  The problem is that the "Then" expression may actually itself change the
438block that the Builder is emitting into if, for example, it contains a nested
439"if/then/else" expression.  Because calling Codegen recursively could
440arbitrarily change the notion of the current block, we are required to get an
441up-to-date value for code that will set up the Phi node.</p>
442
443<div class="doc_code">
444<pre>
445  // Emit else block.
446  TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
447  Builder.SetInsertPoint(ElseBB);
448  
449  Value *ElseV = Else-&gt;Codegen();
450  if (ElseV == 0) return 0;
451  
452  Builder.CreateBr(MergeBB);
453  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
454  ElseBB = Builder.GetInsertBlock();
455</pre>
456</div>
457
458<p>Code generation for the 'else' block is basically identical to codegen for
459the 'then' block.  The only significant difference is the first line, which adds
460the 'else' block to the function.  Recall previously that the 'else' block was
461created, but not added to the function.  Now that the 'then' and 'else' blocks
462are emitted, we can finish up with the merge code:</p>
463
464<div class="doc_code">
465<pre>
466  // Emit merge block.
467  TheFunction->getBasicBlockList().push_back(MergeBB);
468  Builder.SetInsertPoint(MergeBB);
469  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
470                                  "iftmp");
471  
472  PN->addIncoming(ThenV, ThenBB);
473  PN->addIncoming(ElseV, ElseBB);
474  return PN;
475}
476</pre>
477</div>
478
479<p>The first two lines here are now familiar: the first adds the "merge" block
480to the Function object (it was previously floating, like the else block above).
481The second block changes the insertion point so that newly created code will go
482into the "merge" block.  Once that is done, we need to create the PHI node and
483set up the block/value pairs for the PHI.</p>
484
485<p>Finally, the CodeGen function returns the phi node as the value computed by
486the if/then/else expression.  In our example above, this returned value will 
487feed into the code for the top-level function, which will create the return
488instruction.</p>
489
490<p>Overall, we now have the ability to execute conditional code in
491Kaleidoscope.  With this extension, Kaleidoscope is a fairly complete language
492that can calculate a wide variety of numeric functions.  Next up we'll add
493another useful expression that is familiar from non-functional languages...</p>
494
495</div>
496
497</div>
498
499<!-- *********************************************************************** -->
500<h2><a name="for">'for' Loop Expression</a></h2>
501<!-- *********************************************************************** -->
502
503<div>
504
505<p>Now that we know how to add basic control flow constructs to the language,
506we have the tools to add more powerful things.  Lets add something more
507aggressive, a 'for' expression:</p>
508
509<div class="doc_code">
510<pre>
511 extern putchard(char)
512 def printstar(n)
513   for i = 1, i &lt; n, 1.0 in
514     putchard(42);  # ascii 42 = '*'
515     
516 # print 100 '*' characters
517 printstar(100);
518</pre>
519</div>
520
521<p>This expression defines a new variable ("i" in this case) which iterates from
522a starting value, while the condition ("i &lt; n" in this case) is true, 
523incrementing by an optional step value ("1.0" in this case).  If the step value
524is omitted, it defaults to 1.0.  While the loop is true, it executes its 
525body expression.  Because we don't have anything better to return, we'll just
526define the loop as always returning 0.0.  In the future when we have mutable
527variables, it will get more useful.</p>
528
529<p>As before, lets talk about the changes that we need to Kaleidoscope to
530support this.</p>
531
532<!-- ======================================================================= -->
533<h4><a name="forlexer">Lexer Extensions for the 'for' Loop</a></h4>
534<!-- ======================================================================= -->
535
536<div>
537
538<p>The lexer extensions are the same sort of thing as for if/then/else:</p>
539
540<div class="doc_code">
541<pre>
542  ... in enum Token ...
543  // control
544  tok_if = -6, tok_then = -7, tok_else = -8,
545<b>  tok_for = -9, tok_in = -10</b>
546
547  ... in gettok ...
548  if (IdentifierStr == "def") return tok_def;
549  if (IdentifierStr == "extern") return tok_extern;
550  if (IdentifierStr == "if") return tok_if;
551  if (IdentifierStr == "then") return tok_then;
552  if (IdentifierStr == "else") return tok_else;
553  <b>if (IdentifierStr == "for") return tok_for;
554  if (IdentifierStr == "in") return tok_in;</b>
555  return tok_identifier;
556</pre>
557</div>
558
559</div>
560
561<!-- ======================================================================= -->
562<h4><a name="forast">AST Extensions for the 'for' Loop</a></h4>
563<!-- ======================================================================= -->
564
565<div>
566
567<p>The AST node is just as simple.  It basically boils down to capturing
568the variable name and the constituent expressions in the node.</p>
569
570<div class="doc_code">
571<pre>
572/// ForExprAST - Expression class for for/in.
573class ForExprAST : public ExprAST {
574  std::string VarName;
575  ExprAST *Start, *End, *Step, *Body;
576public:
577  ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
578             ExprAST *step, ExprAST *body)
579    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
580  virtual Value *Codegen();
581};
582</pre>
583</div>
584
585</div>
586
587<!-- ======================================================================= -->
588<h4><a name="forparser">Parser Extensions for the 'for' Loop</a></h4>
589<!-- ======================================================================= -->
590
591<div>
592
593<p>The parser code is also fairly standard.  The only interesting thing here is
594handling of the optional step value.  The parser code handles it by checking to
595see if the second comma is present.  If not, it sets the step value to null in
596the AST node:</p>
597
598<div class="doc_code">
599<pre>
600/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
601static ExprAST *ParseForExpr() {
602  getNextToken();  // eat the for.
603
604  if (CurTok != tok_identifier)
605    return Error("expected identifier after for");
606  
607  std::string IdName = IdentifierStr;
608  getNextToken();  // eat identifier.
609  
610  if (CurTok != '=')
611    return Error("expected '=' after for");
612  getNextToken();  // eat '='.
613  
614  
615  ExprAST *Start = ParseExpression();
616  if (Start == 0) return 0;
617  if (CurTok != ',')
618    return Error("expected ',' after for start value");
619  getNextToken();
620  
621  ExprAST *End = ParseExpression();
622  if (End == 0) return 0;
623  
624  // The step value is optional.
625  ExprAST *Step = 0;
626  if (CurTok == ',') {
627    getNextToken();
628    Step = ParseExpression();
629    if (Step == 0) return 0;
630  }
631  
632  if (CurTok != tok_in)
633    return Error("expected 'in' after for");
634  getNextToken();  // eat 'in'.
635  
636  ExprAST *Body = ParseExpression();
637  if (Body == 0) return 0;
638
639  return new ForExprAST(IdName, Start, End, Step, Body);
640}
641</pre>
642</div>
643
644</div>
645
646<!-- ======================================================================= -->
647<h4><a name="forir">LLVM IR for the 'for' Loop</a></h4>
648<!-- ======================================================================= -->
649
650<div>
651
652<p>Now we get to the good part: the LLVM IR we want to generate for this thing.
653With the simple example above, we get this LLVM IR (note that this dump is
654generated with optimizations disabled for clarity):
655</p>
656
657<div class="doc_code">
658<pre>
659declare double @putchard(double)
660
661define double @printstar(double %n) {
662entry:
663  ; initial value = 1.0 (inlined into phi)
664  br label %loop
665
666loop:		; preds = %loop, %entry
667  %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
668  ; body
669  %calltmp = call double @putchard(double 4.200000e+01)
670  ; increment
671  %nextvar = fadd double %i, 1.000000e+00
672
673  ; termination test
674  %cmptmp = fcmp ult double %i, %n
675  %booltmp = uitofp i1 %cmptmp to double
676  %loopcond = fcmp one double %booltmp, 0.000000e+00
677  br i1 %loopcond, label %loop, label %afterloop
678
679afterloop:		; preds = %loop
680  ; loop always returns 0.0
681  ret double 0.000000e+00
682}
683</pre>
684</div>
685
686<p>This loop contains all the same constructs we saw before: a phi node, several
687expressions, and some basic blocks.  Lets see how this fits together.</p>
688
689</div>
690
691<!-- ======================================================================= -->
692<h4><a name="forcodegen">Code Generation for the 'for' Loop</a></h4>
693<!-- ======================================================================= -->
694
695<div>
696
697<p>The first part of Codegen is very simple: we just output the start expression
698for the loop value:</p>
699
700<div class="doc_code">
701<pre>
702Value *ForExprAST::Codegen() {
703  // Emit the start code first, without 'variable' in scope.
704  Value *StartVal = Start-&gt;Codegen();
705  if (StartVal == 0) return 0;
706</pre>
707</div>
708
709<p>With this out of the way, the next step is to set up the LLVM basic block
710for the start of the loop body.  In the case above, the whole loop body is one
711block, but remember that the body code itself could consist of multiple blocks
712(e.g. if it contains an if/then/else or a for/in expression).</p>
713
714<div class="doc_code">
715<pre>
716  // Make the new basic block for the loop header, inserting after current
717  // block.
718  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
719  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
720  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
721  
722  // Insert an explicit fall through from the current block to the LoopBB.
723  Builder.CreateBr(LoopBB);
724</pre>
725</div>
726
727<p>This code is similar to what we saw for if/then/else.  Because we will need
728it to create the Phi node, we remember the block that falls through into the
729loop.  Once we have that, we create the actual block that starts the loop and
730create an unconditional branch for the fall-through between the two blocks.</p>
731  
732<div class="doc_code">
733<pre>
734  // Start insertion in LoopBB.
735  Builder.SetInsertPoint(LoopBB);
736  
737  // Start the PHI node with an entry for Start.
738  PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
739  Variable-&gt;addIncoming(StartVal, PreheaderBB);
740</pre>
741</div>
742
743<p>Now that the "preheader" for the loop is set up, we switch to emitting code
744for the loop body.  To begin with, we move the insertion point and create the
745PHI node for the loop induction variable.  Since we already know the incoming
746value for the starting value, we add it to the Phi node.  Note that the Phi will
747eventually get a second value for the backedge, but we can't set it up yet
748(because it doesn't exist!).</p>
749
750<div class="doc_code">
751<pre>
752  // Within the loop, the variable is defined equal to the PHI node.  If it
753  // shadows an existing variable, we have to restore it, so save it now.
754  Value *OldVal = NamedValues[VarName];
755  NamedValues[VarName] = Variable;
756  
757  // Emit the body of the loop.  This, like any other expr, can change the
758  // current BB.  Note that we ignore the value computed by the body, but don't
759  // allow an error.
760  if (Body-&gt;Codegen() == 0)
761    return 0;
762</pre>
763</div>
764
765<p>Now the code starts to get more interesting.  Our 'for' loop introduces a new
766variable to the symbol table.  This means that our symbol table can now contain
767either function arguments or loop variables.  To handle this, before we codegen
768the body of the loop, we add the loop variable as the current value for its
769name.  Note that it is possible that there is a variable of the same name in the
770outer scope.  It would be easy to make this an error (emit an error and return
771null if there is already an entry for VarName) but we choose to allow shadowing
772of variables.  In order to handle this correctly, we remember the Value that
773we are potentially shadowing in <tt>OldVal</tt> (which will be null if there is
774no shadowed variable).</p>
775
776<p>Once the loop variable is set into the symbol table, the code recursively
777codegen's the body.  This allows the body to use the loop variable: any
778references to it will naturally find it in the symbol table.</p>
779
780<div class="doc_code">
781<pre>
782  // Emit the step value.
783  Value *StepVal;
784  if (Step) {
785    StepVal = Step-&gt;Codegen();
786    if (StepVal == 0) return 0;
787  } else {
788    // If not specified, use 1.0.
789    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
790  }
791  
792  Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
793</pre>
794</div>
795
796<p>Now that the body is emitted, we compute the next value of the iteration
797variable by adding the step value, or 1.0 if it isn't present. '<tt>NextVar</tt>'
798will be the value of the loop variable on the next iteration of the loop.</p>
799
800<div class="doc_code">
801<pre>
802  // Compute the end condition.
803  Value *EndCond = End-&gt;Codegen();
804  if (EndCond == 0) return EndCond;
805  
806  // Convert condition to a bool by comparing equal to 0.0.
807  EndCond = Builder.CreateFCmpONE(EndCond, 
808                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
809                                  "loopcond");
810</pre>
811</div>
812
813<p>Finally, we evaluate the exit value of the loop, to determine whether the
814loop should exit.  This mirrors the condition evaluation for the if/then/else
815statement.</p>
816      
817<div class="doc_code">
818<pre>
819  // Create the "after loop" block and insert it.
820  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
821  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
822  
823  // Insert the conditional branch into the end of LoopEndBB.
824  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
825  
826  // Any new code will be inserted in AfterBB.
827  Builder.SetInsertPoint(AfterBB);
828</pre>
829</div>
830
831<p>With the code for the body of the loop complete, we just need to finish up
832the control flow for it.  This code remembers the end block (for the phi node),
833then creates the block for the loop exit ("afterloop").  Based on the value of
834the exit condition, it creates a conditional branch that chooses between
835executing the loop again and exiting the loop.  Any future code is emitted in
836the "afterloop" block, so it sets the insertion position to it.</p>
837  
838<div class="doc_code">
839<pre>
840  // Add a new entry to the PHI node for the backedge.
841  Variable-&gt;addIncoming(NextVar, LoopEndBB);
842  
843  // Restore the unshadowed variable.
844  if (OldVal)
845    NamedValues[VarName] = OldVal;
846  else
847    NamedValues.erase(VarName);
848  
849  // for expr always returns 0.0.
850  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
851}
852</pre>
853</div>
854
855<p>The final code handles various cleanups: now that we have the "NextVar"
856value, we can add the incoming value to the loop PHI node.  After that, we
857remove the loop variable from the symbol table, so that it isn't in scope after
858the for loop.  Finally, code generation of the for loop always returns 0.0, so
859that is what we return from <tt>ForExprAST::Codegen</tt>.</p>
860
861<p>With this, we conclude the "adding control flow to Kaleidoscope" chapter of
862the tutorial.  In this chapter we added two control flow constructs, and used them to motivate a couple of aspects of the LLVM IR that are important for front-end implementors
863to know.  In the next chapter of our saga, we will get a bit crazier and add
864<a href="LangImpl6.html">user-defined operators</a> to our poor innocent 
865language.</p>
866
867</div>
868
869</div>
870
871<!-- *********************************************************************** -->
872<h2><a name="code">Full Code Listing</a></h2>
873<!-- *********************************************************************** -->
874
875<div>
876
877<p>
878Here is the complete code listing for our running example, enhanced with the
879if/then/else and for expressions..  To build this example, use:
880</p>
881
882<div class="doc_code">
883<pre>
884# Compile
885clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
886# Run
887/toy
888</pre>
889</div>
890
891<p>Here is the code:</p>
892
893<div class="doc_code">
894<pre>
895#include "llvm/DerivedTypes.h"
896#include "llvm/ExecutionEngine/ExecutionEngine.h"
897#include "llvm/ExecutionEngine/JIT.h"
898#include "llvm/IRBuilder.h"
899#include "llvm/LLVMContext.h"
900#include "llvm/Module.h"
901#include "llvm/PassManager.h"
902#include "llvm/Analysis/Verifier.h"
903#include "llvm/Analysis/Passes.h"
904#include "llvm/Target/TargetData.h"
905#include "llvm/Transforms/Scalar.h"
906#include "llvm/Support/TargetSelect.h"
907#include &lt;cstdio&gt;
908#include &lt;string&gt;
909#include &lt;map&gt;
910#include &lt;vector&gt;
911using namespace llvm;
912
913//===----------------------------------------------------------------------===//
914// Lexer
915//===----------------------------------------------------------------------===//
916
917// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
918// of these for known things.
919enum Token {
920  tok_eof = -1,
921
922  // commands
923  tok_def = -2, tok_extern = -3,
924
925  // primary
926  tok_identifier = -4, tok_number = -5,
927  
928  // control
929  tok_if = -6, tok_then = -7, tok_else = -8,
930  tok_for = -9, tok_in = -10
931};
932
933static std::string IdentifierStr;  // Filled in if tok_identifier
934static double NumVal;              // Filled in if tok_number
935
936/// gettok - Return the next token from standard input.
937static int gettok() {
938  static int LastChar = ' ';
939
940  // Skip any whitespace.
941  while (isspace(LastChar))
942    LastChar = getchar();
943
944  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
945    IdentifierStr = LastChar;
946    while (isalnum((LastChar = getchar())))
947      IdentifierStr += LastChar;
948
949    if (IdentifierStr == "def") return tok_def;
950    if (IdentifierStr == "extern") return tok_extern;
951    if (IdentifierStr == "if") return tok_if;
952    if (IdentifierStr == "then") return tok_then;
953    if (IdentifierStr == "else") return tok_else;
954    if (IdentifierStr == "for") return tok_for;
955    if (IdentifierStr == "in") return tok_in;
956    return tok_identifier;
957  }
958
959  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
960    std::string NumStr;
961    do {
962      NumStr += LastChar;
963      LastChar = getchar();
964    } while (isdigit(LastChar) || LastChar == '.');
965
966    NumVal = strtod(NumStr.c_str(), 0);
967    return tok_number;
968  }
969
970  if (LastChar == '#') {
971    // Comment until end of line.
972    do LastChar = getchar();
973    while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
974    
975    if (LastChar != EOF)
976      return gettok();
977  }
978  
979  // Check for end of file.  Don't eat the EOF.
980  if (LastChar == EOF)
981    return tok_eof;
982
983  // Otherwise, just return the character as its ascii value.
984  int ThisChar = LastChar;
985  LastChar = getchar();
986  return ThisChar;
987}
988
989//===----------------------------------------------------------------------===//
990// Abstract Syntax Tree (aka Parse Tree)
991//===----------------------------------------------------------------------===//
992
993/// ExprAST - Base class for all expression nodes.
994class ExprAST {
995public:
996  virtual ~ExprAST() {}
997  virtual Value *Codegen() = 0;
998};
999
1000/// NumberExprAST - Expression class for numeric literals like "1.0".
1001class NumberExprAST : public ExprAST {
1002  double Val;
1003public:
1004  NumberExprAST(double val) : Val(val) {}
1005  virtual Value *Codegen();
1006};
1007
1008/// VariableExprAST - Expression class for referencing a variable, like "a".
1009class VariableExprAST : public ExprAST {
1010  std::string Name;
1011public:
1012  VariableExprAST(const std::string &amp;name) : Name(name) {}
1013  virtual Value *Codegen();
1014};
1015
1016/// BinaryExprAST - Expression class for a binary operator.
1017class BinaryExprAST : public ExprAST {
1018  char Op;
1019  ExprAST *LHS, *RHS;
1020public:
1021  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
1022    : Op(op), LHS(lhs), RHS(rhs) {}
1023  virtual Value *Codegen();
1024};
1025
1026/// CallExprAST - Expression class for function calls.
1027class CallExprAST : public ExprAST {
1028  std::string Callee;
1029  std::vector&lt;ExprAST*&gt; Args;
1030public:
1031  CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1032    : Callee(callee), Args(args) {}
1033  virtual Value *Codegen();
1034};
1035
1036/// IfExprAST - Expression class for if/then/else.
1037class IfExprAST : public ExprAST {
1038  ExprAST *Cond, *Then, *Else;
1039public:
1040  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1041  : Cond(cond), Then(then), Else(_else) {}
1042  virtual Value *Codegen();
1043};
1044
1045/// ForExprAST - Expression class for for/in.
1046class ForExprAST : public ExprAST {
1047  std::string VarName;
1048  ExprAST *Start, *End, *Step, *Body;
1049public:
1050  ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1051             ExprAST *step, ExprAST *body)
1052    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1053  virtual Value *Codegen();
1054};
1055
1056/// PrototypeAST - This class represents the "prototype" for a function,
1057/// which captures its name, and its argument names (thus implicitly the number
1058/// of arguments the function takes).
1059class PrototypeAST {
1060  std::string Name;
1061  std::vector&lt;std::string&gt; Args;
1062public:
1063  PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args)
1064    : Name(name), Args(args) {}
1065  
1066  Function *Codegen();
1067};
1068
1069/// FunctionAST - This class represents a function definition itself.
1070class FunctionAST {
1071  PrototypeAST *Proto;
1072  ExprAST *Body;
1073public:
1074  FunctionAST(PrototypeAST *proto, ExprAST *body)
1075    : Proto(proto), Body(body) {}
1076  
1077  Function *Codegen();
1078};
1079
1080//===----------------------------------------------------------------------===//
1081// Parser
1082//===----------------------------------------------------------------------===//
1083
1084/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
1085/// token the parser is looking at.  getNextToken reads another token from the
1086/// lexer and updates CurTok with its results.
1087static int CurTok;
1088static int getNextToken() {
1089  return CurTok = gettok();
1090}
1091
1092/// BinopPrecedence - This holds the precedence for each binary operator that is
1093/// defined.
1094static std::map&lt;char, int&gt; BinopPrecedence;
1095
1096/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1097static int GetTokPrecedence() {
1098  if (!isascii(CurTok))
1099    return -1;
1100  
1101  // Make sure it's a declared binop.
1102  int TokPrec = BinopPrecedence[CurTok];
1103  if (TokPrec &lt;= 0) return -1;
1104  return TokPrec;
1105}
1106
1107/// Error* - These are little helper functions for error handling.
1108ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1109PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1110FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1111
1112static ExprAST *ParseExpression();
1113
1114/// identifierexpr
1115///   ::= identifier
1116///   ::= identifier '(' expression* ')'
1117static ExprAST *ParseIdentifierExpr() {
1118  std::string IdName = IdentifierStr;
1119  
1120  getNextToken();  // eat identifier.
1121  
1122  if (CurTok != '(') // Simple variable ref.
1123    return new VariableExprAST(IdName);
1124  
1125  // Call.
1126  getNextToken();  // eat (
1127  std::vector&lt;ExprAST*&gt; Args;
1128  if (CurTok != ')') {
1129    while (1) {
1130      ExprAST *Arg = ParseExpression();
1131      if (!Arg) return 0;
1132      Args.push_back(Arg);
1133
1134      if (CurTok == ')') break;
1135
1136      if (CurTok != ',')
1137        return Error("Expected ')' or ',' in argument list");
1138      getNextToken();
1139    }
1140  }
1141
1142  // Eat the ')'.
1143  getNextToken();
1144  
1145  return new CallExprAST(IdName, Args);
1146}
1147
1148/// numberexpr ::= number
1149static ExprAST *ParseNumberExpr() {
1150  ExprAST *Result = new NumberExprAST(NumVal);
1151  getNextToken(); // consume the number
1152  return Result;
1153}
1154
1155/// parenexpr ::= '(' expression ')'
1156static ExprAST *ParseParenExpr() {
1157  getNextToken();  // eat (.
1158  ExprAST *V = ParseExpression();
1159  if (!V) return 0;
1160  
1161  if (CurTok != ')')
1162    return Error("expected ')'");
1163  getNextToken();  // eat ).
1164  return V;
1165}
1166
1167/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1168static ExprAST *ParseIfExpr() {
1169  getNextToken();  // eat the if.
1170  
1171  // condition.
1172  ExprAST *Cond = ParseExpression();
1173  if (!Cond) return 0;
1174  
1175  if (CurTok != tok_then)
1176    return Error("expected then");
1177  getNextToken();  // eat the then
1178  
1179  ExprAST *Then = ParseExpression();
1180  if (Then == 0) return 0;
1181  
1182  if (CurTok != tok_else)
1183    return Error("expected else");
1184  
1185  getNextToken();
1186  
1187  ExprAST *Else = ParseExpression();
1188  if (!Else) return 0;
1189  
1190  return new IfExprAST(Cond, Then, Else);
1191}
1192
1193/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1194static ExprAST *ParseForExpr() {
1195  getNextToken();  // eat the for.
1196
1197  if (CurTok != tok_identifier)
1198    return Error("expected identifier after for");
1199  
1200  std::string IdName = IdentifierStr;
1201  getNextToken();  // eat identifier.
1202  
1203  if (CurTok != '=')
1204    return Error("expected '=' after for");
1205  getNextToken();  // eat '='.
1206  
1207  
1208  ExprAST *Start = ParseExpression();
1209  if (Start == 0) return 0;
1210  if (CurTok != ',')
1211    return Error("expected ',' after for start value");
1212  getNextToken();
1213  
1214  ExprAST *End = ParseExpression();
1215  if (End == 0) return 0;
1216  
1217  // The step value is optional.
1218  ExprAST *Step = 0;
1219  if (CurTok == ',') {
1220    getNextToken();
1221    Step = ParseExpression();
1222    if (Step == 0) return 0;
1223  }
1224  
1225  if (CurTok != tok_in)
1226    return Error("expected 'in' after for");
1227  getNextToken();  // eat 'in'.
1228  
1229  ExprAST *Body = ParseExpression();
1230  if (Body == 0) return 0;
1231
1232  return new ForExprAST(IdName, Start, End, Step, Body);
1233}
1234
1235/// primary
1236///   ::= identifierexpr
1237///   ::= numberexpr
1238///   ::= parenexpr
1239///   ::= ifexpr
1240///   ::= forexpr
1241static ExprAST *ParsePrimary() {
1242  switch (CurTok) {
1243  default: return Error("unknown token when expecting an expression");
1244  case tok_identifier: return ParseIdentifierExpr();
1245  case tok_number:     return ParseNumberExpr();
1246  case '(':            return ParseParenExpr();
1247  case tok_if:         return ParseIfExpr();
1248  case tok_for:        return ParseForExpr();
1249  }
1250}
1251
1252/// binoprhs
1253///   ::= ('+' primary)*
1254static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1255  // If this is a binop, find its precedence.
1256  while (1) {
1257    int TokPrec = GetTokPrecedence();
1258    
1259    // If this is a binop that binds at least as tightly as the current binop,
1260    // consume it, otherwise we are done.
1261    if (TokPrec &lt; ExprPrec)
1262      return LHS;
1263    
1264    // Okay, we know this is a binop.
1265    int BinOp = CurTok;
1266    getNextToken();  // eat binop
1267    
1268    // Parse the primary expression after the binary operator.
1269    ExprAST *RHS = ParsePrimary();
1270    if (!RHS) return 0;
1271    
1272    // If BinOp binds less tightly with RHS than the operator after RHS, let
1273    // the pending operator take RHS as its LHS.
1274    int NextPrec = GetTokPrecedence();
1275    if (TokPrec &lt; NextPrec) {
1276      RHS = ParseBinOpRHS(TokPrec+1, RHS);
1277      if (RHS == 0) return 0;
1278    }
1279    
1280    // Merge LHS/RHS.
1281    LHS = new BinaryExprAST(BinOp, LHS, RHS);
1282  }
1283}
1284
1285/// expression
1286///   ::= primary binoprhs
1287///
1288static ExprAST *ParseExpression() {
1289  ExprAST *LHS = ParsePrimary();
1290  if (!LHS) return 0;
1291  
1292  return ParseBinOpRHS(0, LHS);
1293}
1294
1295/// prototype
1296///   ::= id '(' id* ')'
1297static PrototypeAST *ParsePrototype() {
1298  if (CurTok != tok_identifier)
1299    return ErrorP("Expected function name in prototype");
1300
1301  std::string FnName = IdentifierStr;
1302  getNextToken();
1303  
1304  if (CurTok != '(')
1305    return ErrorP("Expected '(' in prototype");
1306  
1307  std::vector&lt;std::string&gt; ArgNames;
1308  while (getNextToken() == tok_identifier)
1309    ArgNames.push_back(IdentifierStr);
1310  if (CurTok != ')')
1311    return ErrorP("Expected ')' in prototype");
1312  
1313  // success.
1314  getNextToken();  // eat ')'.
1315  
1316  return new PrototypeAST(FnName, ArgNames);
1317}
1318
1319/// definition ::= 'def' prototype expression
1320static FunctionAST *ParseDefinition() {
1321  getNextToken();  // eat def.
1322  PrototypeAST *Proto = ParsePrototype();
1323  if (Proto == 0) return 0;
1324
1325  if (ExprAST *E = ParseExpression())
1326    return new FunctionAST(Proto, E);
1327  return 0;
1328}
1329
1330/// toplevelexpr ::= expression
1331static FunctionAST *ParseTopLevelExpr() {
1332  if (ExprAST *E = ParseExpression()) {
1333    // Make an anonymous proto.
1334    PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1335    return new FunctionAST(Proto, E);
1336  }
1337  return 0;
1338}
1339
1340/// external ::= 'extern' prototype
1341static PrototypeAST *ParseExtern() {
1342  getNextToken();  // eat extern.
1343  return ParsePrototype();
1344}
1345
1346//===----------------------------------------------------------------------===//
1347// Code Generation
1348//===----------------------------------------------------------------------===//
1349
1350static Module *TheModule;
1351static IRBuilder&lt;&gt; Builder(getGlobalContext());
1352static std::map&lt;std::string, Value*&gt; NamedValues;
1353static FunctionPassManager *TheFPM;
1354
1355Value *ErrorV(const char *Str) { Error(Str); return 0; }
1356
1357Value *NumberExprAST::Codegen() {
1358  return ConstantFP::get(getGlobalContext(), APFloat(Val));
1359}
1360
1361Value *VariableExprAST::Codegen() {
1362  // Look this variable up in the function.
1363  Value *V = NamedValues[Name];
1364  return V ? V : ErrorV("Unknown variable name");
1365}
1366
1367Value *BinaryExprAST::Codegen() {
1368  Value *L = LHS-&gt;Codegen();
1369  Value *R = RHS-&gt;Codegen();
1370  if (L == 0 || R == 0) return 0;
1371  
1372  switch (Op) {
1373  case '+': return Builder.CreateFAdd(L, R, "addtmp");
1374  case '-': return Builder.CreateFSub(L, R, "subtmp");
1375  case '*': return Builder.CreateFMul(L, R, "multmp");
1376  case '&lt;':
1377    L = Builder.CreateFCmpULT(L, R, "cmptmp");
1378    // Convert bool 0/1 to double 0.0 or 1.0
1379    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
1380                                "booltmp");
1381  default: return ErrorV("invalid binary operator");
1382  }
1383}
1384
1385Value *CallExprAST::Codegen() {
1386  // Look up the name in the global module table.
1387  Function *CalleeF = TheModule-&gt;getFunction(Callee);
1388  if (CalleeF == 0)
1389    return ErrorV("Unknown function referenced");
1390  
1391  // If argument mismatch error.
1392  if (CalleeF-&gt;arg_size() != Args.size())
1393    return ErrorV("Incorrect # arguments passed");
1394
1395  std::vector&lt;Value*&gt; ArgsV;
1396  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1397    ArgsV.push_back(Args[i]-&gt;Codegen());
1398    if (ArgsV.back() == 0) return 0;
1399  }
1400  
1401  return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
1402}
1403
1404Value *IfExprAST::Codegen() {
1405  Value *CondV = Cond-&gt;Codegen();
1406  if (CondV == 0) return 0;
1407  
1408  // Convert condition to a bool by comparing equal to 0.0.
1409  CondV = Builder.CreateFCmpONE(CondV, 
1410                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1411                                "ifcond");
1412  
1413  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1414  
1415  // Create blocks for the then and else cases.  Insert the 'then' block at the
1416  // end of the function.
1417  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
1418  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
1419  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
1420  
1421  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1422  
1423  // Emit then value.
1424  Builder.SetInsertPoint(ThenBB);
1425  
1426  Value *ThenV = Then-&gt;Codegen();
1427  if (ThenV == 0) return 0;
1428  
1429  Builder.CreateBr(MergeBB);
1430  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1431  ThenBB = Builder.GetInsertBlock();
1432  
1433  // Emit else block.
1434  TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1435  Builder.SetInsertPoint(ElseBB);
1436  
1437  Value *ElseV = Else-&gt;Codegen();
1438  if (ElseV == 0) return 0;
1439  
1440  Builder.CreateBr(MergeBB);
1441  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1442  ElseBB = Builder.GetInsertBlock();
1443  
1444  // Emit merge block.
1445  TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1446  Builder.SetInsertPoint(MergeBB);
1447  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
1448                                  "iftmp");
1449  
1450  PN-&gt;addIncoming(ThenV, ThenBB);
1451  PN-&gt;addIncoming(ElseV, ElseBB);
1452  return PN;
1453}
1454
1455Value *ForExprAST::Codegen() {
1456  // Output this as:
1457  //   ...
1458  //   start = startexpr
1459  //   goto loop
1460  // loop: 
1461  //   variable = phi [start, loopheader], [nextvariable, loopend]
1462  //   ...
1463  //   bodyexpr
1464  //   ...
1465  // loopend:
1466  //   step = stepexpr
1467  //   nextvariable = variable + step
1468  //   endcond = endexpr
1469  //   br endcond, loop, endloop
1470  // outloop:
1471  
1472  // Emit the start code first, without 'variable' in scope.
1473  Value *StartVal = Start-&gt;Codegen();
1474  if (StartVal == 0) return 0;
1475  
1476  // Make the new basic block for the loop header, inserting after current
1477  // block.
1478  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1479  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1480  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
1481  
1482  // Insert an explicit fall through from the current block to the LoopBB.
1483  Builder.CreateBr(LoopBB);
1484
1485  // Start insertion in LoopBB.
1486  Builder.SetInsertPoint(LoopBB);
1487  
1488  // Start the PHI node with an entry for Start.
1489  PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
1490  Variable-&gt;addIncoming(StartVal, PreheaderBB);
1491  
1492  // Within the loop, the variable is defined equal to the PHI node.  If it
1493  // shadows an existing variable, we have to restore it, so save it now.
1494  Value *OldVal = NamedValues[VarName];
1495  NamedValues[VarName] = Variable;
1496  
1497  // Emit the body of the loop.  This, like any other expr, can change the
1498  // current BB.  Note that we ignore the value computed by the body, but don't
1499  // allow an error.
1500  if (Body-&gt;Codegen() == 0)
1501    return 0;
1502  
1503  // Emit the step value.
1504  Value *StepVal;
1505  if (Step) {
1506    StepVal = Step-&gt;Codegen();
1507    if (StepVal == 0) return 0;
1508  } else {
1509    // If not specified, use 1.0.
1510    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
1511  }
1512  
1513  Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
1514
1515  // Compute the end condition.
1516  Value *EndCond = End-&gt;Codegen();
1517  if (EndCond == 0) return EndCond;
1518  
1519  // Convert condition to a bool by comparing equal to 0.0.
1520  EndCond = Builder.CreateFCmpONE(EndCond, 
1521                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1522                                  "loopcond");
1523  
1524  // Create the "after loop" block and insert it.
1525  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1526  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
1527  
1528  // Insert the conditional branch into the end of LoopEndBB.
1529  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1530  
1531  // Any new code will be inserted in AfterBB.
1532  Builder.SetInsertPoint(AfterBB);
1533  
1534  // Add a new entry to the PHI node for the backedge.
1535  Variable-&gt;addIncoming(NextVar, LoopEndBB);
1536  
1537  // Restore the unshadowed variable.
1538  if (OldVal)
1539    NamedValues[VarName] = OldVal;
1540  else
1541    NamedValues.erase(VarName);
1542
1543  
1544  // for expr always returns 0.0.
1545  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
1546}
1547
1548Function *PrototypeAST::Codegen() {
1549  // Make the function type:  double(double,double) etc.
1550  std::vector&lt;Type*&gt; Doubles(Args.size(),
1551                             Type::getDoubleTy(getGlobalContext()));
1552  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
1553                                       Doubles, false);
1554  
1555  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
1556  
1557  // If F conflicted, there was already something named 'Name'.  If it has a
1558  // body, don't allow redefinition or reextern.
1559  if (F-&gt;getName() != Name) {
1560    // Delete the one we just made and get the existing one.
1561    F-&gt;eraseFromParent();
1562    F = TheModule-&gt;getFunction(Name);
1563    
1564    // If F already has a body, reject this.
1565    if (!F-&gt;empty()) {
1566      ErrorF("redefinition of function");
1567      return 0;
1568    }
1569    
1570    // If F took a different number of args, reject.
1571    if (F-&gt;arg_size() != Args.size()) {
1572      ErrorF("redefinition of function with different # args");
1573      return 0;
1574    }
1575  }
1576  
1577  // Set names for all arguments.
1578  unsigned Idx = 0;
1579  for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1580       ++AI, ++Idx) {
1581    AI-&gt;setName(Args[Idx]);
1582    
1583    // Add arguments to variable symbol table.
1584    NamedValues[Args[Idx]] = AI;
1585  }
1586  
1587  return F;
1588}
1589
1590Function *FunctionAST::Codegen() {
1591  NamedValues.clear();
1592  
1593  Function *TheFunction = Proto-&gt;Codegen();
1594  if (TheFunction == 0)
1595    return 0;
1596  
1597  // Create a new basic block to start insertion into.
1598  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
1599  Builder.SetInsertPoint(BB);
1600  
1601  if (Value *RetVal = Body-&gt;Codegen()) {
1602    // Finish off the function.
1603    Builder.CreateRet(RetVal);
1604
1605    // Validate the generated code, checking for consistency.
1606    verifyFunction(*TheFunction);
1607
1608    // Optimize the function.
1609    TheFPM-&gt;run(*TheFunction);
1610    
1611    return TheFunction;
1612  }
1613  
1614  // Error reading body, remove function.
1615  TheFunction-&gt;eraseFromParent();
1616  return 0;
1617}
1618
1619//===----------------------------------------------------------------------===//
1620// Top-Level parsing and JIT Driver
1621//===----------------------------------------------------------------------===//
1622
1623static ExecutionEngine *TheExecutionEngine;
1624
1625static void HandleDefinition() {
1626  if (FunctionAST *F = ParseDefinition()) {
1627    if (Function *LF = F-&gt;Codegen()) {
1628      fprintf(stderr, "Read function definition:");
1629      LF-&gt;dump();
1630    }
1631  } else {
1632    // Skip token for error recovery.
1633    getNextToken();
1634  }
1635}
1636
1637static void HandleExtern() {
1638  if (PrototypeAST *P = ParseExtern()) {
1639    if (Function *F = P-&gt;Codegen()) {
1640      fprintf(stderr, "Read extern: ");
1641      F-&gt;dump();
1642    }
1643  } else {
1644    // Skip token for error recovery.
1645    getNextToken();
1646  }
1647}
1648
1649static void HandleTopLevelExpression() {
1650  // Evaluate a top-level expression into an anonymous function.
1651  if (FunctionAST *F = ParseTopLevelExpr()) {
1652    if (Function *LF = F-&gt;Codegen()) {
1653      // JIT the function, returning a function pointer.
1654      void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1655      
1656      // Cast it to the right type (takes no arguments, returns a double) so we
1657      // can call it as a native function.
1658      double (*FP)() = (double (*)())(intptr_t)FPtr;
1659      fprintf(stderr, "Evaluated to %f\n", FP());
1660    }
1661  } else {
1662    // Skip token for error recovery.
1663    getNextToken();
1664  }
1665}
1666
1667/// top ::= definition | external | expression | ';'
1668static void MainLoop() {
1669  while (1) {
1670    fprintf(stderr, "ready&gt; ");
1671    switch (CurTok) {
1672    case tok_eof:    return;
1673    case ';':        getNextToken(); break;  // ignore top-level semicolons.
1674    case tok_def:    HandleDefinition(); break;
1675    case tok_extern: HandleExtern(); break;
1676    default:         HandleTopLevelExpression(); break;
1677    }
1678  }
1679}
1680
1681//===----------------------------------------------------------------------===//
1682// "Library" functions that can be "extern'd" from user code.
1683//===----------------------------------------------------------------------===//
1684
1685/// putchard - putchar that takes a double and returns 0.
1686extern "C" 
1687double putchard(double X) {
1688  putchar((char)X);
1689  return 0;
1690}
1691
1692//===----------------------------------------------------------------------===//
1693// Main driver code.
1694//===----------------------------------------------------------------------===//
1695
1696int main() {
1697  InitializeNativeTarget();
1698  LLVMContext &amp;Context = getGlobalContext();
1699
1700  // Install standard binary operators.
1701  // 1 is lowest precedence.
1702  BinopPrecedence['&lt;'] = 10;
1703  BinopPrecedence['+'] = 20;
1704  BinopPrecedence['-'] = 20;
1705  BinopPrecedence['*'] = 40;  // highest.
1706
1707  // Prime the first token.
1708  fprintf(stderr, "ready&gt; ");
1709  getNextToken();
1710
1711  // Make the module, which holds all the code.
1712  TheModule = new Module("my cool jit", Context);
1713
1714  // Create the JIT.  This takes ownership of the module.
1715  std::string ErrStr;
1716  TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&amp;ErrStr).create();
1717  if (!TheExecutionEngine) {
1718    fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
1719    exit(1);
1720  }
1721
1722  FunctionPassManager OurFPM(TheModule);
1723
1724  // Set up the optimizer pipeline.  Start with registering info about how the
1725  // target lays out data structures.
1726  OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1727  // Provide basic AliasAnalysis support for GVN.
1728  OurFPM.add(createBasicAliasAnalysisPass());
1729  // Do simple "peephole" optimizations and bit-twiddling optzns.
1730  OurFPM.add(createInstructionCombiningPass());
1731  // Reassociate expressions.
1732  OurFPM.add(createReassociatePass());
1733  // Eliminate Common SubExpressions.
1734  OurFPM.add(createGVNPass());
1735  // Simplify the control flow graph (deleting unreachable blocks, etc).
1736  OurFPM.add(createCFGSimplificationPass());
1737
1738  OurFPM.doInitialization();
1739
1740  // Set the global so the code gen can use this.
1741  TheFPM = &amp;OurFPM;
1742
1743  // Run the main "interpreter loop" now.
1744  MainLoop();
1745
1746  TheFPM = 0;
1747
1748  // Print out all of the generated code.
1749  TheModule-&gt;dump();
1750
1751  return 0;
1752}
1753</pre>
1754</div>
1755
1756<a href="LangImpl6.html">Next: Extending the language: user-defined operators</a>
1757</div>
1758
1759<!-- *********************************************************************** -->
1760<hr>
1761<address>
1762  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1763  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1764  <a href="http://validator.w3.org/check/referer"><img
1765  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1766
1767  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1768  <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
1769  Last modified: $Date$
1770</address>
1771</body>
1772</html>
1773