1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2                      "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6  <title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
7         construction</title>
8  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
9  <meta name="author" content="Chris Lattner">
10  <link rel="stylesheet" href="/_static/llvm.css" type="text/css">
11</head>
12
13<body>
14
15<h1>Kaleidoscope: Extending the Language: Mutable Variables</h1>
16
17<ul>
18<li><a href="index.html">Up to Tutorial Index</a></li>
19<li>Chapter 7
20  <ol>
21    <li><a href="#intro">Chapter 7 Introduction</a></li>
22    <li><a href="#why">Why is this a hard problem?</a></li>
23    <li><a href="#memory">Memory in LLVM</a></li>
24    <li><a href="#kalvars">Mutable Variables in Kaleidoscope</a></li>
25    <li><a href="#adjustments">Adjusting Existing Variables for
26     Mutation</a></li>
27    <li><a href="#assignment">New Assignment Operator</a></li>
28    <li><a href="#localvars">User-defined Local Variables</a></li>
29    <li><a href="#code">Full Code Listing</a></li>
30  </ol>
31</li>
32<li><a href="LangImpl8.html">Chapter 8</a>: Conclusion and other useful LLVM
33 tidbits</li>
34</ul>
35
36<div class="doc_author">
37  <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
38</div>
39
40<!-- *********************************************************************** -->
41<h2><a name="intro">Chapter 7 Introduction</a></h2>
42<!-- *********************************************************************** -->
43
44<div>
45
46<p>Welcome to Chapter 7 of the "<a href="index.html">Implementing a language
47with LLVM</a>" tutorial.  In chapters 1 through 6, we've built a very
48respectable, albeit simple, <a 
49href="http://en.wikipedia.org/wiki/Functional_programming">functional
50programming language</a>.  In our journey, we learned some parsing techniques,
51how to build and represent an AST, how to build LLVM IR, and how to optimize
52the resultant code as well as JIT compile it.</p>
53
54<p>While Kaleidoscope is interesting as a functional language, the fact that it
55is functional makes it "too easy" to generate LLVM IR for it.  In particular, a 
56functional language makes it very easy to build LLVM IR directly in <a 
57href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
58Since LLVM requires that the input code be in SSA form, this is a very nice
59property and it is often unclear to newcomers how to generate code for an
60imperative language with mutable variables.</p>
61
62<p>The short (and happy) summary of this chapter is that there is no need for
63your front-end to build SSA form: LLVM provides highly tuned and well tested
64support for this, though the way it works is a bit unexpected for some.</p>
65
66</div>
67
68<!-- *********************************************************************** -->
69<h2><a name="why">Why is this a hard problem?</a></h2>
70<!-- *********************************************************************** -->
71
72<div>
73
74<p>
75To understand why mutable variables cause complexities in SSA construction, 
76consider this extremely simple C example:
77</p>
78
79<div class="doc_code">
80<pre>
81int G, H;
82int test(_Bool Condition) {
83  int X;
84  if (Condition)
85    X = G;
86  else
87    X = H;
88  return X;
89}
90</pre>
91</div>
92
93<p>In this case, we have the variable "X", whose value depends on the path 
94executed in the program.  Because there are two different possible values for X
95before the return instruction, a PHI node is inserted to merge the two values.
96The LLVM IR that we want for this example looks like this:</p>
97
98<div class="doc_code">
99<pre>
100@G = weak global i32 0   ; type of @G is i32*
101@H = weak global i32 0   ; type of @H is i32*
102
103define i32 @test(i1 %Condition) {
104entry:
105  br i1 %Condition, label %cond_true, label %cond_false
106
107cond_true:
108  %X.0 = load i32* @G
109  br label %cond_next
110
111cond_false:
112  %X.1 = load i32* @H
113  br label %cond_next
114
115cond_next:
116  %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
117  ret i32 %X.2
118}
119</pre>
120</div>
121
122<p>In this example, the loads from the G and H global variables are explicit in
123the LLVM IR, and they live in the then/else branches of the if statement
124(cond_true/cond_false).  In order to merge the incoming values, the X.2 phi node
125in the cond_next block selects the right value to use based on where control 
126flow is coming from: if control flow comes from the cond_false block, X.2 gets
127the value of X.1.  Alternatively, if control flow comes from cond_true, it gets
128the value of X.0.  The intent of this chapter is not to explain the details of
129SSA form.  For more information, see one of the many <a 
130href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online 
131references</a>.</p>
132
133<p>The question for this article is "who places the phi nodes when lowering 
134assignments to mutable variables?".  The issue here is that LLVM 
135<em>requires</em> that its IR be in SSA form: there is no "non-ssa" mode for it.
136However, SSA construction requires non-trivial algorithms and data structures,
137so it is inconvenient and wasteful for every front-end to have to reproduce this
138logic.</p>
139
140</div>
141
142<!-- *********************************************************************** -->
143<h2><a name="memory">Memory in LLVM</a></h2>
144<!-- *********************************************************************** -->
145
146<div>
147
148<p>The 'trick' here is that while LLVM does require all register values to be
149in SSA form, it does not require (or permit) memory objects to be in SSA form.
150In the example above, note that the loads from G and H are direct accesses to
151G and H: they are not renamed or versioned.  This differs from some other
152compiler systems, which do try to version memory objects.  In LLVM, instead of
153encoding dataflow analysis of memory into the LLVM IR, it is handled with <a 
154href="/WritingAnLLVMPass.html">Analysis Passes</a> which are computed on
155demand.</p>
156
157<p>
158With this in mind, the high-level idea is that we want to make a stack variable
159(which lives in memory, because it is on the stack) for each mutable object in
160a function.  To take advantage of this trick, we need to talk about how LLVM
161represents stack variables.
162</p>
163
164<p>In LLVM, all memory accesses are explicit with load/store instructions, and
165it is carefully designed not to have (or need) an "address-of" operator.  Notice
166how the type of the @G/@H global variables is actually "i32*" even though the 
167variable is defined as "i32".  What this means is that @G defines <em>space</em>
168for an i32 in the global data area, but its <em>name</em> actually refers to the
169address for that space.  Stack variables work the same way, except that instead of 
170being declared with global variable definitions, they are declared with the 
171<a href="/LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
172
173<div class="doc_code">
174<pre>
175define i32 @example() {
176entry:
177  %X = alloca i32           ; type of %X is i32*.
178  ...
179  %tmp = load i32* %X       ; load the stack value %X from the stack.
180  %tmp2 = add i32 %tmp, 1   ; increment it
181  store i32 %tmp2, i32* %X  ; store it back
182  ...
183</pre>
184</div>
185
186<p>This code shows an example of how you can declare and manipulate a stack
187variable in the LLVM IR.  Stack memory allocated with the alloca instruction is
188fully general: you can pass the address of the stack slot to functions, you can
189store it in other variables, etc.  In our example above, we could rewrite the
190example to use the alloca technique to avoid using a PHI node:</p>
191
192<div class="doc_code">
193<pre>
194@G = weak global i32 0   ; type of @G is i32*
195@H = weak global i32 0   ; type of @H is i32*
196
197define i32 @test(i1 %Condition) {
198entry:
199  %X = alloca i32           ; type of %X is i32*.
200  br i1 %Condition, label %cond_true, label %cond_false
201
202cond_true:
203  %X.0 = load i32* @G
204  store i32 %X.0, i32* %X   ; Update X
205  br label %cond_next
206
207cond_false:
208  %X.1 = load i32* @H
209  store i32 %X.1, i32* %X   ; Update X
210  br label %cond_next
211
212cond_next:
213  %X.2 = load i32* %X       ; Read X
214  ret i32 %X.2
215}
216</pre>
217</div>
218
219<p>With this, we have discovered a way to handle arbitrary mutable variables
220without the need to create Phi nodes at all:</p>
221
222<ol>
223<li>Each mutable variable becomes a stack allocation.</li>
224<li>Each read of the variable becomes a load from the stack.</li>
225<li>Each update of the variable becomes a store to the stack.</li>
226<li>Taking the address of a variable just uses the stack address directly.</li>
227</ol>
228
229<p>While this solution has solved our immediate problem, it introduced another
230one: we have now apparently introduced a lot of stack traffic for very simple
231and common operations, a major performance problem.  Fortunately for us, the
232LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles
233this case, promoting allocas like this into SSA registers, inserting Phi nodes
234as appropriate.  If you run this example through the pass, for example, you'll
235get:</p>
236
237<div class="doc_code">
238<pre>
239$ <b>llvm-as &lt; example.ll | opt -mem2reg | llvm-dis</b>
240@G = weak global i32 0
241@H = weak global i32 0
242
243define i32 @test(i1 %Condition) {
244entry:
245  br i1 %Condition, label %cond_true, label %cond_false
246
247cond_true:
248  %X.0 = load i32* @G
249  br label %cond_next
250
251cond_false:
252  %X.1 = load i32* @H
253  br label %cond_next
254
255cond_next:
256  %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
257  ret i32 %X.01
258}
259</pre>
260</div>
261
262<p>The mem2reg pass implements the standard "iterated dominance frontier"
263algorithm for constructing SSA form and has a number of optimizations that speed
264up (very common) degenerate cases. The mem2reg optimization pass is the answer to dealing 
265with mutable variables, and we highly recommend that you depend on it.  Note that
266mem2reg only works on variables in certain circumstances:</p>
267
268<ol>
269<li>mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
270promotes them.  It does not apply to global variables or heap allocations.</li>
271
272<li>mem2reg only looks for alloca instructions in the entry block of the
273function.  Being in the entry block guarantees that the alloca is only executed
274once, which makes analysis simpler.</li>
275
276<li>mem2reg only promotes allocas whose uses are direct loads and stores.  If
277the address of the stack object is passed to a function, or if any funny pointer
278arithmetic is involved, the alloca will not be promoted.</li>
279
280<li>mem2reg only works on allocas of <a 
281href="/LangRef.html#t_classifications">first class</a> 
282values (such as pointers, scalars and vectors), and only if the array size
283of the allocation is 1 (or missing in the .ll file).  mem2reg is not capable of
284promoting structs or arrays to registers.  Note that the "scalarrepl" pass is
285more powerful and can promote structs, "unions", and arrays in many cases.</li>
286
287</ol>
288
289<p>
290All of these properties are easy to satisfy for most imperative languages, and
291we'll illustrate it below with Kaleidoscope.  The final question you may be
292asking is: should I bother with this nonsense for my front-end?  Wouldn't it be
293better if I just did SSA construction directly, avoiding use of the mem2reg
294optimization pass?  In short, we strongly recommend that you use this technique
295for building SSA form, unless there is an extremely good reason not to.  Using
296this technique is:</p>
297
298<ul>
299<li>Proven and well tested: llvm-gcc and clang both use this technique for local
300mutable variables.  As such, the most common clients of LLVM are using this to
301handle a bulk of their variables.  You can be sure that bugs are found fast and
302fixed early.</li>
303
304<li>Extremely Fast: mem2reg has a number of special cases that make it fast in
305common cases as well as fully general.  For example, it has fast-paths for
306variables that are only used in a single block, variables that only have one
307assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
308</li>
309
310<li>Needed for debug info generation: <a href="/SourceLevelDebugging.html">
311Debug information in LLVM</a> relies on having the address of the variable
312exposed so that debug info can be attached to it.  This technique dovetails 
313very naturally with this style of debug info.</li>
314</ul>
315
316<p>If nothing else, this makes it much easier to get your front-end up and 
317running, and is very simple to implement.  Lets extend Kaleidoscope with mutable
318variables now!
319</p>
320
321</div>
322
323<!-- *********************************************************************** -->
324<h2><a name="kalvars">Mutable Variables in Kaleidoscope</a></h2>
325<!-- *********************************************************************** -->
326
327<div>
328
329<p>Now that we know the sort of problem we want to tackle, lets see what this
330looks like in the context of our little Kaleidoscope language.  We're going to
331add two features:</p>
332
333<ol>
334<li>The ability to mutate variables with the '=' operator.</li>
335<li>The ability to define new variables.</li>
336</ol>
337
338<p>While the first item is really what this is about, we only have variables
339for incoming arguments as well as for induction variables, and redefining those only
340goes so far :).  Also, the ability to define new variables is a
341useful thing regardless of whether you will be mutating them.  Here's a
342motivating example that shows how we could use these:</p>
343
344<div class="doc_code">
345<pre>
346# Define ':' for sequencing: as a low-precedence operator that ignores operands
347# and just returns the RHS.
348def binary : 1 (x y) y;
349
350# Recursive fib, we could do this before.
351def fib(x)
352  if (x &lt; 3) then
353    1
354  else
355    fib(x-1)+fib(x-2);
356
357# Iterative fib.
358def fibi(x)
359  <b>var a = 1, b = 1, c in</b>
360  (for i = 3, i &lt; x in 
361     <b>c = a + b</b> :
362     <b>a = b</b> :
363     <b>b = c</b>) :
364  b;
365
366# Call it. 
367fibi(10);
368</pre>
369</div>
370
371<p>
372In order to mutate variables, we have to change our existing variables to use
373the "alloca trick".  Once we have that, we'll add our new operator, then extend
374Kaleidoscope to support new variable definitions.
375</p>
376
377</div>
378
379<!-- *********************************************************************** -->
380<h2><a name="adjustments">Adjusting Existing Variables for Mutation</a></h2>
381<!-- *********************************************************************** -->
382
383<div>
384
385<p>
386The symbol table in Kaleidoscope is managed at code generation time by the 
387'<tt>NamedValues</tt>' map.  This map currently keeps track of the LLVM "Value*"
388that holds the double value for the named variable.  In order to support
389mutation, we need to change this slightly, so that it <tt>NamedValues</tt> holds
390the <em>memory location</em> of the variable in question.  Note that this 
391change is a refactoring: it changes the structure of the code, but does not
392(by itself) change the behavior of the compiler.  All of these changes are 
393isolated in the Kaleidoscope code generator.</p>
394
395<p>
396At this point in Kaleidoscope's development, it only supports variables for two
397things: incoming arguments to functions and the induction variable of 'for'
398loops.  For consistency, we'll allow mutation of these variables in addition to
399other user-defined variables.  This means that these will both need memory
400locations.
401</p>
402
403<p>To start our transformation of Kaleidoscope, we'll change the NamedValues
404map so that it maps to AllocaInst* instead of Value*.  Once we do this, the C++ 
405compiler will tell us what parts of the code we need to update:</p>
406
407<div class="doc_code">
408<pre>
409static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
410</pre>
411</div>
412
413<p>Also, since we will need to create these alloca's, we'll use a helper
414function that ensures that the allocas are created in the entry block of the
415function:</p>
416
417<div class="doc_code">
418<pre>
419/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
420/// the function.  This is used for mutable variables etc.
421static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
422                                          const std::string &amp;VarName) {
423  IRBuilder&lt;&gt; TmpB(&amp;TheFunction-&gt;getEntryBlock(),
424                 TheFunction-&gt;getEntryBlock().begin());
425  return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
426                           VarName.c_str());
427}
428</pre>
429</div>
430
431<p>This funny looking code creates an IRBuilder object that is pointing at
432the first instruction (.begin()) of the entry block.  It then creates an alloca
433with the expected name and returns it.  Because all values in Kaleidoscope are
434doubles, there is no need to pass in a type to use.</p>
435
436<p>With this in place, the first functionality change we want to make is to
437variable references.  In our new scheme, variables live on the stack, so code
438generating a reference to them actually needs to produce a load from the stack
439slot:</p>
440
441<div class="doc_code">
442<pre>
443Value *VariableExprAST::Codegen() {
444  // Look this variable up in the function.
445  Value *V = NamedValues[Name];
446  if (V == 0) return ErrorV("Unknown variable name");
447
448  <b>// Load the value.
449  return Builder.CreateLoad(V, Name.c_str());</b>
450}
451</pre>
452</div>
453
454<p>As you can see, this is pretty straightforward.  Now we need to update the
455things that define the variables to set up the alloca.  We'll start with 
456<tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
457the unabridged code):</p>
458
459<div class="doc_code">
460<pre>
461  Function *TheFunction = Builder.GetInsertBlock()->getParent();
462
463  <b>// Create an alloca for the variable in the entry block.
464  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);</b>
465  
466    // Emit the start code first, without 'variable' in scope.
467  Value *StartVal = Start-&gt;Codegen();
468  if (StartVal == 0) return 0;
469  
470  <b>// Store the value into the alloca.
471  Builder.CreateStore(StartVal, Alloca);</b>
472  ...
473
474  // Compute the end condition.
475  Value *EndCond = End-&gt;Codegen();
476  if (EndCond == 0) return EndCond;
477  
478  <b>// Reload, increment, and restore the alloca.  This handles the case where
479  // the body of the loop mutates the variable.
480  Value *CurVar = Builder.CreateLoad(Alloca);
481  Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
482  Builder.CreateStore(NextVar, Alloca);</b>
483  ...
484</pre>
485</div>
486
487<p>This code is virtually identical to the code <a 
488href="LangImpl5.html#forcodegen">before we allowed mutable variables</a>.  The
489big difference is that we no longer have to construct a PHI node, and we use
490load/store to access the variable as needed.</p>
491
492<p>To support mutable argument variables, we need to also make allocas for them.
493The code for this is also pretty simple:</p>
494
495<div class="doc_code">
496<pre>
497/// CreateArgumentAllocas - Create an alloca for each argument and register the
498/// argument in the symbol table so that references to it will succeed.
499void PrototypeAST::CreateArgumentAllocas(Function *F) {
500  Function::arg_iterator AI = F-&gt;arg_begin();
501  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
502    // Create an alloca for this variable.
503    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
504
505    // Store the initial value into the alloca.
506    Builder.CreateStore(AI, Alloca);
507
508    // Add arguments to variable symbol table.
509    NamedValues[Args[Idx]] = Alloca;
510  }
511}
512</pre>
513</div>
514
515<p>For each argument, we make an alloca, store the input value to the function
516into the alloca, and register the alloca as the memory location for the
517argument.  This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
518it sets up the entry block for the function.</p>
519
520<p>The final missing piece is adding the mem2reg pass, which allows us to get
521good codegen once again:</p>
522
523<div class="doc_code">
524<pre>
525    // Set up the optimizer pipeline.  Start with registering info about how the
526    // target lays out data structures.
527    OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
528    <b>// Promote allocas to registers.
529    OurFPM.add(createPromoteMemoryToRegisterPass());</b>
530    // Do simple "peephole" optimizations and bit-twiddling optzns.
531    OurFPM.add(createInstructionCombiningPass());
532    // Reassociate expressions.
533    OurFPM.add(createReassociatePass());
534</pre>
535</div>
536
537<p>It is interesting to see what the code looks like before and after the
538mem2reg optimization runs.  For example, this is the before/after code for our
539recursive fib function.  Before the optimization:</p>
540
541<div class="doc_code">
542<pre>
543define double @fib(double %x) {
544entry:
545  <b>%x1 = alloca double
546  store double %x, double* %x1
547  %x2 = load double* %x1</b>
548  %cmptmp = fcmp ult double %x2, 3.000000e+00
549  %booltmp = uitofp i1 %cmptmp to double
550  %ifcond = fcmp one double %booltmp, 0.000000e+00
551  br i1 %ifcond, label %then, label %else
552
553then:		; preds = %entry
554  br label %ifcont
555
556else:		; preds = %entry
557  <b>%x3 = load double* %x1</b>
558  %subtmp = fsub double %x3, 1.000000e+00
559  %calltmp = call double @fib(double %subtmp)
560  <b>%x4 = load double* %x1</b>
561  %subtmp5 = fsub double %x4, 2.000000e+00
562  %calltmp6 = call double @fib(double %subtmp5)
563  %addtmp = fadd double %calltmp, %calltmp6
564  br label %ifcont
565
566ifcont:		; preds = %else, %then
567  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
568  ret double %iftmp
569}
570</pre>
571</div>
572
573<p>Here there is only one variable (x, the input argument) but you can still
574see the extremely simple-minded code generation strategy we are using.  In the
575entry block, an alloca is created, and the initial input value is stored into
576it.  Each reference to the variable does a reload from the stack.  Also, note
577that we didn't modify the if/then/else expression, so it still inserts a PHI
578node.  While we could make an alloca for it, it is actually easier to create a 
579PHI node for it, so we still just make the PHI.</p>
580
581<p>Here is the code after the mem2reg pass runs:</p>
582
583<div class="doc_code">
584<pre>
585define double @fib(double %x) {
586entry:
587  %cmptmp = fcmp ult double <b>%x</b>, 3.000000e+00
588  %booltmp = uitofp i1 %cmptmp to double
589  %ifcond = fcmp one double %booltmp, 0.000000e+00
590  br i1 %ifcond, label %then, label %else
591
592then:
593  br label %ifcont
594
595else:
596  %subtmp = fsub double <b>%x</b>, 1.000000e+00
597  %calltmp = call double @fib(double %subtmp)
598  %subtmp5 = fsub double <b>%x</b>, 2.000000e+00
599  %calltmp6 = call double @fib(double %subtmp5)
600  %addtmp = fadd double %calltmp, %calltmp6
601  br label %ifcont
602
603ifcont:		; preds = %else, %then
604  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
605  ret double %iftmp
606}
607</pre>
608</div>
609
610<p>This is a trivial case for mem2reg, since there are no redefinitions of the
611variable.  The point of showing this is to calm your tension about inserting
612such blatent inefficiencies :).</p>
613
614<p>After the rest of the optimizers run, we get:</p>
615
616<div class="doc_code">
617<pre>
618define double @fib(double %x) {
619entry:
620  %cmptmp = fcmp ult double %x, 3.000000e+00
621  %booltmp = uitofp i1 %cmptmp to double
622  %ifcond = fcmp ueq double %booltmp, 0.000000e+00
623  br i1 %ifcond, label %else, label %ifcont
624
625else:
626  %subtmp = fsub double %x, 1.000000e+00
627  %calltmp = call double @fib(double %subtmp)
628  %subtmp5 = fsub double %x, 2.000000e+00
629  %calltmp6 = call double @fib(double %subtmp5)
630  %addtmp = fadd double %calltmp, %calltmp6
631  ret double %addtmp
632
633ifcont:
634  ret double 1.000000e+00
635}
636</pre>
637</div>
638
639<p>Here we see that the simplifycfg pass decided to clone the return instruction
640into the end of the 'else' block.  This allowed it to eliminate some branches
641and the PHI node.</p>
642
643<p>Now that all symbol table references are updated to use stack variables, 
644we'll add the assignment operator.</p>
645
646</div>
647
648<!-- *********************************************************************** -->
649<h2><a name="assignment">New Assignment Operator</a></h2>
650<!-- *********************************************************************** -->
651
652<div>
653
654<p>With our current framework, adding a new assignment operator is really
655simple.  We will parse it just like any other binary operator, but handle it
656internally (instead of allowing the user to define it).  The first step is to
657set a precedence:</p>
658
659<div class="doc_code">
660<pre>
661 int main() {
662   // Install standard binary operators.
663   // 1 is lowest precedence.
664   <b>BinopPrecedence['='] = 2;</b>
665   BinopPrecedence['&lt;'] = 10;
666   BinopPrecedence['+'] = 20;
667   BinopPrecedence['-'] = 20;
668</pre>
669</div>
670
671<p>Now that the parser knows the precedence of the binary operator, it takes
672care of all the parsing and AST generation.  We just need to implement codegen
673for the assignment operator.  This looks like:</p> 
674
675<div class="doc_code">
676<pre>
677Value *BinaryExprAST::Codegen() {
678  // Special case '=' because we don't want to emit the LHS as an expression.
679  if (Op == '=') {
680    // Assignment requires the LHS to be an identifier.
681    VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
682    if (!LHSE)
683      return ErrorV("destination of '=' must be a variable");
684</pre>
685</div>
686
687<p>Unlike the rest of the binary operators, our assignment operator doesn't
688follow the "emit LHS, emit RHS, do computation" model.  As such, it is handled
689as a special case before the other binary operators are handled.  The other 
690strange thing is that it requires the LHS to be a variable.  It is invalid to
691have "(x+1) = expr" - only things like "x = expr" are allowed.
692</p>
693
694<div class="doc_code">
695<pre>
696    // Codegen the RHS.
697    Value *Val = RHS-&gt;Codegen();
698    if (Val == 0) return 0;
699
700    // Look up the name.
701    Value *Variable = NamedValues[LHSE-&gt;getName()];
702    if (Variable == 0) return ErrorV("Unknown variable name");
703
704    Builder.CreateStore(Val, Variable);
705    return Val;
706  }
707  ...  
708</pre>
709</div>
710
711<p>Once we have the variable, codegen'ing the assignment is straightforward:
712we emit the RHS of the assignment, create a store, and return the computed
713value.  Returning a value allows for chained assignments like "X = (Y = Z)".</p>
714
715<p>Now that we have an assignment operator, we can mutate loop variables and
716arguments.  For example, we can now run code like this:</p>
717
718<div class="doc_code">
719<pre>
720# Function to print a double.
721extern printd(x);
722
723# Define ':' for sequencing: as a low-precedence operator that ignores operands
724# and just returns the RHS.
725def binary : 1 (x y) y;
726
727def test(x)
728  printd(x) :
729  x = 4 :
730  printd(x);
731
732test(123);
733</pre>
734</div>
735
736<p>When run, this example prints "123" and then "4", showing that we did
737actually mutate the value!  Okay, we have now officially implemented our goal:
738getting this to work requires SSA construction in the general case.  However,
739to be really useful, we want the ability to define our own local variables, lets
740add this next! 
741</p>
742
743</div>
744
745<!-- *********************************************************************** -->
746<h2><a name="localvars">User-defined Local Variables</a></h2>
747<!-- *********************************************************************** -->
748
749<div>
750
751<p>Adding var/in is just like any other other extensions we made to 
752Kaleidoscope: we extend the lexer, the parser, the AST and the code generator.
753The first step for adding our new 'var/in' construct is to extend the lexer.
754As before, this is pretty trivial, the code looks like this:</p>
755
756<div class="doc_code">
757<pre>
758enum Token {
759  ...
760  <b>// var definition
761  tok_var = -13</b>
762...
763}
764...
765static int gettok() {
766...
767    if (IdentifierStr == "in") return tok_in;
768    if (IdentifierStr == "binary") return tok_binary;
769    if (IdentifierStr == "unary") return tok_unary;
770    <b>if (IdentifierStr == "var") return tok_var;</b>
771    return tok_identifier;
772...
773</pre>
774</div>
775
776<p>The next step is to define the AST node that we will construct.  For var/in,
777it looks like this:</p>
778
779<div class="doc_code">
780<pre>
781/// VarExprAST - Expression class for var/in
782class VarExprAST : public ExprAST {
783  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
784  ExprAST *Body;
785public:
786  VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
787             ExprAST *body)
788  : VarNames(varnames), Body(body) {}
789  
790  virtual Value *Codegen();
791};
792</pre>
793</div>
794
795<p>var/in allows a list of names to be defined all at once, and each name can
796optionally have an initializer value.  As such, we capture this information in
797the VarNames vector.  Also, var/in has a body, this body is allowed to access
798the variables defined by the var/in.</p>
799
800<p>With this in place, we can define the parser pieces.  The first thing we do is add
801it as a primary expression:</p>
802
803<div class="doc_code">
804<pre>
805/// primary
806///   ::= identifierexpr
807///   ::= numberexpr
808///   ::= parenexpr
809///   ::= ifexpr
810///   ::= forexpr
811<b>///   ::= varexpr</b>
812static ExprAST *ParsePrimary() {
813  switch (CurTok) {
814  default: return Error("unknown token when expecting an expression");
815  case tok_identifier: return ParseIdentifierExpr();
816  case tok_number:     return ParseNumberExpr();
817  case '(':            return ParseParenExpr();
818  case tok_if:         return ParseIfExpr();
819  case tok_for:        return ParseForExpr();
820  <b>case tok_var:        return ParseVarExpr();</b>
821  }
822}
823</pre>
824</div>
825
826<p>Next we define ParseVarExpr:</p>
827
828<div class="doc_code">
829<pre>
830/// varexpr ::= 'var' identifier ('=' expression)? 
831//                    (',' identifier ('=' expression)?)* 'in' expression
832static ExprAST *ParseVarExpr() {
833  getNextToken();  // eat the var.
834
835  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
836
837  // At least one variable name is required.
838  if (CurTok != tok_identifier)
839    return Error("expected identifier after var");
840</pre>
841</div>
842
843<p>The first part of this code parses the list of identifier/expr pairs into the
844local <tt>VarNames</tt> vector.  
845
846<div class="doc_code">
847<pre>
848  while (1) {
849    std::string Name = IdentifierStr;
850    getNextToken();  // eat identifier.
851
852    // Read the optional initializer.
853    ExprAST *Init = 0;
854    if (CurTok == '=') {
855      getNextToken(); // eat the '='.
856      
857      Init = ParseExpression();
858      if (Init == 0) return 0;
859    }
860    
861    VarNames.push_back(std::make_pair(Name, Init));
862    
863    // End of var list, exit loop.
864    if (CurTok != ',') break;
865    getNextToken(); // eat the ','.
866    
867    if (CurTok != tok_identifier)
868      return Error("expected identifier list after var");
869  }
870</pre>
871</div>
872
873<p>Once all the variables are parsed, we then parse the body and create the
874AST node:</p>
875
876<div class="doc_code">
877<pre>
878  // At this point, we have to have 'in'.
879  if (CurTok != tok_in)
880    return Error("expected 'in' keyword after 'var'");
881  getNextToken();  // eat 'in'.
882  
883  ExprAST *Body = ParseExpression();
884  if (Body == 0) return 0;
885  
886  return new VarExprAST(VarNames, Body);
887}
888</pre>
889</div>
890
891<p>Now that we can parse and represent the code, we need to support emission of
892LLVM IR for it.  This code starts out with:</p>
893
894<div class="doc_code">
895<pre>
896Value *VarExprAST::Codegen() {
897  std::vector&lt;AllocaInst *&gt; OldBindings;
898  
899  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
900
901  // Register all variables and emit their initializer.
902  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
903    const std::string &amp;VarName = VarNames[i].first;
904    ExprAST *Init = VarNames[i].second;
905</pre>
906</div>
907
908<p>Basically it loops over all the variables, installing them one at a time.
909For each variable we put into the symbol table, we remember the previous value
910that we replace in OldBindings.</p>
911
912<div class="doc_code">
913<pre>
914    // Emit the initializer before adding the variable to scope, this prevents
915    // the initializer from referencing the variable itself, and permits stuff
916    // like this:
917    //  var a = 1 in
918    //    var a = a in ...   # refers to outer 'a'.
919    Value *InitVal;
920    if (Init) {
921      InitVal = Init-&gt;Codegen();
922      if (InitVal == 0) return 0;
923    } else { // If not specified, use 0.0.
924      InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
925    }
926    
927    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
928    Builder.CreateStore(InitVal, Alloca);
929
930    // Remember the old variable binding so that we can restore the binding when
931    // we unrecurse.
932    OldBindings.push_back(NamedValues[VarName]);
933    
934    // Remember this binding.
935    NamedValues[VarName] = Alloca;
936  }
937</pre>
938</div>
939
940<p>There are more comments here than code.  The basic idea is that we emit the
941initializer, create the alloca, then update the symbol table to point to it.
942Once all the variables are installed in the symbol table, we evaluate the body
943of the var/in expression:</p>
944
945<div class="doc_code">
946<pre>
947  // Codegen the body, now that all vars are in scope.
948  Value *BodyVal = Body-&gt;Codegen();
949  if (BodyVal == 0) return 0;
950</pre>
951</div>
952
953<p>Finally, before returning, we restore the previous variable bindings:</p>
954
955<div class="doc_code">
956<pre>
957  // Pop all our variables from scope.
958  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
959    NamedValues[VarNames[i].first] = OldBindings[i];
960
961  // Return the body computation.
962  return BodyVal;
963}
964</pre>
965</div>
966
967<p>The end result of all of this is that we get properly scoped variable 
968definitions, and we even (trivially) allow mutation of them :).</p>
969
970<p>With this, we completed what we set out to do.  Our nice iterative fib
971example from the intro compiles and runs just fine.  The mem2reg pass optimizes
972all of our stack variables into SSA registers, inserting PHI nodes where needed,
973and our front-end remains simple: no "iterated dominance frontier" computation
974anywhere in sight.</p>
975
976</div>
977
978<!-- *********************************************************************** -->
979<h2><a name="code">Full Code Listing</a></h2>
980<!-- *********************************************************************** -->
981
982<div>
983
984<p>
985Here is the complete code listing for our running example, enhanced with mutable
986variables and var/in support.  To build this example, use:
987</p>
988
989<div class="doc_code">
990<pre>
991# Compile
992clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
993# Run
994/toy
995</pre>
996</div>
997
998<p>Here is the code:</p>
999
1000<div class="doc_code">
1001<pre>
1002#include "llvm/DerivedTypes.h"
1003#include "llvm/ExecutionEngine/ExecutionEngine.h"
1004#include "llvm/ExecutionEngine/JIT.h"
1005#include "llvm/IRBuilder.h"
1006#include "llvm/LLVMContext.h"
1007#include "llvm/Module.h"
1008#include "llvm/PassManager.h"
1009#include "llvm/Analysis/Verifier.h"
1010#include "llvm/Analysis/Passes.h"
1011#include "llvm/Target/TargetData.h"
1012#include "llvm/Transforms/Scalar.h"
1013#include "llvm/Support/TargetSelect.h"
1014#include &lt;cstdio&gt;
1015#include &lt;string&gt;
1016#include &lt;map&gt;
1017#include &lt;vector&gt;
1018using namespace llvm;
1019
1020//===----------------------------------------------------------------------===//
1021// Lexer
1022//===----------------------------------------------------------------------===//
1023
1024// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
1025// of these for known things.
1026enum Token {
1027  tok_eof = -1,
1028
1029  // commands
1030  tok_def = -2, tok_extern = -3,
1031
1032  // primary
1033  tok_identifier = -4, tok_number = -5,
1034  
1035  // control
1036  tok_if = -6, tok_then = -7, tok_else = -8,
1037  tok_for = -9, tok_in = -10,
1038  
1039  // operators
1040  tok_binary = -11, tok_unary = -12,
1041  
1042  // var definition
1043  tok_var = -13
1044};
1045
1046static std::string IdentifierStr;  // Filled in if tok_identifier
1047static double NumVal;              // Filled in if tok_number
1048
1049/// gettok - Return the next token from standard input.
1050static int gettok() {
1051  static int LastChar = ' ';
1052
1053  // Skip any whitespace.
1054  while (isspace(LastChar))
1055    LastChar = getchar();
1056
1057  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
1058    IdentifierStr = LastChar;
1059    while (isalnum((LastChar = getchar())))
1060      IdentifierStr += LastChar;
1061
1062    if (IdentifierStr == "def") return tok_def;
1063    if (IdentifierStr == "extern") return tok_extern;
1064    if (IdentifierStr == "if") return tok_if;
1065    if (IdentifierStr == "then") return tok_then;
1066    if (IdentifierStr == "else") return tok_else;
1067    if (IdentifierStr == "for") return tok_for;
1068    if (IdentifierStr == "in") return tok_in;
1069    if (IdentifierStr == "binary") return tok_binary;
1070    if (IdentifierStr == "unary") return tok_unary;
1071    if (IdentifierStr == "var") return tok_var;
1072    return tok_identifier;
1073  }
1074
1075  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
1076    std::string NumStr;
1077    do {
1078      NumStr += LastChar;
1079      LastChar = getchar();
1080    } while (isdigit(LastChar) || LastChar == '.');
1081
1082    NumVal = strtod(NumStr.c_str(), 0);
1083    return tok_number;
1084  }
1085
1086  if (LastChar == '#') {
1087    // Comment until end of line.
1088    do LastChar = getchar();
1089    while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
1090    
1091    if (LastChar != EOF)
1092      return gettok();
1093  }
1094  
1095  // Check for end of file.  Don't eat the EOF.
1096  if (LastChar == EOF)
1097    return tok_eof;
1098
1099  // Otherwise, just return the character as its ascii value.
1100  int ThisChar = LastChar;
1101  LastChar = getchar();
1102  return ThisChar;
1103}
1104
1105//===----------------------------------------------------------------------===//
1106// Abstract Syntax Tree (aka Parse Tree)
1107//===----------------------------------------------------------------------===//
1108
1109/// ExprAST - Base class for all expression nodes.
1110class ExprAST {
1111public:
1112  virtual ~ExprAST() {}
1113  virtual Value *Codegen() = 0;
1114};
1115
1116/// NumberExprAST - Expression class for numeric literals like "1.0".
1117class NumberExprAST : public ExprAST {
1118  double Val;
1119public:
1120  NumberExprAST(double val) : Val(val) {}
1121  virtual Value *Codegen();
1122};
1123
1124/// VariableExprAST - Expression class for referencing a variable, like "a".
1125class VariableExprAST : public ExprAST {
1126  std::string Name;
1127public:
1128  VariableExprAST(const std::string &amp;name) : Name(name) {}
1129  const std::string &amp;getName() const { return Name; }
1130  virtual Value *Codegen();
1131};
1132
1133/// UnaryExprAST - Expression class for a unary operator.
1134class UnaryExprAST : public ExprAST {
1135  char Opcode;
1136  ExprAST *Operand;
1137public:
1138  UnaryExprAST(char opcode, ExprAST *operand) 
1139    : Opcode(opcode), Operand(operand) {}
1140  virtual Value *Codegen();
1141};
1142
1143/// BinaryExprAST - Expression class for a binary operator.
1144class BinaryExprAST : public ExprAST {
1145  char Op;
1146  ExprAST *LHS, *RHS;
1147public:
1148  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
1149    : Op(op), LHS(lhs), RHS(rhs) {}
1150  virtual Value *Codegen();
1151};
1152
1153/// CallExprAST - Expression class for function calls.
1154class CallExprAST : public ExprAST {
1155  std::string Callee;
1156  std::vector&lt;ExprAST*&gt; Args;
1157public:
1158  CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1159    : Callee(callee), Args(args) {}
1160  virtual Value *Codegen();
1161};
1162
1163/// IfExprAST - Expression class for if/then/else.
1164class IfExprAST : public ExprAST {
1165  ExprAST *Cond, *Then, *Else;
1166public:
1167  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1168  : Cond(cond), Then(then), Else(_else) {}
1169  virtual Value *Codegen();
1170};
1171
1172/// ForExprAST - Expression class for for/in.
1173class ForExprAST : public ExprAST {
1174  std::string VarName;
1175  ExprAST *Start, *End, *Step, *Body;
1176public:
1177  ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1178             ExprAST *step, ExprAST *body)
1179    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1180  virtual Value *Codegen();
1181};
1182
1183/// VarExprAST - Expression class for var/in
1184class VarExprAST : public ExprAST {
1185  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1186  ExprAST *Body;
1187public:
1188  VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
1189             ExprAST *body)
1190  : VarNames(varnames), Body(body) {}
1191  
1192  virtual Value *Codegen();
1193};
1194
1195/// PrototypeAST - This class represents the "prototype" for a function,
1196/// which captures its name, and its argument names (thus implicitly the number
1197/// of arguments the function takes), as well as if it is an operator.
1198class PrototypeAST {
1199  std::string Name;
1200  std::vector&lt;std::string&gt; Args;
1201  bool isOperator;
1202  unsigned Precedence;  // Precedence if a binary op.
1203public:
1204  PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1205               bool isoperator = false, unsigned prec = 0)
1206  : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1207  
1208  bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1209  bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1210  
1211  char getOperatorName() const {
1212    assert(isUnaryOp() || isBinaryOp());
1213    return Name[Name.size()-1];
1214  }
1215  
1216  unsigned getBinaryPrecedence() const { return Precedence; }
1217  
1218  Function *Codegen();
1219  
1220  void CreateArgumentAllocas(Function *F);
1221};
1222
1223/// FunctionAST - This class represents a function definition itself.
1224class FunctionAST {
1225  PrototypeAST *Proto;
1226  ExprAST *Body;
1227public:
1228  FunctionAST(PrototypeAST *proto, ExprAST *body)
1229    : Proto(proto), Body(body) {}
1230  
1231  Function *Codegen();
1232};
1233
1234//===----------------------------------------------------------------------===//
1235// Parser
1236//===----------------------------------------------------------------------===//
1237
1238/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
1239/// token the parser is looking at.  getNextToken reads another token from the
1240/// lexer and updates CurTok with its results.
1241static int CurTok;
1242static int getNextToken() {
1243  return CurTok = gettok();
1244}
1245
1246/// BinopPrecedence - This holds the precedence for each binary operator that is
1247/// defined.
1248static std::map&lt;char, int&gt; BinopPrecedence;
1249
1250/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1251static int GetTokPrecedence() {
1252  if (!isascii(CurTok))
1253    return -1;
1254  
1255  // Make sure it's a declared binop.
1256  int TokPrec = BinopPrecedence[CurTok];
1257  if (TokPrec &lt;= 0) return -1;
1258  return TokPrec;
1259}
1260
1261/// Error* - These are little helper functions for error handling.
1262ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1263PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1264FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1265
1266static ExprAST *ParseExpression();
1267
1268/// identifierexpr
1269///   ::= identifier
1270///   ::= identifier '(' expression* ')'
1271static ExprAST *ParseIdentifierExpr() {
1272  std::string IdName = IdentifierStr;
1273  
1274  getNextToken();  // eat identifier.
1275  
1276  if (CurTok != '(') // Simple variable ref.
1277    return new VariableExprAST(IdName);
1278  
1279  // Call.
1280  getNextToken();  // eat (
1281  std::vector&lt;ExprAST*&gt; Args;
1282  if (CurTok != ')') {
1283    while (1) {
1284      ExprAST *Arg = ParseExpression();
1285      if (!Arg) return 0;
1286      Args.push_back(Arg);
1287
1288      if (CurTok == ')') break;
1289
1290      if (CurTok != ',')
1291        return Error("Expected ')' or ',' in argument list");
1292      getNextToken();
1293    }
1294  }
1295
1296  // Eat the ')'.
1297  getNextToken();
1298  
1299  return new CallExprAST(IdName, Args);
1300}
1301
1302/// numberexpr ::= number
1303static ExprAST *ParseNumberExpr() {
1304  ExprAST *Result = new NumberExprAST(NumVal);
1305  getNextToken(); // consume the number
1306  return Result;
1307}
1308
1309/// parenexpr ::= '(' expression ')'
1310static ExprAST *ParseParenExpr() {
1311  getNextToken();  // eat (.
1312  ExprAST *V = ParseExpression();
1313  if (!V) return 0;
1314  
1315  if (CurTok != ')')
1316    return Error("expected ')'");
1317  getNextToken();  // eat ).
1318  return V;
1319}
1320
1321/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1322static ExprAST *ParseIfExpr() {
1323  getNextToken();  // eat the if.
1324  
1325  // condition.
1326  ExprAST *Cond = ParseExpression();
1327  if (!Cond) return 0;
1328  
1329  if (CurTok != tok_then)
1330    return Error("expected then");
1331  getNextToken();  // eat the then
1332  
1333  ExprAST *Then = ParseExpression();
1334  if (Then == 0) return 0;
1335  
1336  if (CurTok != tok_else)
1337    return Error("expected else");
1338  
1339  getNextToken();
1340  
1341  ExprAST *Else = ParseExpression();
1342  if (!Else) return 0;
1343  
1344  return new IfExprAST(Cond, Then, Else);
1345}
1346
1347/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1348static ExprAST *ParseForExpr() {
1349  getNextToken();  // eat the for.
1350
1351  if (CurTok != tok_identifier)
1352    return Error("expected identifier after for");
1353  
1354  std::string IdName = IdentifierStr;
1355  getNextToken();  // eat identifier.
1356  
1357  if (CurTok != '=')
1358    return Error("expected '=' after for");
1359  getNextToken();  // eat '='.
1360  
1361  
1362  ExprAST *Start = ParseExpression();
1363  if (Start == 0) return 0;
1364  if (CurTok != ',')
1365    return Error("expected ',' after for start value");
1366  getNextToken();
1367  
1368  ExprAST *End = ParseExpression();
1369  if (End == 0) return 0;
1370  
1371  // The step value is optional.
1372  ExprAST *Step = 0;
1373  if (CurTok == ',') {
1374    getNextToken();
1375    Step = ParseExpression();
1376    if (Step == 0) return 0;
1377  }
1378  
1379  if (CurTok != tok_in)
1380    return Error("expected 'in' after for");
1381  getNextToken();  // eat 'in'.
1382  
1383  ExprAST *Body = ParseExpression();
1384  if (Body == 0) return 0;
1385
1386  return new ForExprAST(IdName, Start, End, Step, Body);
1387}
1388
1389/// varexpr ::= 'var' identifier ('=' expression)? 
1390//                    (',' identifier ('=' expression)?)* 'in' expression
1391static ExprAST *ParseVarExpr() {
1392  getNextToken();  // eat the var.
1393
1394  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1395
1396  // At least one variable name is required.
1397  if (CurTok != tok_identifier)
1398    return Error("expected identifier after var");
1399  
1400  while (1) {
1401    std::string Name = IdentifierStr;
1402    getNextToken();  // eat identifier.
1403
1404    // Read the optional initializer.
1405    ExprAST *Init = 0;
1406    if (CurTok == '=') {
1407      getNextToken(); // eat the '='.
1408      
1409      Init = ParseExpression();
1410      if (Init == 0) return 0;
1411    }
1412    
1413    VarNames.push_back(std::make_pair(Name, Init));
1414    
1415    // End of var list, exit loop.
1416    if (CurTok != ',') break;
1417    getNextToken(); // eat the ','.
1418    
1419    if (CurTok != tok_identifier)
1420      return Error("expected identifier list after var");
1421  }
1422  
1423  // At this point, we have to have 'in'.
1424  if (CurTok != tok_in)
1425    return Error("expected 'in' keyword after 'var'");
1426  getNextToken();  // eat 'in'.
1427  
1428  ExprAST *Body = ParseExpression();
1429  if (Body == 0) return 0;
1430  
1431  return new VarExprAST(VarNames, Body);
1432}
1433
1434/// primary
1435///   ::= identifierexpr
1436///   ::= numberexpr
1437///   ::= parenexpr
1438///   ::= ifexpr
1439///   ::= forexpr
1440///   ::= varexpr
1441static ExprAST *ParsePrimary() {
1442  switch (CurTok) {
1443  default: return Error("unknown token when expecting an expression");
1444  case tok_identifier: return ParseIdentifierExpr();
1445  case tok_number:     return ParseNumberExpr();
1446  case '(':            return ParseParenExpr();
1447  case tok_if:         return ParseIfExpr();
1448  case tok_for:        return ParseForExpr();
1449  case tok_var:        return ParseVarExpr();
1450  }
1451}
1452
1453/// unary
1454///   ::= primary
1455///   ::= '!' unary
1456static ExprAST *ParseUnary() {
1457  // If the current token is not an operator, it must be a primary expr.
1458  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1459    return ParsePrimary();
1460  
1461  // If this is a unary operator, read it.
1462  int Opc = CurTok;
1463  getNextToken();
1464  if (ExprAST *Operand = ParseUnary())
1465    return new UnaryExprAST(Opc, Operand);
1466  return 0;
1467}
1468
1469/// binoprhs
1470///   ::= ('+' unary)*
1471static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1472  // If this is a binop, find its precedence.
1473  while (1) {
1474    int TokPrec = GetTokPrecedence();
1475    
1476    // If this is a binop that binds at least as tightly as the current binop,
1477    // consume it, otherwise we are done.
1478    if (TokPrec &lt; ExprPrec)
1479      return LHS;
1480    
1481    // Okay, we know this is a binop.
1482    int BinOp = CurTok;
1483    getNextToken();  // eat binop
1484    
1485    // Parse the unary expression after the binary operator.
1486    ExprAST *RHS = ParseUnary();
1487    if (!RHS) return 0;
1488    
1489    // If BinOp binds less tightly with RHS than the operator after RHS, let
1490    // the pending operator take RHS as its LHS.
1491    int NextPrec = GetTokPrecedence();
1492    if (TokPrec &lt; NextPrec) {
1493      RHS = ParseBinOpRHS(TokPrec+1, RHS);
1494      if (RHS == 0) return 0;
1495    }
1496    
1497    // Merge LHS/RHS.
1498    LHS = new BinaryExprAST(BinOp, LHS, RHS);
1499  }
1500}
1501
1502/// expression
1503///   ::= unary binoprhs
1504///
1505static ExprAST *ParseExpression() {
1506  ExprAST *LHS = ParseUnary();
1507  if (!LHS) return 0;
1508  
1509  return ParseBinOpRHS(0, LHS);
1510}
1511
1512/// prototype
1513///   ::= id '(' id* ')'
1514///   ::= binary LETTER number? (id, id)
1515///   ::= unary LETTER (id)
1516static PrototypeAST *ParsePrototype() {
1517  std::string FnName;
1518  
1519  unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
1520  unsigned BinaryPrecedence = 30;
1521  
1522  switch (CurTok) {
1523  default:
1524    return ErrorP("Expected function name in prototype");
1525  case tok_identifier:
1526    FnName = IdentifierStr;
1527    Kind = 0;
1528    getNextToken();
1529    break;
1530  case tok_unary:
1531    getNextToken();
1532    if (!isascii(CurTok))
1533      return ErrorP("Expected unary operator");
1534    FnName = "unary";
1535    FnName += (char)CurTok;
1536    Kind = 1;
1537    getNextToken();
1538    break;
1539  case tok_binary:
1540    getNextToken();
1541    if (!isascii(CurTok))
1542      return ErrorP("Expected binary operator");
1543    FnName = "binary";
1544    FnName += (char)CurTok;
1545    Kind = 2;
1546    getNextToken();
1547    
1548    // Read the precedence if present.
1549    if (CurTok == tok_number) {
1550      if (NumVal &lt; 1 || NumVal &gt; 100)
1551        return ErrorP("Invalid precedecnce: must be 1..100");
1552      BinaryPrecedence = (unsigned)NumVal;
1553      getNextToken();
1554    }
1555    break;
1556  }
1557  
1558  if (CurTok != '(')
1559    return ErrorP("Expected '(' in prototype");
1560  
1561  std::vector&lt;std::string&gt; ArgNames;
1562  while (getNextToken() == tok_identifier)
1563    ArgNames.push_back(IdentifierStr);
1564  if (CurTok != ')')
1565    return ErrorP("Expected ')' in prototype");
1566  
1567  // success.
1568  getNextToken();  // eat ')'.
1569  
1570  // Verify right number of names for operator.
1571  if (Kind &amp;&amp; ArgNames.size() != Kind)
1572    return ErrorP("Invalid number of operands for operator");
1573  
1574  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1575}
1576
1577/// definition ::= 'def' prototype expression
1578static FunctionAST *ParseDefinition() {
1579  getNextToken();  // eat def.
1580  PrototypeAST *Proto = ParsePrototype();
1581  if (Proto == 0) return 0;
1582
1583  if (ExprAST *E = ParseExpression())
1584    return new FunctionAST(Proto, E);
1585  return 0;
1586}
1587
1588/// toplevelexpr ::= expression
1589static FunctionAST *ParseTopLevelExpr() {
1590  if (ExprAST *E = ParseExpression()) {
1591    // Make an anonymous proto.
1592    PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1593    return new FunctionAST(Proto, E);
1594  }
1595  return 0;
1596}
1597
1598/// external ::= 'extern' prototype
1599static PrototypeAST *ParseExtern() {
1600  getNextToken();  // eat extern.
1601  return ParsePrototype();
1602}
1603
1604//===----------------------------------------------------------------------===//
1605// Code Generation
1606//===----------------------------------------------------------------------===//
1607
1608static Module *TheModule;
1609static IRBuilder&lt;&gt; Builder(getGlobalContext());
1610static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
1611static FunctionPassManager *TheFPM;
1612
1613Value *ErrorV(const char *Str) { Error(Str); return 0; }
1614
1615/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
1616/// the function.  This is used for mutable variables etc.
1617static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
1618                                          const std::string &amp;VarName) {
1619  IRBuilder&lt;&gt; TmpB(&amp;TheFunction-&gt;getEntryBlock(),
1620                 TheFunction-&gt;getEntryBlock().begin());
1621  return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
1622                           VarName.c_str());
1623}
1624
1625Value *NumberExprAST::Codegen() {
1626  return ConstantFP::get(getGlobalContext(), APFloat(Val));
1627}
1628
1629Value *VariableExprAST::Codegen() {
1630  // Look this variable up in the function.
1631  Value *V = NamedValues[Name];
1632  if (V == 0) return ErrorV("Unknown variable name");
1633
1634  // Load the value.
1635  return Builder.CreateLoad(V, Name.c_str());
1636}
1637
1638Value *UnaryExprAST::Codegen() {
1639  Value *OperandV = Operand-&gt;Codegen();
1640  if (OperandV == 0) return 0;
1641  
1642  Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1643  if (F == 0)
1644    return ErrorV("Unknown unary operator");
1645  
1646  return Builder.CreateCall(F, OperandV, "unop");
1647}
1648
1649Value *BinaryExprAST::Codegen() {
1650  // Special case '=' because we don't want to emit the LHS as an expression.
1651  if (Op == '=') {
1652    // Assignment requires the LHS to be an identifier.
1653    VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
1654    if (!LHSE)
1655      return ErrorV("destination of '=' must be a variable");
1656    // Codegen the RHS.
1657    Value *Val = RHS-&gt;Codegen();
1658    if (Val == 0) return 0;
1659
1660    // Look up the name.
1661    Value *Variable = NamedValues[LHSE-&gt;getName()];
1662    if (Variable == 0) return ErrorV("Unknown variable name");
1663
1664    Builder.CreateStore(Val, Variable);
1665    return Val;
1666  }
1667  
1668  Value *L = LHS-&gt;Codegen();
1669  Value *R = RHS-&gt;Codegen();
1670  if (L == 0 || R == 0) return 0;
1671  
1672  switch (Op) {
1673  case '+': return Builder.CreateFAdd(L, R, "addtmp");
1674  case '-': return Builder.CreateFSub(L, R, "subtmp");
1675  case '*': return Builder.CreateFMul(L, R, "multmp");
1676  case '&lt;':
1677    L = Builder.CreateFCmpULT(L, R, "cmptmp");
1678    // Convert bool 0/1 to double 0.0 or 1.0
1679    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
1680                                "booltmp");
1681  default: break;
1682  }
1683  
1684  // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1685  // a call to it.
1686  Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1687  assert(F &amp;&amp; "binary operator not found!");
1688  
1689  Value *Ops[2] = { L, R };
1690  return Builder.CreateCall(F, Ops, "binop");
1691}
1692
1693Value *CallExprAST::Codegen() {
1694  // Look up the name in the global module table.
1695  Function *CalleeF = TheModule-&gt;getFunction(Callee);
1696  if (CalleeF == 0)
1697    return ErrorV("Unknown function referenced");
1698  
1699  // If argument mismatch error.
1700  if (CalleeF-&gt;arg_size() != Args.size())
1701    return ErrorV("Incorrect # arguments passed");
1702
1703  std::vector&lt;Value*&gt; ArgsV;
1704  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1705    ArgsV.push_back(Args[i]-&gt;Codegen());
1706    if (ArgsV.back() == 0) return 0;
1707  }
1708  
1709  return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
1710}
1711
1712Value *IfExprAST::Codegen() {
1713  Value *CondV = Cond-&gt;Codegen();
1714  if (CondV == 0) return 0;
1715  
1716  // Convert condition to a bool by comparing equal to 0.0.
1717  CondV = Builder.CreateFCmpONE(CondV, 
1718                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1719                                "ifcond");
1720  
1721  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1722  
1723  // Create blocks for the then and else cases.  Insert the 'then' block at the
1724  // end of the function.
1725  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
1726  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
1727  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
1728  
1729  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1730  
1731  // Emit then value.
1732  Builder.SetInsertPoint(ThenBB);
1733  
1734  Value *ThenV = Then-&gt;Codegen();
1735  if (ThenV == 0) return 0;
1736  
1737  Builder.CreateBr(MergeBB);
1738  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1739  ThenBB = Builder.GetInsertBlock();
1740  
1741  // Emit else block.
1742  TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1743  Builder.SetInsertPoint(ElseBB);
1744  
1745  Value *ElseV = Else-&gt;Codegen();
1746  if (ElseV == 0) return 0;
1747  
1748  Builder.CreateBr(MergeBB);
1749  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1750  ElseBB = Builder.GetInsertBlock();
1751  
1752  // Emit merge block.
1753  TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1754  Builder.SetInsertPoint(MergeBB);
1755  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
1756                                  "iftmp");
1757  
1758  PN-&gt;addIncoming(ThenV, ThenBB);
1759  PN-&gt;addIncoming(ElseV, ElseBB);
1760  return PN;
1761}
1762
1763Value *ForExprAST::Codegen() {
1764  // Output this as:
1765  //   var = alloca double
1766  //   ...
1767  //   start = startexpr
1768  //   store start -&gt; var
1769  //   goto loop
1770  // loop: 
1771  //   ...
1772  //   bodyexpr
1773  //   ...
1774  // loopend:
1775  //   step = stepexpr
1776  //   endcond = endexpr
1777  //
1778  //   curvar = load var
1779  //   nextvar = curvar + step
1780  //   store nextvar -&gt; var
1781  //   br endcond, loop, endloop
1782  // outloop:
1783  
1784  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1785
1786  // Create an alloca for the variable in the entry block.
1787  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1788  
1789  // Emit the start code first, without 'variable' in scope.
1790  Value *StartVal = Start-&gt;Codegen();
1791  if (StartVal == 0) return 0;
1792  
1793  // Store the value into the alloca.
1794  Builder.CreateStore(StartVal, Alloca);
1795  
1796  // Make the new basic block for the loop header, inserting after current
1797  // block.
1798  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
1799  
1800  // Insert an explicit fall through from the current block to the LoopBB.
1801  Builder.CreateBr(LoopBB);
1802
1803  // Start insertion in LoopBB.
1804  Builder.SetInsertPoint(LoopBB);
1805  
1806  // Within the loop, the variable is defined equal to the PHI node.  If it
1807  // shadows an existing variable, we have to restore it, so save it now.
1808  AllocaInst *OldVal = NamedValues[VarName];
1809  NamedValues[VarName] = Alloca;
1810  
1811  // Emit the body of the loop.  This, like any other expr, can change the
1812  // current BB.  Note that we ignore the value computed by the body, but don't
1813  // allow an error.
1814  if (Body-&gt;Codegen() == 0)
1815    return 0;
1816  
1817  // Emit the step value.
1818  Value *StepVal;
1819  if (Step) {
1820    StepVal = Step-&gt;Codegen();
1821    if (StepVal == 0) return 0;
1822  } else {
1823    // If not specified, use 1.0.
1824    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
1825  }
1826  
1827  // Compute the end condition.
1828  Value *EndCond = End-&gt;Codegen();
1829  if (EndCond == 0) return EndCond;
1830  
1831  // Reload, increment, and restore the alloca.  This handles the case where
1832  // the body of the loop mutates the variable.
1833  Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
1834  Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
1835  Builder.CreateStore(NextVar, Alloca);
1836  
1837  // Convert condition to a bool by comparing equal to 0.0.
1838  EndCond = Builder.CreateFCmpONE(EndCond, 
1839                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1840                                  "loopcond");
1841  
1842  // Create the "after loop" block and insert it.
1843  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
1844  
1845  // Insert the conditional branch into the end of LoopEndBB.
1846  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1847  
1848  // Any new code will be inserted in AfterBB.
1849  Builder.SetInsertPoint(AfterBB);
1850  
1851  // Restore the unshadowed variable.
1852  if (OldVal)
1853    NamedValues[VarName] = OldVal;
1854  else
1855    NamedValues.erase(VarName);
1856
1857  
1858  // for expr always returns 0.0.
1859  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
1860}
1861
1862Value *VarExprAST::Codegen() {
1863  std::vector&lt;AllocaInst *&gt; OldBindings;
1864  
1865  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1866
1867  // Register all variables and emit their initializer.
1868  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
1869    const std::string &amp;VarName = VarNames[i].first;
1870    ExprAST *Init = VarNames[i].second;
1871    
1872    // Emit the initializer before adding the variable to scope, this prevents
1873    // the initializer from referencing the variable itself, and permits stuff
1874    // like this:
1875    //  var a = 1 in
1876    //    var a = a in ...   # refers to outer 'a'.
1877    Value *InitVal;
1878    if (Init) {
1879      InitVal = Init-&gt;Codegen();
1880      if (InitVal == 0) return 0;
1881    } else { // If not specified, use 0.0.
1882      InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
1883    }
1884    
1885    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1886    Builder.CreateStore(InitVal, Alloca);
1887
1888    // Remember the old variable binding so that we can restore the binding when
1889    // we unrecurse.
1890    OldBindings.push_back(NamedValues[VarName]);
1891    
1892    // Remember this binding.
1893    NamedValues[VarName] = Alloca;
1894  }
1895  
1896  // Codegen the body, now that all vars are in scope.
1897  Value *BodyVal = Body-&gt;Codegen();
1898  if (BodyVal == 0) return 0;
1899  
1900  // Pop all our variables from scope.
1901  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
1902    NamedValues[VarNames[i].first] = OldBindings[i];
1903
1904  // Return the body computation.
1905  return BodyVal;
1906}
1907
1908Function *PrototypeAST::Codegen() {
1909  // Make the function type:  double(double,double) etc.
1910  std::vector&lt;Type*&gt; Doubles(Args.size(),
1911                             Type::getDoubleTy(getGlobalContext()));
1912  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
1913                                       Doubles, false);
1914  
1915  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
1916  
1917  // If F conflicted, there was already something named 'Name'.  If it has a
1918  // body, don't allow redefinition or reextern.
1919  if (F-&gt;getName() != Name) {
1920    // Delete the one we just made and get the existing one.
1921    F-&gt;eraseFromParent();
1922    F = TheModule-&gt;getFunction(Name);
1923    
1924    // If F already has a body, reject this.
1925    if (!F-&gt;empty()) {
1926      ErrorF("redefinition of function");
1927      return 0;
1928    }
1929    
1930    // If F took a different number of args, reject.
1931    if (F-&gt;arg_size() != Args.size()) {
1932      ErrorF("redefinition of function with different # args");
1933      return 0;
1934    }
1935  }
1936  
1937  // Set names for all arguments.
1938  unsigned Idx = 0;
1939  for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1940       ++AI, ++Idx)
1941    AI-&gt;setName(Args[Idx]);
1942    
1943  return F;
1944}
1945
1946/// CreateArgumentAllocas - Create an alloca for each argument and register the
1947/// argument in the symbol table so that references to it will succeed.
1948void PrototypeAST::CreateArgumentAllocas(Function *F) {
1949  Function::arg_iterator AI = F-&gt;arg_begin();
1950  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
1951    // Create an alloca for this variable.
1952    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
1953
1954    // Store the initial value into the alloca.
1955    Builder.CreateStore(AI, Alloca);
1956
1957    // Add arguments to variable symbol table.
1958    NamedValues[Args[Idx]] = Alloca;
1959  }
1960}
1961
1962Function *FunctionAST::Codegen() {
1963  NamedValues.clear();
1964  
1965  Function *TheFunction = Proto-&gt;Codegen();
1966  if (TheFunction == 0)
1967    return 0;
1968  
1969  // If this is an operator, install it.
1970  if (Proto-&gt;isBinaryOp())
1971    BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1972  
1973  // Create a new basic block to start insertion into.
1974  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
1975  Builder.SetInsertPoint(BB);
1976  
1977  // Add all arguments to the symbol table and create their allocas.
1978  Proto-&gt;CreateArgumentAllocas(TheFunction);
1979
1980  if (Value *RetVal = Body-&gt;Codegen()) {
1981    // Finish off the function.
1982    Builder.CreateRet(RetVal);
1983
1984    // Validate the generated code, checking for consistency.
1985    verifyFunction(*TheFunction);
1986
1987    // Optimize the function.
1988    TheFPM-&gt;run(*TheFunction);
1989    
1990    return TheFunction;
1991  }
1992  
1993  // Error reading body, remove function.
1994  TheFunction-&gt;eraseFromParent();
1995
1996  if (Proto-&gt;isBinaryOp())
1997    BinopPrecedence.erase(Proto-&gt;getOperatorName());
1998  return 0;
1999}
2000
2001//===----------------------------------------------------------------------===//
2002// Top-Level parsing and JIT Driver
2003//===----------------------------------------------------------------------===//
2004
2005static ExecutionEngine *TheExecutionEngine;
2006
2007static void HandleDefinition() {
2008  if (FunctionAST *F = ParseDefinition()) {
2009    if (Function *LF = F-&gt;Codegen()) {
2010      fprintf(stderr, "Read function definition:");
2011      LF-&gt;dump();
2012    }
2013  } else {
2014    // Skip token for error recovery.
2015    getNextToken();
2016  }
2017}
2018
2019static void HandleExtern() {
2020  if (PrototypeAST *P = ParseExtern()) {
2021    if (Function *F = P-&gt;Codegen()) {
2022      fprintf(stderr, "Read extern: ");
2023      F-&gt;dump();
2024    }
2025  } else {
2026    // Skip token for error recovery.
2027    getNextToken();
2028  }
2029}
2030
2031static void HandleTopLevelExpression() {
2032  // Evaluate a top-level expression into an anonymous function.
2033  if (FunctionAST *F = ParseTopLevelExpr()) {
2034    if (Function *LF = F-&gt;Codegen()) {
2035      // JIT the function, returning a function pointer.
2036      void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
2037      
2038      // Cast it to the right type (takes no arguments, returns a double) so we
2039      // can call it as a native function.
2040      double (*FP)() = (double (*)())(intptr_t)FPtr;
2041      fprintf(stderr, "Evaluated to %f\n", FP());
2042    }
2043  } else {
2044    // Skip token for error recovery.
2045    getNextToken();
2046  }
2047}
2048
2049/// top ::= definition | external | expression | ';'
2050static void MainLoop() {
2051  while (1) {
2052    fprintf(stderr, "ready&gt; ");
2053    switch (CurTok) {
2054    case tok_eof:    return;
2055    case ';':        getNextToken(); break;  // ignore top-level semicolons.
2056    case tok_def:    HandleDefinition(); break;
2057    case tok_extern: HandleExtern(); break;
2058    default:         HandleTopLevelExpression(); break;
2059    }
2060  }
2061}
2062
2063//===----------------------------------------------------------------------===//
2064// "Library" functions that can be "extern'd" from user code.
2065//===----------------------------------------------------------------------===//
2066
2067/// putchard - putchar that takes a double and returns 0.
2068extern "C" 
2069double putchard(double X) {
2070  putchar((char)X);
2071  return 0;
2072}
2073
2074/// printd - printf that takes a double prints it as "%f\n", returning 0.
2075extern "C" 
2076double printd(double X) {
2077  printf("%f\n", X);
2078  return 0;
2079}
2080
2081//===----------------------------------------------------------------------===//
2082// Main driver code.
2083//===----------------------------------------------------------------------===//
2084
2085int main() {
2086  InitializeNativeTarget();
2087  LLVMContext &amp;Context = getGlobalContext();
2088
2089  // Install standard binary operators.
2090  // 1 is lowest precedence.
2091  BinopPrecedence['='] = 2;
2092  BinopPrecedence['&lt;'] = 10;
2093  BinopPrecedence['+'] = 20;
2094  BinopPrecedence['-'] = 20;
2095  BinopPrecedence['*'] = 40;  // highest.
2096
2097  // Prime the first token.
2098  fprintf(stderr, "ready&gt; ");
2099  getNextToken();
2100
2101  // Make the module, which holds all the code.
2102  TheModule = new Module("my cool jit", Context);
2103
2104  // Create the JIT.  This takes ownership of the module.
2105  std::string ErrStr;
2106  TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&amp;ErrStr).create();
2107  if (!TheExecutionEngine) {
2108    fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
2109    exit(1);
2110  }
2111
2112  FunctionPassManager OurFPM(TheModule);
2113
2114  // Set up the optimizer pipeline.  Start with registering info about how the
2115  // target lays out data structures.
2116  OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
2117  // Provide basic AliasAnalysis support for GVN.
2118  OurFPM.add(createBasicAliasAnalysisPass());
2119  // Promote allocas to registers.
2120  OurFPM.add(createPromoteMemoryToRegisterPass());
2121  // Do simple "peephole" optimizations and bit-twiddling optzns.
2122  OurFPM.add(createInstructionCombiningPass());
2123  // Reassociate expressions.
2124  OurFPM.add(createReassociatePass());
2125  // Eliminate Common SubExpressions.
2126  OurFPM.add(createGVNPass());
2127  // Simplify the control flow graph (deleting unreachable blocks, etc).
2128  OurFPM.add(createCFGSimplificationPass());
2129
2130  OurFPM.doInitialization();
2131
2132  // Set the global so the code gen can use this.
2133  TheFPM = &amp;OurFPM;
2134
2135  // Run the main "interpreter loop" now.
2136  MainLoop();
2137
2138  TheFPM = 0;
2139
2140  // Print out all of the generated code.
2141  TheModule-&gt;dump();
2142
2143  return 0;
2144}
2145</pre>
2146</div>
2147
2148<a href="LangImpl8.html">Next: Conclusion and other useful LLVM tidbits</a>
2149</div>
2150
2151<!-- *********************************************************************** -->
2152<hr>
2153<address>
2154  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
2155  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
2156  <a href="http://validator.w3.org/check/referer"><img
2157  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
2158
2159  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
2160  <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
2161  Last modified: $Date$
2162</address>
2163</body>
2164</html>
2165