.. Copyright (C) 2015-2020 Free Software Foundation, Inc.
   Originally contributed by David Malcolm <dmalcolm@redhat.com>

   This is free software: you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see
   <http://www.gnu.org/licenses/>.

Tutorial part 5: Implementing an Ahead-of-Time compiler
-------------------------------------------------------

If you have a pre-existing language frontend that's compatible with
libgccjit's license, it's possible to hook it up to libgccjit as a
backend.  In the previous example we showed
how to do that for in-memory JIT-compilation, but libgccjit can also
compile code directly to a file, allowing you to implement a more
traditional ahead-of-time compiler ("JIT" is something of a misnomer
for this use-case).

The essential difference is to compile the context using
:c:func:`gcc_jit_context_compile_to_file` rather than
:c:func:`gcc_jit_context_compile`.

The "brainf" language
*********************

In this example we use libgccjit to construct an ahead-of-time compiler
for an esoteric programming language that we shall refer to as "brainf".

brainf scripts operate on an array of bytes, with a notional data pointer
within the array.

brainf is hard for humans to read, but it's trivial to write a parser for
it, as there is no lexing; just a stream of bytes.  The operations are:

====================== =============================
Character              Meaning
====================== =============================
``>``                  ``idx += 1``
``<``                  ``idx -= 1``
``+``                  ``data[idx] += 1``
``-``                  ``data[idx] -= 1``
``.``                  ``output (data[idx])``
``,``                  ``data[idx] = input ()``
``[``                  loop until ``data[idx] == 0``
``]``                  end of loop
Anything else          ignored
====================== =============================

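The table above can be read as the specification of a tiny interpreter.
The following is a minimal sketch in plain C, not part of the tutorial's
sources: the name ``bf_interpret``, the 30000-cell array size, writing
output to a buffer rather than to stdout, and the convention that ``,``
stores 0 once the input is exhausted are all assumptions made for
illustration.

```c
#include <stddef.h>
#include <string.h>

#define BF_NUM_CELLS 30000   /* assumed size of the byte array */

/* Interpret SCRIPT.  Input bytes come from INPUT (may be NULL); output
   bytes are stored in OUT, which holds OUT_CAP bytes and is
   NUL-terminated on success (extra output is silently dropped).
   Returns the number of bytes stored, or -1 if the data pointer leaves
   the array or a bracket search runs off either end of the script.  */
long bf_interpret (const char *script, const char *input,
                   char *out, size_t out_cap)
{
  static unsigned char data[BF_NUM_CELLS];
  size_t idx = 0, n_out = 0;
  size_t len = strlen (script);

  if (out_cap == 0)
    return -1;
  memset (data, 0, sizeof data);

  for (size_t pc = 0; pc < len; pc++)
    switch (script[pc])
      {
      case '>': if (++idx >= BF_NUM_CELLS) return -1; break;
      case '<': if (idx == 0) return -1; idx--; break;
      case '+': data[idx]++; break;          /* wraps modulo 256 */
      case '-': data[idx]--; break;
      case '.':
        if (n_out + 1 < out_cap)
          out[n_out++] = (char) data[idx];
        break;
      case ',':
        data[idx] = (input && *input) ? (unsigned char) *input++ : 0;
        break;
      case '[':
        if (data[idx] == 0)
          /* Skip forward to the matching ']'.  */
          for (int depth = 1; depth > 0; )
            {
              if (++pc >= len) return -1;
              if (script[pc] == '[') depth++;
              else if (script[pc] == ']') depth--;
            }
        break;
      case ']':
        if (data[idx] != 0)
          /* Jump back to the matching '['.  */
          for (int depth = 1; depth > 0; )
            {
              if (pc == 0) return -1;
              pc--;
              if (script[pc] == ']') depth++;
              else if (script[pc] == '[') depth--;
            }
        break;
      default:
        break;                               /* anything else is ignored */
      }

  out[n_out] = '\0';
  return (long) n_out;
}
```

For example, calling ``bf_interpret`` on the script
``++++++++[>++++++++<-]>+.`` builds 8 * 8 + 1 = 65 in cell 1 and stores
the single byte ``A`` in the output buffer.
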
Unlike the previous example, we'll implement an ahead-of-time compiler,
which reads ``.bf`` scripts and outputs executables (though it would
be trivial to have it run them JIT-compiled in-process).

Here's what a simple ``.bf`` script looks like:

   .. literalinclude:: ../examples/emit-alphabet.bf
    :lines: 1-

.. note::

   This example makes use of whitespace and comments for legibility, but
   could have been written as::

     ++++++++++++++++++++++++++
     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
     [>.+<-]

   It's not a particularly useful language, except for providing
   compiler-writers with a test case that's easy to parse.  The point
   is that you can use :c:func:`gcc_jit_context_compile_to_file`
   to use libgccjit as a backend for a pre-existing language frontend
   (provided that the pre-existing frontend is compatible with libgccjit's
   license).

Converting a brainf script to libgccjit IR
******************************************

As before we write simple code to populate a :c:type:`gcc_jit_context *`.

   .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #define MAX_OPEN_PARENS 16
    :end-before: /* Entrypoint to the compiler.  */
    :language: c

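The ``MAX_OPEN_PARENS`` limit of 16 in the example source suggests that
open ``[`` loops are tracked on a small fixed-size stack while the script
is read in a single pass.  The following plain-C sketch illustrates that
idea on its own, without libgccjit; it is an assumption about the general
approach rather than a copy of the tutorial's code, and the names
``bf_check_brackets`` and ``bf_bracket_status`` are hypothetical.

```c
#include <stddef.h>

#define MAX_OPEN_PARENS 16

enum bf_bracket_status
{
  BF_OK = 0,
  BF_TOO_MANY_OPEN,     /* nesting deeper than MAX_OPEN_PARENS */
  BF_MISMATCHED_CLOSE,  /* ']' with no open '[' */
  BF_UNCLOSED_OPEN      /* '[' never closed */
};

/* Check bracket nesting in SCRIPT using a fixed-size stack of the
   positions of the currently open '[' characters.  On failure, store
   the offending script offset in *ERROR_POS (if non-NULL).  */
enum bf_bracket_status
bf_check_brackets (const char *script, size_t *error_pos)
{
  size_t open_pos[MAX_OPEN_PARENS];
  size_t depth = 0;

  for (size_t i = 0; script[i] != '\0'; i++)
    {
      if (script[i] == '[')
        {
          if (depth == MAX_OPEN_PARENS)
            {
              if (error_pos) *error_pos = i;
              return BF_TOO_MANY_OPEN;
            }
          open_pos[depth++] = i;   /* push */
        }
      else if (script[i] == ']')
        {
          if (depth == 0)
            {
              if (error_pos) *error_pos = i;
              return BF_MISMATCHED_CLOSE;
            }
          depth--;                 /* pop */
        }
    }

  if (depth > 0)
    {
      /* Report the innermost '[' that is still open.  */
      if (error_pos) *error_pos = open_pos[depth - 1];
      return BF_UNCLOSED_OPEN;
    }
  return BF_OK;
}
```

A compiler built this way would push per-loop state (for instance the
blocks to jump to) where this sketch pushes a script offset, and pop it
again on the matching ``]``.
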
Compiling a context to a file
*****************************

Unlike the previous tutorial, this time we'll compile the context
directly to an executable, using :c:func:`gcc_jit_context_compile_to_file`:

.. code-block:: c

    gcc_jit_context_compile_to_file (ctxt,
                                     GCC_JIT_OUTPUT_KIND_EXECUTABLE,
                                     output_file);

Here's the top level of the compiler, which is what actually calls into
:c:func:`gcc_jit_context_compile_to_file`:

 .. literalinclude:: ../examples/tut05-bf.c
    :start-after: /* Entrypoint to the compiler.  */
    :end-before: /* Use the built compiler to compile the example to an executable:
    :language: c

Note that once the context is populated, you could instead compile it
to memory using :c:func:`gcc_jit_context_compile` and run it in-process,
as in the previous tutorial.

To create an executable, we need to export a ``main`` function.  Here's
how to create one from the JIT API:

 .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #include "libgccjit.h"
    :end-before: #define MAX_OPEN_PARENS 16
    :language: c

.. note::

   The above implementation ignores ``argc`` and ``argv``, but you could
   make use of them by exposing ``param_argc`` and ``param_argv`` to the
   caller.

Upon compiling this C code, we obtain a bf-to-machine-code compiler;
let's call it ``bfc``:

.. code-block:: console

  $ gcc \
      tut05-bf.c \
      -o bfc \
      -lgccjit

We can now use ``bfc`` to compile ``.bf`` files into machine code
executables:

.. code-block:: console

  $ ./bfc \
       emit-alphabet.bf \
       a.out

which we can run directly:

.. code-block:: console

  $ ./a.out
  ABCDEFGHIJKLMNOPQRSTUVWXYZ

Success!

We can also inspect the generated executable using standard tools:

.. code-block:: console

  $ objdump -d a.out | less

which shows that libgccjit has managed to optimize the function
somewhat (for example, the runs of 26 and 65 increment operations
have become integer constants 0x1a and 0x41):

.. code-block:: console

  0000000000400620 <main>:
    400620:     80 3d 39 0a 20 00 00    cmpb   $0x0,0x200a39(%rip)        # 601060 <data
    400627:     74 07                   je     400630 <main
    400629:     eb fe                   jmp    400629 <main+0x9>
    40062b:     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    400630:     48 83 ec 08             sub    $0x8,%rsp
    400634:     0f b6 05 26 0a 20 00    movzbl 0x200a26(%rip),%eax        # 601061 <data_cells+0x1>
    40063b:     c6 05 1e 0a 20 00 1a    movb   $0x1a,0x200a1e(%rip)       # 601060 <data_cells>
    400642:     8d 78 41                lea    0x41(%rax),%edi
    400645:     40 88 3d 15 0a 20 00    mov    %dil,0x200a15(%rip)        # 601061 <data_cells+0x1>
    40064c:     0f 1f 40 00             nopl   0x0(%rax)
    400650:     40 0f b6 ff             movzbl %dil,%edi
    400654:     e8 87 fe ff ff          callq  4004e0 <putchar@plt>
    400659:     0f b6 05 01 0a 20 00    movzbl 0x200a01(%rip),%eax        # 601061 <data_cells+0x1>
    400660:     80 2d f9 09 20 00 01    subb   $0x1,0x2009f9(%rip)        # 601060 <data_cells>
    400667:     8d 78 01                lea    0x1(%rax),%edi
    40066a:     40 88 3d f0 09 20 00    mov    %dil,0x2009f0(%rip)        # 601061 <data_cells+0x1>
    400671:     75 dd                   jne    400650 <main+0x30>
    400673:     31 c0                   xor    %eax,%eax
    400675:     48 83 c4 08             add    $0x8,%rsp
    400679:     c3                      retq
    40067a:     66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

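When reading the disassembly it can help to have the unoptimized logic in
mind.  The following is a hand translation of ``emit-alphabet.bf`` into
plain C, written for this purpose only: it is not output from ``bfc``,
the function name ``emit_alphabet_by_hand`` is hypothetical, the
30000-cell array size is an assumption, and it stores output in a buffer
where the compiled program calls ``putchar``.

```c
#include <stddef.h>
#include <string.h>

static unsigned char data_cells[30000];   /* assumed array size */

/* Hand translation of emit-alphabet.bf.  Stores up to OUT_CAP - 1
   output bytes in OUT, NUL-terminates it, and returns the number of
   bytes stored.  */
size_t emit_alphabet_by_hand (char *out, size_t out_cap)
{
  size_t idx = 0, n_out = 0;

  memset (data_cells, 0, sizeof data_cells);

  for (int i = 0; i < 26; i++)        /* 26 x '+': cell 0 = 26 (0x1a) */
    data_cells[idx] += 1;
  idx += 1;                           /* '>' */
  for (int i = 0; i < 65; i++)        /* 65 x '+': cell 1 = 65 (0x41) */
    data_cells[idx] += 1;
  idx -= 1;                           /* '<' */

  while (data_cells[idx] != 0)        /* '[' */
    {
      idx += 1;                       /* '>' */
      if (n_out + 1 < out_cap)        /* '.' */
        out[n_out++] = (char) data_cells[idx];
      data_cells[idx] += 1;           /* '+' */
      idx -= 1;                       /* '<' */
      data_cells[idx] -= 1;           /* '-' */
    }                                 /* ']' */

  if (out_cap > 0)
    out[n_out] = '\0';
  return n_out;
}
```

The two ``for`` loops correspond to the constants ``0x1a`` and ``0x41``
in the listing, and the ``while`` loop corresponds to the block between
``400650`` and the ``jne`` at ``400671``.
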
We also set up debugging information (via
:c:func:`gcc_jit_context_new_location` and
:c:macro:`GCC_JIT_BOOL_OPTION_DEBUGINFO`), so it's possible to use ``gdb``
to single-step through the generated binary and inspect the internal
state ``idx`` and ``data_cells``:

.. code-block:: console

  (gdb) break main
  Breakpoint 1 at 0x400790
  (gdb) run
  Starting program: a.out

  Breakpoint 1, 0x0000000000400790 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x0000000000400797 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x00000000004007a0 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13     >
  (gdb) n
  6     ++++++++++++++++++++++++++
  (gdb) n
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) p idx
  $1 = 1
  (gdb) p data_cells
  $2 = "\032", '\000' <repeats 29998 times>
  (gdb) p data_cells[0]
  $3 = 26 '\032'
  (gdb) p data_cells[1]
  $4 = 0 '\000'
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13     >


Other forms of ahead-of-time-compilation
****************************************

The above demonstrates compiling a :c:type:`gcc_jit_context *` directly
to an executable.  It's also possible to compile it to an object file,
and to a dynamic library.  See the documentation of
:c:func:`gcc_jit_context_compile_to_file` for more information.

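One way a driver such as ``bfc`` might choose between these output forms
is by looking at the suffix of the requested output path.  The sketch
below is purely illustrative: the suffix policy (``.s``, ``.o``, ``.so``,
otherwise an executable) is an assumption, and ``bf_output_kind`` is a
hypothetical stand-in enum so that the code builds without libgccjit.
In a real driver the result would be mapped to the corresponding
``GCC_JIT_OUTPUT_KIND_*`` value of ``enum gcc_jit_output_kind`` and
passed to :c:func:`gcc_jit_context_compile_to_file`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in mirroring the members of libgccjit's
   enum gcc_jit_output_kind (GCC_JIT_OUTPUT_KIND_ASSEMBLER, etc.).  */
enum bf_output_kind
{
  BF_OUTPUT_KIND_ASSEMBLER,
  BF_OUTPUT_KIND_OBJECT_FILE,
  BF_OUTPUT_KIND_DYNAMIC_LIBRARY,
  BF_OUTPUT_KIND_EXECUTABLE
};

/* Return nonzero if S ends with SUFFIX.  */
static int has_suffix (const char *s, const char *suffix)
{
  size_t n = strlen (s), m = strlen (suffix);
  return n >= m && strcmp (s + n - m, suffix) == 0;
}

/* Pick an output kind from the suffix of PATH (assumed policy).  */
enum bf_output_kind bf_output_kind_for_path (const char *path)
{
  if (has_suffix (path, ".s"))
    return BF_OUTPUT_KIND_ASSEMBLER;
  if (has_suffix (path, ".o"))
    return BF_OUTPUT_KIND_OBJECT_FILE;
  if (has_suffix (path, ".so"))
    return BF_OUTPUT_KIND_DYNAMIC_LIBRARY;
  return BF_OUTPUT_KIND_EXECUTABLE;   /* e.g. "a.out" */
}
```

With this policy, ``a.out`` selects an executable and ``alphabet.o``
selects an object file.
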