.. Copyright (C) 2015-2018 Free Software Foundation, Inc.
   Originally contributed by David Malcolm <dmalcolm@redhat.com>

   This is free software: you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program. If not, see
   <http://www.gnu.org/licenses/>.

Tutorial part 5: Implementing an Ahead-of-Time compiler
-------------------------------------------------------

If you have a pre-existing language frontend that's compatible with
libgccjit's license, it's possible to hook it up to libgccjit as a
backend. In the previous example we showed
how to do that for in-memory JIT-compilation, but libgccjit can also
compile code directly to a file, allowing you to implement a more
traditional ahead-of-time compiler ("JIT" is something of a misnomer
for this use-case).

The essential difference is to compile the context using
:c:func:`gcc_jit_context_compile_to_file` rather than
:c:func:`gcc_jit_context_compile`.

The "brainf" language
*********************

In this example we use libgccjit to construct an ahead-of-time compiler
for an esoteric programming language that we shall refer to as "brainf".

brainf scripts operate on an array of bytes, with a notional data pointer
within the array.

brainf is hard for humans to read, but it's trivial to write a parser for
it, as there is no lexing; just a stream of bytes.
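
As a rough illustration of how little parsing is involved, here is a
minimal sketch of a single-pass scan over a script held in memory. It is
not taken from ``tut05-bf.c``: the helper name ``bf_scan`` and its return
convention are assumptions made for this example only. It counts the
command bytes and checks that the square brackets nest correctly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of tut05-bf.c): scan LEN bytes of a
   brainf script at SRC.  Returns the number of command bytes found, or
   -1 if the square brackets are unbalanced.  */
static long
bf_scan (const char *src, size_t len)
{
  long num_commands = 0;
  long depth = 0; /* number of currently-open '[' */

  for (size_t i = 0; i < len; i++)
    {
      unsigned char ch = (unsigned char) src[i];

      /* Bytes other than the eight command characters are skipped,
         which is what lets scripts carry whitespace and comments.
         The explicit NUL check is needed because strchr would
         otherwise match the terminator of the command string.  */
      if (ch == '\0' || !strchr ("><+-.,[]", ch))
        continue;

      num_commands++;

      if (ch == '[')
        depth++;
      else if (ch == ']' && --depth < 0)
        return -1; /* ']' without a matching '[' */
    }

  /* Any '[' still open at the end of the script is also an error.  */
  return depth == 0 ? num_commands : -1;
}
```

With this sketch, ``bf_scan ("[>.+<-]", 7)`` returns 7, whereas a script
containing a stray ``]`` yields -1. A real compiler would act on each
command byte rather than merely count it.
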
The operations are:

====================== =============================
Character              Meaning
====================== =============================
``>``                  ``idx += 1``
``<``                  ``idx -= 1``
``+``                  ``data[idx] += 1``
``-``                  ``data[idx] -= 1``
``.``                  ``output (data[idx])``
``,``                  ``data[idx] = input ()``
``[``                  loop until ``data[idx] == 0``
``]``                  end of loop
Anything else          ignored
====================== =============================

Unlike the previous example, we'll implement an ahead-of-time compiler,
which reads ``.bf`` scripts and outputs executables (though it would
be trivial to have it run them JIT-compiled in-process).

Here's what a simple ``.bf`` script looks like:

   .. literalinclude:: ../examples/emit-alphabet.bf
      :lines: 1-

.. note::

   This example makes use of whitespace and comments for legibility, but
   could have been written as::

     ++++++++++++++++++++++++++
     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
     [>.+<-]

   It's not a particularly useful language, except for providing
   compiler-writers with a test case that's easy to parse. The point
   is that you can use :c:func:`gcc_jit_context_compile_to_file`
   to use libgccjit as a backend for a pre-existing language frontend
   (provided that the pre-existing frontend is compatible with libgccjit's
   license).

Converting a brainf script to libgccjit IR
******************************************

As before we write simple code to populate a :c:type:`gcc_jit_context *`.

   .. literalinclude:: ../examples/tut05-bf.c
      :start-after: #define MAX_OPEN_PARENS 16
      :end-before: /* Entrypoint to the compiler. */
      :language: c

Compiling a context to a file
*****************************

Unlike the previous tutorial, this time we'll compile the context
directly to an executable, using :c:func:`gcc_jit_context_compile_to_file`:

.. code-block:: c

    gcc_jit_context_compile_to_file (ctxt,
                                     GCC_JIT_OUTPUT_KIND_EXECUTABLE,
                                     output_file);

Here's the top level of the compiler, which is what actually calls into
:c:func:`gcc_jit_context_compile_to_file`:

   .. literalinclude:: ../examples/tut05-bf.c
      :start-after: /* Entrypoint to the compiler. */
      :end-before: /* Use the built compiler to compile the example to an executable:
      :language: c

Note how once the context is populated you could trivially instead compile
it to memory using :c:func:`gcc_jit_context_compile` and run it in-process
as in the previous tutorial.

To create an executable, we need to export a ``main`` function. Here's
how to create one from the JIT API:

   .. literalinclude:: ../examples/tut05-bf.c
      :start-after: #include "libgccjit.h"
      :end-before: #define MAX_OPEN_PARENS 16
      :language: c

.. note::

   The above implementation ignores ``argc`` and ``argv``, but you could
   make use of them by exposing ``param_argc`` and ``param_argv`` to the
   caller.

Upon compiling this C code, we obtain a bf-to-machine-code compiler;
let's call it ``bfc``:

.. code-block:: console

  $ gcc \
      tut05-bf.c \
      -o bfc \
      -lgccjit

We can now use ``bfc`` to compile ``.bf`` files into machine code executables:

.. code-block:: console

  $ ./bfc \
       emit-alphabet.bf \
       a.out

which we can run directly:

.. code-block:: console

  $ ./a.out
  ABCDEFGHIJKLMNOPQRSTUVWXYZ

Success!

We can also inspect the generated executable using standard tools:

.. code-block:: console

  $ objdump -d a.out |less

which shows that libgccjit has managed to optimize the function
somewhat (for example, the runs of 26 and 65 increment operations
have become integer constants 0x1a and 0x41):

.. code-block:: console

  0000000000400620 <main>:
    400620:     80 3d 39 0a 20 00 00    cmpb   $0x0,0x200a39(%rip)        # 601060 <data
    400627:     74 07                   je     400630 <main
    400629:     eb fe                   jmp    400629 <main+0x9>
    40062b:     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    400630:     48 83 ec 08             sub    $0x8,%rsp
    400634:     0f b6 05 26 0a 20 00    movzbl 0x200a26(%rip),%eax        # 601061 <data_cells+0x1>
    40063b:     c6 05 1e 0a 20 00 1a    movb   $0x1a,0x200a1e(%rip)       # 601060 <data_cells>
    400642:     8d 78 41                lea    0x41(%rax),%edi
    400645:     40 88 3d 15 0a 20 00    mov    %dil,0x200a15(%rip)        # 601061 <data_cells+0x1>
    40064c:     0f 1f 40 00             nopl   0x0(%rax)
    400650:     40 0f b6 ff             movzbl %dil,%edi
    400654:     e8 87 fe ff ff          callq  4004e0 <putchar@plt>
    400659:     0f b6 05 01 0a 20 00    movzbl 0x200a01(%rip),%eax        # 601061 <data_cells+0x1>
    400660:     80 2d f9 09 20 00 01    subb   $0x1,0x2009f9(%rip)        # 601060 <data_cells>
    400667:     8d 78 01                lea    0x1(%rax),%edi
    40066a:     40 88 3d f0 09 20 00    mov    %dil,0x2009f0(%rip)        # 601061 <data_cells+0x1>
    400671:     75 dd                   jne    400650 <main+0x30>
    400673:     31 c0                   xor    %eax,%eax
    400675:     48 83 c4 08             add    $0x8,%rsp
    400679:     c3                      retq
    40067a:     66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

We also set up debugging information (via
:c:func:`gcc_jit_context_new_location` and
:c:macro:`GCC_JIT_BOOL_OPTION_DEBUGINFO`), so it's possible to use ``gdb``
to single-step through the generated binary and inspect the internal
state ``idx`` and ``data_cells``:

.. code-block:: console

  (gdb) break main
  Breakpoint 1 at 0x400790
  (gdb) run
  Starting program: a.out

  Breakpoint 1, 0x0000000000400790 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x0000000000400797 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x00000000004007a0 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13    >
  (gdb) n
  6     ++++++++++++++++++++++++++
  (gdb) n
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) p idx
  $1 = 1
  (gdb) p data_cells
  $2 = "\032", '\000' <repeats 29998 times>
  (gdb) p data_cells[0]
  $3 = 26 '\032'
  (gdb) p data_cells[1]
  $4 = 0 '\000'
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13    >


Other forms of ahead-of-time-compilation
****************************************

The above demonstrates compiling a :c:type:`gcc_jit_context *` directly
to an executable. It's also possible to compile it to an object file,
and to a dynamic library. See the documentation of
:c:func:`gcc_jit_context_compile_to_file` for more information.