The following text is a brief overview of those key
principles which are useful to know when generating code
with SLJIT. Further details can be found in sljitLir.h.

----------------------------------------------------------------
  What is SLJIT?
----------------------------------------------------------------

SLJIT is a platform-independent assembler which
  - provides access to common CPU features
  - can be easily ported to widespread CPU
    architectures (e.g. x86, ARM, POWER, MIPS, SPARC)

The key challenge of this project is finding a common
subset of CPU features which
  - covers traditional assembly level programming
  - can be translated to machine code efficiently

This aim is achieved by selecting those instructions / CPU
features which are either available on all platforms or
can be simulated with a low performance overhead.

For example, some SLJIT instructions support base register
pre-update when the [base+offs] memory addressing mode is
used. Although this feature is only available on ARM and
POWER CPUs, the simulation overhead is low on other CPUs.
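
The simulation mentioned above can be sketched in plain C.
This is only an illustrative model and not SLJIT code:
load_pre_update is a hypothetical helper, and the offset is
assumed to be counted in elements rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of SLJIT): models a load with
   base register pre-update on a CPU that lacks this addressing
   mode. The base is advanced first, then used for the load. */
static int32_t load_pre_update(const int32_t **base, ptrdiff_t offs)
{
    assert(base != NULL && *base != NULL);
    *base += offs;   /* one extra add: base = base + offs */
    return **base;   /* ordinary load through the updated base */
}
```

On a CPU without pre-update addressing, the access costs one
additional add before the load, which is why the overhead of
simulating the feature stays low.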

----------------------------------------------------------------
  The generic CPU model of SLJIT
----------------------------------------------------------------

The CPU has
  - integer registers, which can store either an
    int32_t (4 byte) or intptr_t (4 or 8 byte) value
  - floating point registers, which can store either a
    single (4 byte) or double (8 byte) precision value
  - boolean status flags

*** Integer registers:

The most important rule is: when a source operand of
an instruction is a register, the data type of the
register must match the data type expected by the
instruction.

For example, the following code snippet
is a valid instruction sequence:

    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An int32_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // The int32_t value in SLJIT_R0 is negated
    // and the type of the result is still int32_t

The next code snippet is not allowed:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // The result of the SLJIT_INEG instruction
    // is undefined. Even a crash is possible
    // (e.g. on MIPS-64).

However, it is always allowed to overwrite a
register regardless of its previous value:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R2), 0);
    // From now on SLJIT_R0 contains an int32_t
    // value. The previous value is discarded.

Type conversion instructions are provided to convert an
int32_t value to an intptr_t value and vice versa. On
certain architectures these conversions are nops (no
instructions are emitted).
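
What these conversions mean at the value level can be
sketched in plain C. The names widen_int32 and
narrow_to_int32 are hypothetical and not SLJIT functions,
and the narrowing helper assumes that the value fits into
an int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an int32_t -> intptr_t conversion.
   The value is sign extended when intptr_t is wider. */
static intptr_t widen_int32(int32_t value)
{
    return (intptr_t)value;
}

/* Hypothetical stand-in for an intptr_t -> int32_t conversion.
   Only values representable as int32_t are accepted here. */
static int32_t narrow_to_int32(intptr_t value)
{
    assert(value >= INT32_MIN && value <= INT32_MAX);
    return (int32_t)value;
}
```

Where intptr_t is 4 bytes wide, both helpers reduce to a
plain copy, which corresponds to the case where no
instructions need to be emitted.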

Memory accessing:

Register arguments of SLJIT_MEM1 / SLJIT_MEM2 addressing
modes must contain intptr_t data.

Signed / unsigned values:

Most operations are executed in the same way regardless of
whether the value is signed or unsigned. These operations
have only one instruction form (e.g. SLJIT_ADD / SLJIT_MUL).
Instructions where the result depends on the sign have
two forms (e.g. integer division, long multiply).
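
The difference can be illustrated in plain C (this is not
SLJIT code, and the function names are hypothetical):
addition produces the same bit pattern for signed and
unsigned inputs, while division of the same bit pattern
gives different results.

```c
#include <assert.h>
#include <stdint.h>

/* Addition on the raw bit pattern. The result bits are the same
   whether the operands are read as signed or unsigned values. */
static uint32_t add_bits(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Signed division: the operands are interpreted as int32_t. */
static int32_t div_signed(int32_t a, int32_t b)
{
    assert(b != 0 && !(a == INT32_MIN && b == -1));
    return a / b;
}

/* Unsigned division: the same bits are interpreted as uint32_t. */
static uint32_t div_unsigned(uint32_t a, uint32_t b)
{
    assert(b != 0);
    return a / b;
}
```

The bit pattern 0xFFFFFFFA is -6 as a signed value, so
div_signed(-6, 2) yields -3, while div_unsigned(0xFFFFFFFA, 2)
yields 0x7FFFFFFD. A single division form could not serve
both interpretations.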

*** Floating point registers:

Floating point registers can contain either a single
or a double precision value. Similar to integer registers,
the data type of the value stored in a source register
must match the data type expected by the instruction.
Otherwise the result is undefined (even a crash is possible).

Rounding:

Similar to standard C, floating point computation
results are rounded toward zero.

*** Boolean status flags:

Conditional branches usually depend on the value
of CPU status flags. These status flags are boolean
values and can be set by certain instructions.

To achieve maximum efficiency and portability, the
following rules were introduced:
  - Most instructions can freely modify these status
    flags unless SLJIT_KEEP_FLAGS is passed.
  - The SLJIT_KEEP_FLAGS option may have a performance
    overhead, so it should only be used when necessary.
  - The SLJIT_SET_E, SLJIT_SET_U, etc. options can
    force an instruction to correctly set the
    specified status flags. However, all other
    status flags are undefined. This rule must
    always be kept in mind!
  - Status flags cannot be controlled directly
    (there are no set/clear/invert operations)

The last two rules allow efficient mapping of status flags.
For example, the arithmetic overflow and multiply overflow
flags are mapped to the same overflow flag bit on x86. This
is allowed, since no instruction can set both of these flags.
When either of them is set by an instruction, the other can
have any value (this satisfies the "all other flags are
undefined" rule). Therefore mapping two SLJIT flags to the
same CPU flag is possible. Even though SLJIT supports
a dozen status flags, they can be efficiently mapped
to CPUs with only 4 status flags (e.g. ARM or SPARC).
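
The idea of sharing one CPU flag between two SLJIT flags
can be modelled with a small lookup table in plain C. All
names below are hypothetical and are not SLJIT identifiers.
The mapping of the equal and carry flags is an assumption
made only to fill the table.

```c
#include <assert.h>

/* Hypothetical logical flags, standing in for SLJIT flags. */
enum logical_flag {
    LF_EQUAL,
    LF_CARRY,
    LF_ARITH_OVERFLOW,
    LF_MUL_OVERFLOW,
    LF_COUNT
};

/* Hypothetical physical flags of the target CPU. */
enum cpu_flag { CF_ZERO, CF_CARRY, CF_OVERFLOW };

/* Returns the CPU flag which holds the given logical flag.
   Both overflow flags share CF_OVERFLOW. This is safe because
   no instruction sets both, and the flag which was not
   requested is undefined anyway. */
static enum cpu_flag map_flag(enum logical_flag flag)
{
    static const enum cpu_flag table[LF_COUNT] = {
        CF_ZERO,      /* LF_EQUAL (assumed mapping) */
        CF_CARRY,     /* LF_CARRY (assumed mapping) */
        CF_OVERFLOW,  /* LF_ARITH_OVERFLOW */
        CF_OVERFLOW   /* LF_MUL_OVERFLOW */
    };
    assert((int)flag >= 0 && (int)flag < LF_COUNT);
    return table[flag];
}
```

The table has more logical flags than the CPU has physical
ones, which is the same situation as mapping a dozen SLJIT
flags to a CPU with only 4 status flags.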

----------------------------------------------------------------
  Complex instructions
----------------------------------------------------------------

We noticed that introducing complex instructions for common
tasks can improve performance. For example, compare and
branch instruction sequences can be optimized if certain
conditions apply, but these conditions depend on the target
CPU. SLJIT can do these optimizations, but it needs to
understand the "purpose" of the generated code. Static
instruction analysis has a large performance overhead,
however, so we chose another approach: we introduced
complex instruction forms for certain non-atomic tasks.
SLJIT can optimize these "instructions" more efficiently
since the "purpose" is known to the compiler. These complex
instruction forms can often be assembled from other SLJIT
instructions, but we recommend using them since the
compiler can optimize them on certain CPUs.

----------------------------------------------------------------
  Generating functions
----------------------------------------------------------------

SLJIT is often used for generating function bodies which are
called from C. SLJIT provides two complex instructions for
generating function entry and return: sljit_emit_enter and
sljit_emit_return. sljit_emit_enter also initializes the
"compiling context", which specifies the current register
mapping, local space size, and similar settings.
sljit_set_context can also set this context without
emitting any machine instructions.

This context is important since it affects the compiler, so
the first instruction after a compiler is created must be
either sljit_emit_enter or sljit_set_context. The context can
be changed by calling sljit_emit_enter or sljit_set_context
again.

----------------------------------------------------------------
  All-in-one building
----------------------------------------------------------------

Instead of using a separate library, the whole SLJIT
compiler infrastructure can be directly included:

#define SLJIT_CONFIG_STATIC 1
#include "sljitLir.c"

This approach is useful for single file compilers.

Advantages:
  - Everything provided by SLJIT is available
    (no need to include anything else).
  - Configuring SLJIT is easy
    (e.g. redefining SLJIT_MALLOC / SLJIT_FREE).
  - The SLJIT compiler API is hidden from the
    world, which improves security.
  - The C compiler can optimize the SLJIT code
    generator (e.g. removing unused functions).

----------------------------------------------------------------
  Types and macros
----------------------------------------------------------------

The sljitConfig.h file contains the defines which control
the compiler. The beginning of sljitConfigInternal.h
lists architecture specific types and macros provided
by SLJIT. Some of these macros:

SLJIT_DEBUG : enabled by default
  Enables assertions. Should be disabled in release mode.

SLJIT_VERBOSE : enabled by default
  When this macro is enabled, the sljit_compiler_verbose
  function can be used to dump SLJIT instructions.
  Otherwise this function is not available. Should be
  disabled in release mode.

SLJIT_SINGLE_THREADED : disabled by default
  Single threaded programs can define this flag, which
  eliminates the pthread dependency.

sljit_sw, sljit_uw, etc. :
  It is recommended to use these types instead of long,
  intptr_t, etc. This improves the readability and
  portability of the code.
