1[comment {-*- tcl -*- doctools manpage}]
2[manpage_begin grammar::me::cpu::gasm n 0.1]
3[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}]
4[moddesc   {Grammar operations and usage}]
5[titledesc {ME assembler}]
6[category  {Grammars and finite automata}]
7[require grammar::me::cpu::gasm [opt 0.1]]
8[description]
9[keywords {virtual machine}]
10[keywords parsing grammar assembler graph tree]
11
This package provides a simple in-memory assembler. It originated as
a support package for packages converting PEG and other grammars into
a corresponding matcher based on the ME virtual machine, like
[package page::compiler::peg::mecpu]. Despite this origin it is
mostly agnostic regarding the instructions, and users can choose any
instruction set they like.
18
19[para]
20
21The program under construction is held in a graph structure (See
22package [package struct::graph]) during assembly and subsequent
23manipulation, with instructions represented by nodes, and the flow of
24execution between instructions explicitly encoded in the arcs between
25them.
26
27[para]
28
In this model jumps are not encoded explicitly; they are implicit in
the arcs. The generation of explicit jumps is left to any code
converting the graph structure into a more conventional
representation. The same goes for branches. They are implicitly
encoded by all instructions which have two outgoing arcs, whereas all
other instructions have only one outgoing arc. Their conditionality is
handled by tagging their outgoing arcs with information about the
conditions under which they are taken.
37
38[para]
39
While the graph the assembler operates on is supplied from the
outside, the assembler does manage some internal state, namely:
42
43[list_begin enumerated]
44[enum] The handle of the graph node most assembler operations will
45work on, the [term anchor].
46
47[enum] A mapping from arbitrary strings to instructions. I.e. it is
48possible to [term label] an instruction during assembly, and later
49recall that instruction by its label.
50
51[enum] The condition code to use when creating arcs between
52instructions, which is one of [const always], [const ok], and
53[const fail].
54
55[enum] The current operation mode, one of [const halt],
56[const okfail], and [const !okfail].
57
58[enum] The name of a node in a tree. This, and the operation mode
59above are the parts most heavily influenced by the needs of a grammar
60compiler, as they assume some basic program structures (selected
61through the operation mode), and intertwine the graph with a tree,
62like the AST for the grammar to be compiled.
63
64[list_end]
65
66[section DEFINITIONS]
67
68As the graph the assembler is operating on, and the tree it is
69intertwined with, are supplied to the assembler from the outside it is
70necessary to specify the API expected from them, and to describe the
71structures expected and/or generated by the assembler in either.
72
73[para]
74
75[list_begin enumerated]
76
77[enum] Any graph object command used by the assembler has to provide
78the API as specified in the documentation for the package
79[package struct::graph].
80
81[enum] Any tree object command used by the assembler has to provide
82the API as specified in the documentation for the package
83[package struct::tree].
84
85[enum] Any instruction (node) generated by the assembler in a graph
86will have at least two, and at most three attributes:
87
88[list_begin definitions]
89
90[def [const instruction]] The value of this attribute is the name of
91the instruction. The only names currently defined by the assembler are
92the three pseudo-instructions
93
[comment {Fix the nroff backend so that it puts the proper . on the command names}]
95[list_begin definitions]
96
97[def [const NOP]] This instruction does nothing. Useful for fixed
98framework nodes, unchanging jump destinations, and the like. No
99arguments.
100
101[def [const C]] A .NOP to allow the insertion of arbitrary comments
102into the instruction stream, i.e. a comment node. One argument, the
103text of the comment.
104
105[def [const BRA]] A .NOP serving as explicitly coded conditional
106branch. No arguments.
107
108[list_end]
109
110However we reserve the space of all instructions whose names begin
111with a "." (dot) for future use by the assembler.
112
113[def [const arguments]] The value of this attribute is a list of
114strings, the arguments of the instruction. The contents are dependent
115on the actual instruction and the assembler doesn't know or care about
them. This means, for example, that it has no builtin knowledge about
which instructions need which arguments and thus doesn't perform any
type of checking.
119
120[def [const expr]] This attribute is optional. When it is present its
121value is the name of a node in the tree intertwined with the graph.
122
123[list_end]
124
125[enum] Any arc between two instructions will have one attribute:
126
127[list_begin definitions]
128
129[def [const condition]] The value of this attribute determines under which
130condition execution will take this arc. It is one of [const always],
131[const ok], and [const fail]. The first condition is used for all arcs
132which are the single outgoing arc of an instruction. The other two are
133used for the two outgoing arcs of an instruction which implicitly
134encode a branch.
135
136[list_end]
137
138[enum] A tree node given to the assembler for cross-referencing will
139be written to and given the following attributes, some fixed, some
140dependent on the operation mode. All values will be references to
nodes in the instruction graph. Some of the instructions will expect
some, or specific sets, of these attributes.
143
144[list_begin definitions]
145[def [const gas::entry]]      Always written.
146[def [const gas::exit]]       Written for all modes but [const okfail].
147[def [const gas::exit::ok]]   Written for mode [const okfail].
148[def [const gas::exit::fail]] Written for mode [const okfail].
149[list_end]
150
151[list_end]
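[para]

The structures described above can be inspected with nothing more than
the [package struct::graph] API. The following is a minimal sketch and
not part of the package: the helper name "dump_graph" and the
instruction name "my_test" are made up for illustration, and the graph
is built by hand purely to show the attribute layout.

[example {
package require struct::graph

# Hypothetical helper: return one line per instruction node, listing the
# instruction, its arguments, and the outgoing arcs with their conditions.
proc dump_graph {g} {
    set lines {}
    foreach n [lsort [$g nodes]] {
        set line "${n}: [$g node get $n instruction] [$g node get $n arguments]"
        foreach a [lsort [$g arcs -out $n]] {
            append line " -([$g arc get $a condition])-> [$g arc target $a]"
        }
        lappend lines $line
    }
    return $lines
}

# Build a three-node fragment by hand, following the attribute layout
# described above: one branching instruction with two tagged exits.
struct::graph g
foreach {n ins} {test my_test yes .NOP no .NOP} {
    g node insert $n
    g node set $n instruction $ins
    g node set $n arguments   {}
}
g arc set [g arc insert test yes] condition ok
g arc set [g arc insert test no]  condition fail

puts [join [dump_graph g] \n]
}]

The same helper should work unchanged on a graph filled by the
assembler, as it relies only on the attributes documented in this
section.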
152
153
154[section API]
155
156[list_begin definitions]
157
158[call [cmd ::grammar::me::cpu::gasm::begin] [arg g] [arg n] [opt [arg mode]] [opt [arg note]]]
159
160This command starts the assembly of an instruction sequence, and
161(re)initializes the state of the assembler. After completion of the
162instruction sequence use [cmd ::grammar::me::cpu::gasm::done] to
163finalize the assembler.
164
165[para]
166
167It will operate on the graph [arg g] in the specified [arg mode]
168(Default is [const okfail]). As part of the initialization it will
169always create a standard .NOP instruction and label it "entry". The
170creation of the remaining standard instructions is
171[arg mode]-dependent:
172
173[list_begin definitions]
174
175[def [const halt]] An "icf_halt" instruction labeled "exit/return".
176
177[def [const !okfail]] An "icf_ntreturn" instruction labeled "exit/return".
178
179[def [const okfail]] Two .NOP instructions labeled "exit/ok" and
180"exit/fail" respectively.
181
182[list_end]
183
The [arg note], if specified (by default it is not), is given to the
"entry" .NOP instruction.
185
186[para]
187
188The node reference [arg n] is simply stored for use by
189[cmd ::grammar::me::cpu::gasm::done]. It has to refer to a node in the
190tree [arg t] argument of that command.
191
192[para]
193
194After the initialization is done the "entry" instruction will be the
195[term anchor], and the condition code will be set to [const always].
196
197[para]
198
The command returns the empty string as its result.
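[para]

As an illustration, the following sketch starts a sequence in the
default mode and looks up the standard instructions through their
labels. It assumes that tcllib is installed. The variable "gasm" is
only a shorthand introduced for this example.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t

# Start a sequence in the default mode (okfail). "root" is the tree
# node which "done" will write to later.
${gasm}::begin g root

# The standard instructions can be recalled by their labels.
foreach label {entry exit/ok exit/fail} {
    set node [${gasm}::Who $label]
    puts "$label -> [g node get $node instruction]"
}

${gasm}::done --> t
}]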
200
201
202[call [cmd ::grammar::me::cpu::gasm::done] [const -->] [arg t]]
203
204This command finalizes the creation of an instruction sequence and
205then clears the state of the assembler.
206[emph NOTE] that this [emph {does not}] delete any of the created
207instructions. They can be made available to future begin/done cycles.
208Further assembly will be possible only after reinitialization of the
209system via [cmd ::grammar::me::cpu::gasm::begin].
210
211[para]
212
Before the state is cleared, references to selected instructions will
be written to attributes of the node [arg n] in the tree [arg t].

Which instructions are saved is [arg mode]-dependent. Both [arg mode]
and the destination node [arg n] were specified during the invocation
of [cmd ::grammar::me::cpu::gasm::begin].
220
221[para]
222
223Independent of the mode a reference to the instruction labeled "entry"
224will be saved to the attribute [const gas::entry] of [arg n]. The
225reference to the node [arg n] will further be saved into the attribute
"expr" of the "entry" instruction. Beyond that:
227
228[list_begin definitions]
229
[def [const halt]] A reference to the instruction labeled
"exit/return" will be saved to the attribute [const gas::exit] of
[arg n].

[def [const !okfail]] See [const halt].

[def [const okfail]] References to the two instructions labeled
"exit/ok" and "exit/fail" will be saved to the attributes
[const gas::exit::ok] and [const gas::exit::fail] of [arg n]
respectively.
240
241[list_end]
242
243[para]
244
The command returns the empty string as its result.
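[para]

The sketch below shows which tree attributes can be expected after
[cmd ::grammar::me::cpu::gasm::done], following the attribute table in
section DEFINITIONS: two exits for mode [const okfail], a single exit
otherwise. The tree node names "a" and "b" and the variable "gasm" are
assumptions made for this example.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t
t insert root end a b               ;# two tree nodes to write to

# Mode okfail: two separate exits.
${gasm}::begin g a okfail
${gasm}::done --> t

# Mode halt: a single exit.
${gasm}::begin g b halt
${gasm}::done --> t

foreach n {a b} {
    puts "${n}: [lsort [t keys $n]]"
}

# The entry instruction refers back to its tree node via "expr".
puts [g node get [t get a gas::entry] expr]
}]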
246
247
248[call [cmd ::grammar::me::cpu::gasm::state]]
249
250This command returns the current state of the assembler. Its format is
251not documented and considered to be internal to the package.
252
253
254[call [cmd ::grammar::me::cpu::gasm::state!] [arg s]]
255
256This command takes a serialized assembler state [arg s] as returned by
257[cmd ::grammar::me::cpu::gasm::state] and makes it the current state
258of the assembler.
259
260[para]
261
[emph Note] that this may overwrite label definitions. However, all
label definitions in the previous state which do not conflict with
[arg s] are left untouched and thus merged with [arg s].
265
266[para]
267
268The command returns the empty string as its result.
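[para]

One possible use of the two commands is to snapshot the assembler
after an instruction, continue down one path, and then rewind to the
snapshot to assemble a second path. The following is a sketch under
that assumption. The instruction names "my_test", "my_yes" and "my_no"
and the variable "gasm" are made up, and the saved state is treated as
an opaque value.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t

${gasm}::begin g root okfail
${gasm}::Cmd my_test                ;# hypothetical instruction
${gasm}::/Label test

# Snapshot: the anchor is my_test, the condition code is always.
set saved [${gasm}::state]

# First path, taken when my_test reports ok.
${gasm}::/Ok
${gasm}::Cmd my_yes

# Rewind to the snapshot, then assemble the failure path.
${gasm}::state! $saved
${gasm}::/Fail
${gasm}::Cmd my_no

set test [${gasm}::Who test]
foreach a [g arcs -out $test] {
    puts "[g arc get $a condition] -> [g node get [g arc target $a] instruction]"
}
${gasm}::done --> t
}]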
269
270
271[call [cmd ::grammar::me::cpu::gasm::lift] [arg t] [arg dst] [const =] [arg src]]
272
273This command operates on the tree [arg t]. It copies the contents of
274the attributes [const gas::entry], [const gas::exit::ok] and
275[const gas::exit::fail] from the node [arg src] to the node [arg dst].
276
277It returns the empty string as its result.
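[para]

A small sketch of the command, assuming two tree nodes "inner" and
"outer" (names chosen for this example) where "outer" simply reuses the
instruction sequence assembled for "inner". The source node has to
carry all three attributes, so the sequence is assembled in mode
[const okfail].

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t
t insert root end inner outer

# Assemble an (empty) okfail sequence for "inner".
${gasm}::begin g inner okfail
${gasm}::done --> t

# Let "outer" refer to the very same instructions.
${gasm}::lift t outer = inner

foreach key {gas::entry gas::exit::ok gas::exit::fail} {
    puts "$key: [t get outer $key]"
}
}]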
278
279
280[call [cmd ::grammar::me::cpu::gasm::Inline] [arg t] [arg node] [arg label]]
281
282This command links an instruction sequence created by an earlier
283begin/done pair into the current instruction sequence.
284
285[para]
286
287To this end it
288
289[list_begin enumerated]
290
[enum] reads the instruction references from the attributes
[const gas::entry], [const gas::exit::ok], and [const gas::exit::fail]
of the node [arg node] in the tree [arg t] and makes them available to
the assembler under the labels [arg label]/entry, [arg label]/exit/ok,
and [arg label]/exit/fail respectively.

[enum] creates an arc from the [term anchor] to the node labeled
[arg label]/entry, and tags it with the current condition code.

[enum] makes the node labeled [arg label]/exit/ok the new [term anchor].
301
302[list_end]
303
304The command returns the empty string as its result.
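[para]

The steps above can be sketched as follows. A fragment is assembled
for a tree node "child" first and then linked into the sequence of the
root node. The node name "child", the label prefix "sub", the
instruction names "my_match" and "my_next", and the variable "gasm" are
all assumptions made for this example.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t
t insert root end child

# Fragment for "child": one instruction with an ok and a fail exit.
${gasm}::begin g child okfail
${gasm}::Cmd my_match x             ;# hypothetical instruction
${gasm}::/Label m
${gasm}::/Ok   ; ${gasm}::Exit
${gasm}::/At m
${gasm}::/Fail ; ${gasm}::Exit
${gasm}::done --> t

# Sequence for the root node, with the child fragment linked in.
${gasm}::begin g root okfail
${gasm}::Inline t child sub
set sub_entry [${gasm}::Who sub/entry]

# The anchor is now the ok exit of the fragment, so this instruction
# runs only after the fragment succeeded.
${gasm}::Cmd my_next
${gasm}::/Ok ; ${gasm}::Exit

set root_entry [${gasm}::Who entry]
${gasm}::done --> t

puts [expr {$sub_entry eq [t get child gas::entry]}]
}]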
305
306
307[call [cmd ::grammar::me::cpu::gasm::Cmd] [arg cmd] [opt [arg arg]...]]
308
309This is the basic command to add instructions to the graph.
310
311It creates a new instruction of type [arg cmd] with the given
312arguments [arg arg]...
313
If the [term anchor] is defined it will also create an arc from the
[term anchor] to the new instruction using the current condition code.
316
317After the call the new instruction will be the [term anchor] and the
318current condition code will be set to [const always].
319
320[para]
321
322The command returns the empty string as its result.
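[para]

As an illustration, the sketch below assembles a short linear sequence
and then walks it from the entry instruction by following the single
outgoing arc of each node. The instruction names "my_load" and
"my_add" and the variable "gasm" are made up, as the assembler does not
interpret instruction names outside of its own pseudo-instructions.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t

${gasm}::begin g root halt
${gasm}::Cmd my_load a              ;# hypothetical instructions
${gasm}::Cmd my_add  b 1
${gasm}::Exit                       ;# link to the halt instruction
${gasm}::done --> t

# Walk the sequence, starting at the entry saved in the tree.
set seen {}
set node [t get root gas::entry]
while {1} {
    lappend seen [g node get $node instruction]
    set out [g arcs -out $node]
    if {![llength $out]} break
    set node [g arc target [lindex $out 0]]
}
puts $seen
}]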
323
324
325[call [cmd ::grammar::me::cpu::gasm::Bra]]
326
327This is a convenience command to create a .BRA pseudo-instruction. It
328uses [cmd ::grammar::me::cpu::gasm::Cmd] to actually create the
329instruction and inherits its behaviour.
330
331
332[call [cmd ::grammar::me::cpu::gasm::Nop] [arg text]]
333
334This is a convenience command to create a .NOP pseudo-instruction. It
335uses [cmd ::grammar::me::cpu::gasm::Cmd] to actually create the
336instruction and inherits its behaviour.
337
338The [arg text] will be saved as the first and only argument of the new
339instruction.
340
341
342[call [cmd ::grammar::me::cpu::gasm::Note] [arg text]]
343
344This is a convenience command to create a .C pseudo-instruction,
345i.e. a comment. It uses [cmd ::grammar::me::cpu::gasm::Cmd] to
346actually create the instruction and inherits its behaviour.
347
348The [arg text] will be saved as the first and only argument of the new
349instruction.
350
351
352[call [cmd ::grammar::me::cpu::gasm::Jmp] [arg label]]
353
This command creates an arc from the [term anchor] to the instruction
labeled with [arg label], and tags it with the current condition
code.
357
358[para]
359
360The command returns the empty string as its result.
361
362[call [cmd ::grammar::me::cpu::gasm::Exit]]
363
This command creates an arc from the [term anchor] to one of the exit
instructions, based on the operation mode (see
[cmd ::grammar::me::cpu::gasm::begin]), and tags it as described
below.

[para]

For mode [const okfail] it links to the instruction labeled either
"exit/ok" or "exit/fail", depending on the current condition code, and
tags the arc with the current condition code.

[para]

For the other two modes it links to the instruction labeled
"exit/return" and tags the arc with the condition code [const always],
independent of the current condition code.
378
379[para]
380
381The command returns the empty string as its result.
382
383
384[call [cmd ::grammar::me::cpu::gasm::Who] [arg label]]
385
386This command returns a reference to the instruction labeled with
387[arg label].
388
389
390[call [cmd ::grammar::me::cpu::gasm::/Label] [arg name]]
391
392This command labels the [term anchor] with [arg name].
393
394[emph Note] that an instruction can have more than one label.
395
396[para]
397
398The command returns the empty string as its result.
399
400
401[call [cmd ::grammar::me::cpu::gasm::/Clear]]
402
403This command clears the [term anchor], leaving it undefined, and
404further resets the current condition code to [const always].
405
406[para]
407
408The command returns the empty string as its result.
409
410
411[call [cmd ::grammar::me::cpu::gasm::/Ok]]
412
413This command sets the current condition code to [const ok].
414
415[para]
416
417The command returns the empty string as its result.
418
419
420[call [cmd ::grammar::me::cpu::gasm::/Fail]]
421
422This command sets the current condition code to [const fail].
423
424[para]
425
426The command returns the empty string as its result.
427
428
429[call [cmd ::grammar::me::cpu::gasm::/At] [arg name]]
430
431This command sets the [term anchor] to the instruction labeled with
432[arg name], and further resets the current condition code to
433[const always].
434
435[para]
436
437The command returns the empty string as its result.
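[para]

The anchor and condition commands are typically used together to
encode a branch. The sketch below gives one instruction two outgoing
arcs, one per outcome. The instruction name "my_test", the label
"test", and the variable "gasm" are assumptions of this example.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t

${gasm}::begin g root okfail
${gasm}::Cmd my_test                ;# hypothetical instruction
${gasm}::/Label test

# Success path: arc tagged ok, leading to the ok exit.
${gasm}::/Ok
${gasm}::Exit

# Return to the test (this resets the condition code to always),
# then add the failure path.
${gasm}::/At test
${gasm}::/Fail
${gasm}::Exit

set test [${gasm}::Who test]
foreach a [g arcs -out $test] {
    puts "[g arc get $a condition] -> [g arc target $a]"
}
${gasm}::done --> t
}]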
438
439[call [cmd ::grammar::me::cpu::gasm::/CloseLoop]]
440
441This command marks the [term anchor] as the last instruction in a loop
442body, by creating the attribute [const LOOP].
443
444[para]
445
446The command returns the empty string as its result.
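[para]

A loop can be sketched by labeling its first instruction, marking its
last one, and jumping back from there. In the following example the
instruction names "my_step" and "my_check", the labels "head" and
"check", and the variable "gasm" are made up. The loop repeats while
"my_check" reports ok and leaves through the fail exit otherwise.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

set gasm ::grammar::me::cpu::gasm   ;# shorthand used only in this sketch

struct::graph g
struct::tree  t

${gasm}::begin g root okfail
${gasm}::Cmd my_step                ;# hypothetical loop body
${gasm}::/Label head
${gasm}::Cmd my_check               ;# hypothetical loop condition
${gasm}::/Label check
${gasm}::/CloseLoop                 ;# mark the end of the loop body

${gasm}::/Ok   ; ${gasm}::Jmp head  ;# ok: run the body again
${gasm}::/Fail ; ${gasm}::Exit      ;# fail: leave the loop

set head  [${gasm}::Who head]
set check [${gasm}::Who check]
${gasm}::done --> t

puts [g node keyexists $check LOOP]
}]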
447
448
449[list_end]
450
451[section {BUGS, IDEAS, FEEDBACK}]
452
453This document, and the package it describes, will undoubtedly contain
454bugs and other problems.
455
456Please report such in the category [emph grammar_me] of the
457[uri {http://sourceforge.net/tracker/?group_id=12883} {Tcllib SF Trackers}].
458
459Please also report any ideas for enhancements you may have for either
460package and/or documentation.
461
462[manpage_end]
463