[comment {-*- tcl -*- doctools manpage}]
[manpage_begin grammar::me::cpu::gasm n 0.1]
[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}]
[moddesc {Grammar operations and usage}]
[titledesc {ME assembler}]
[category {Grammars and finite automata}]
[require grammar::me::cpu::gasm [opt 0.1]]
[description]
[keywords {virtual machine}]
[keywords parsing grammar assembler graph tree]

This package provides a simple in-memory assembler. Its origin is that
of a support package for use by packages converting PEG and other
grammars into a corresponding matcher based on the ME virtual machine,
like [package page::compiler::peg::mecpu]. Despite that origin it is
mostly agnostic regarding the instructions, so users can choose any
instruction set they like.

[para]

The program under construction is held in a graph structure (see
package [package struct::graph]) during assembly and subsequent
manipulation, with instructions represented by nodes, and the flow of
execution between instructions explicitly encoded in the arcs between
them.

[para]

In this model jumps are not encoded explicitly; they are implicit in
the arcs. The generation of explicit jumps is left to any code
converting the graph structure into a more conventional
representation. The same goes for branches. They are implicitly
encoded by all instructions which have two outgoing arcs, whereas all
other instructions have only one outgoing arc. Their conditionality is
handled by tagging their outgoing arcs with information about the
conditions under which they are taken.

[para]

While the graph the assembler operates on is supplied from the
outside, the assembler does manage some internal state, namely:

[list_begin enumerated]
[enum] The handle of the graph node most assembler operations will
work on, the [term anchor].

[enum] A mapping from arbitrary strings to instructions. I.e.
it is
possible to [term label] an instruction during assembly, and later
recall that instruction by its label.

[enum] The condition code to use when creating arcs between
instructions, which is one of [const always], [const ok], and
[const fail].

[enum] The current operation mode, one of [const halt],
[const okfail], and [const !okfail].

[enum] The name of a node in a tree. This, and the operation mode
above are the parts most heavily influenced by the needs of a grammar
compiler, as they assume some basic program structures (selected
through the operation mode), and intertwine the graph with a tree,
like the AST for the grammar to be compiled.

[list_end]

[section DEFINITIONS]

As the graph the assembler is operating on, and the tree it is
intertwined with, are supplied to the assembler from the outside, it is
necessary to specify the API expected from them, and to describe the
structures expected and/or generated by the assembler in either.

[para]

[list_begin enumerated]

[enum] Any graph object command used by the assembler has to provide
the API as specified in the documentation for the package
[package struct::graph].

[enum] Any tree object command used by the assembler has to provide
the API as specified in the documentation for the package
[package struct::tree].

[enum] Any instruction (node) generated by the assembler in a graph
will have at least two, and at most three attributes:

[list_begin definitions]

[def [const instruction]] The value of this attribute is the name of
the instruction. The only names currently defined by the assembler are
the three pseudo-instructions

[comment {Fix nroff backend so that it puts the proper . on the command name}]
[list_begin definitions]

[def [const NOP]] This instruction does nothing. Useful for fixed
framework nodes, unchanging jump destinations, and the like. No
arguments.

[def [const C]] A .NOP to allow the insertion of arbitrary comments
into the instruction stream, i.e. a comment node. One argument, the
text of the comment.

[def [const BRA]] A .NOP serving as explicitly coded conditional
branch. No arguments.

[list_end]

However we reserve the space of all instructions whose names begin
with a "." (dot) for future use by the assembler.

[def [const arguments]] The value of this attribute is a list of
strings, the arguments of the instruction. The contents are dependent
on the actual instruction and the assembler doesn't know or care about
them. This means for example that it has no builtin knowledge about
which instruction needs which arguments and thus doesn't perform any
type of checking.

[def [const expr]] This attribute is optional. When it is present its
value is the name of a node in the tree intertwined with the graph.

[list_end]

[enum] Any arc between two instructions will have one attribute:

[list_begin definitions]

[def [const condition]] The value of this attribute determines under which
condition execution will take this arc. It is one of [const always],
[const ok], and [const fail]. The first condition is used for all arcs
which are the single outgoing arc of an instruction. The other two are
used for the two outgoing arcs of an instruction which implicitly
encode a branch.

[list_end]

[enum] A tree node given to the assembler for cross-referencing will
be written to and given the following attributes, some fixed, some
dependent on the operation mode. All values will be references to
nodes in the instruction graph. Some of the instructions will expect
some, or specific sets, of these attributes.

[list_begin definitions]
[def [const gas::entry]] Always written.
[def [const gas::exit]] Written for all modes but [const okfail].
[def [const gas::exit::ok]] Written for mode [const okfail].
[def [const gas::exit::fail]] Written for mode [const okfail].
[list_end]

[list_end]


[section API]

[list_begin definitions]

[call [cmd ::grammar::me::cpu::gasm::begin] [arg g] [arg n] [opt [arg mode]] [opt [arg note]]]

This command starts the assembly of an instruction sequence, and
(re)initializes the state of the assembler. After completion of the
instruction sequence use [cmd ::grammar::me::cpu::gasm::done] to
finalize the assembler.

[para]

It will operate on the graph [arg g] in the specified [arg mode]
(Default is [const okfail]). As part of the initialization it will
always create a standard .NOP instruction and label it "entry". The
creation of the remaining standard instructions is
[arg mode]-dependent:

[list_begin definitions]

[def [const halt]] An "icf_halt" instruction labeled "exit/return".

[def [const !okfail]] An "icf_ntreturn" instruction labeled "exit/return".

[def [const okfail]] Two .NOP instructions labeled "exit/ok" and
"exit/fail" respectively.

[list_end]

The [arg note], if specified (by default there is none), is given to
the "entry" .NOP instruction.

[para]

The node reference [arg n] is simply stored for use by
[cmd ::grammar::me::cpu::gasm::done]. It has to refer to a node in the
tree [arg t] argument of that command.

[para]

After the initialization is done the "entry" instruction will be the
[term anchor], and the condition code will be set to [const always].

[para]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::done] [const -->] [arg t]]

This command finalizes the creation of an instruction sequence and
then clears the state of the assembler.
[emph NOTE] that this [emph {does not}] delete any of the created
instructions.
They can be made available to future begin/done cycles.
Further assembly will be possible only after reinitialization of the
system via [cmd ::grammar::me::cpu::gasm::begin].

[para]

Before the state is cleared, references to selected
instructions will be written to attributes of the node [arg n] in the
tree [arg t].

Which instructions are saved is [arg mode]-dependent. Both [arg mode]
and the destination node [arg n] were specified during invocation of
[cmd ::grammar::me::cpu::gasm::begin].

[para]

Independent of the mode a reference to the instruction labeled "entry"
will be saved to the attribute [const gas::entry] of [arg n]. The
reference to the node [arg n] will further be saved into the attribute
"expr" of the "entry" instruction. Beyond that

[list_begin definitions]

[def [const halt]] A reference to the instruction labeled
"exit/return" will be saved to the attribute [const gas::exit] of
[arg n].

[def [const !okfail]] See [const halt].

[def [const okfail]] References to the two instructions labeled
"exit/ok" and "exit/fail" will be saved to the attributes
[const gas::exit::ok] and [const gas::exit::fail] of [arg n]
respectively.

[list_end]

[para]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::state]]

This command returns the current state of the assembler. Its format is
not documented and considered to be internal to the package.


[call [cmd ::grammar::me::cpu::gasm::state!] [arg s]]

This command takes a serialized assembler state [arg s] as returned by
[cmd ::grammar::me::cpu::gasm::state] and makes it the current state
of the assembler.

[para]

[emph Note] that this may overwrite label definitions. However, all
label definitions of the previous state which do not conflict with
[arg s] are left untouched and merged with it.

[para]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::lift] [arg t] [arg dst] [const =] [arg src]]

This command operates on the tree [arg t]. It copies the contents of
the attributes [const gas::entry], [const gas::exit::ok] and
[const gas::exit::fail] from the node [arg src] to the node [arg dst].

It returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::Inline] [arg t] [arg node] [arg label]]

This command links an instruction sequence created by an earlier
begin/done pair into the current instruction sequence.

[para]

To this end it

[list_begin enumerated]

[enum] Reads the instruction references from the attributes
[const gas::entry], [const gas::exit::ok], and [const gas::exit::fail]
from the node [arg node] of the tree [arg t] and makes them available to
the assembler under the labels [arg label]/entry, [arg label]/exit::ok, and
[arg label]/exit::fail respectively.

[enum] Creates an arc from the [term anchor] to the node labeled
[arg label]/entry, and tags it with the current condition code.

[enum] Makes the node labeled [arg label]/exit/ok the new [term anchor].

[list_end]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::Cmd] [arg cmd] [opt [arg arg]...]]

This is the basic command to add instructions to the graph.

It creates a new instruction of type [arg cmd] with the given
arguments [arg arg]...

If the [term anchor] was defined it will also create an arc from the
[term anchor] to the new instruction using the current condition code.

After the call the new instruction will be the [term anchor] and the
current condition code will be set to [const always].

[para]

The command returns the empty string as its result.
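[para]

As an illustration, a complete begin/done cycle built around this
command might look like the following minimal sketch. It assumes that
Tcllib is installed; the object names [const g1] and [const t1], the
instruction name "my_instruction" and its arguments are hypothetical
choices made for this example only, and the root node of the tree is
used as the cross-reference node.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

# Graph holding the program, tree providing the cross-reference node.
# The names g1 and t1 are arbitrary choices made for this sketch.
struct::graph g1
struct::tree  t1

# Shorthand for the namespace of the assembler commands.
set gas ::grammar::me::cpu::gasm

# Start a sequence in mode "halt", cross-referenced to the tree root.
${gas}::begin g1 [t1 rootname] halt

# "my_instruction" and its arguments are hypothetical. The assembler
# does not check instruction names or arguments.
${gas}::Note {demo sequence}
${gas}::Cmd my_instruction a b
${gas}::Exit

# Write the references into the tree and clear the assembler state.
${gas}::done --> t1

# List the instructions now held in the graph.
foreach node [g1 nodes] {
    puts "[g1 node get $node instruction] [g1 node get $node arguments]"
}
}]

The order in which the nodes are listed is determined by the graph
object and is not necessarily the order of execution, which is encoded
in the arcs only.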


[call [cmd ::grammar::me::cpu::gasm::Bra]]

This is a convenience command to create a .BRA pseudo-instruction. It
uses [cmd ::grammar::me::cpu::gasm::Cmd] to actually create the
instruction and inherits its behaviour.


[call [cmd ::grammar::me::cpu::gasm::Nop] [arg text]]

This is a convenience command to create a .NOP pseudo-instruction. It
uses [cmd ::grammar::me::cpu::gasm::Cmd] to actually create the
instruction and inherits its behaviour.

The [arg text] will be saved as the first and only argument of the new
instruction.


[call [cmd ::grammar::me::cpu::gasm::Note] [arg text]]

This is a convenience command to create a .C pseudo-instruction,
i.e. a comment. It uses [cmd ::grammar::me::cpu::gasm::Cmd] to
actually create the instruction and inherits its behaviour.

The [arg text] will be saved as the first and only argument of the new
instruction.


[call [cmd ::grammar::me::cpu::gasm::Jmp] [arg label]]

This command creates an arc from the [term anchor] to the instruction
labeled with [arg label], and tags it with the current condition
code.

[para]

The command returns the empty string as its result.

[call [cmd ::grammar::me::cpu::gasm::Exit]]

This command creates an arc from the [term anchor] to one of the exit
instructions, based on the operation mode (see
[cmd ::grammar::me::cpu::gasm::begin]), and tags it with the current
condition code.

[para]

For mode [const okfail] it links to the instruction labeled either
"exit/ok" or "exit/fail", depending on the current condition code, and
tags the arc with that condition code.

For the other two modes it links to the instruction labeled
"exit/return", tagging the arc with the condition code [const always],
independent of the current condition code.

[para]

The command returns the empty string as its result.
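[para]

The interplay of labels, condition codes and exits can be sketched as
follows for mode [const okfail]. This is a minimal example under
assumptions, not a definitive recipe: the object names [const g2] and
[const t2], the instruction name "my_test" and the label "test" are
hypothetical, and Tcllib is assumed to be installed. The condition
code is set explicitly before each exit, as the exit taken in this
mode depends on it.

[example {
package require struct::graph
package require struct::tree
package require grammar::me::cpu::gasm

# The names g2 and t2 are arbitrary choices made for this sketch.
struct::graph g2
struct::tree  t2

# Shorthand for the namespace of the assembler commands.
set gas ::grammar::me::cpu::gasm

${gas}::begin g2 [t2 rootname] okfail

# A hypothetical instruction acting as the branch point.
${gas}::Cmd my_test
${gas}::/Label test

# Success: arc from the branch point to the "exit/ok" instruction.
${gas}::/Ok
${gas}::Exit

# Failure: return to the branch point, then link to "exit/fail".
${gas}::/At test
${gas}::/Fail
${gas}::Exit

${gas}::done --> t2

# Show each arc as "source -> target (condition)".
foreach arc [g2 arcs] {
    set src [g2 node get [g2 arc source $arc] instruction]
    set dst [g2 node get [g2 arc target $arc] instruction]
    puts "$src -> $dst ([g2 arc get $arc condition])"
}
}]

The branch is not a separate instruction here. It exists only as the
two outgoing arcs of "my_test", tagged [const ok] and [const fail].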


[call [cmd ::grammar::me::cpu::gasm::Who] [arg label]]

This command returns a reference to the instruction labeled with
[arg label].


[call [cmd ::grammar::me::cpu::gasm::/Label] [arg name]]

This command labels the [term anchor] with [arg name].

[emph Note] that an instruction can have more than one label.

[para]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::/Clear]]

This command clears the [term anchor], leaving it undefined, and
further resets the current condition code to [const always].

[para]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::/Ok]]

This command sets the current condition code to [const ok].

[para]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::/Fail]]

This command sets the current condition code to [const fail].

[para]

The command returns the empty string as its result.


[call [cmd ::grammar::me::cpu::gasm::/At] [arg name]]

This command sets the [term anchor] to the instruction labeled with
[arg name], and further resets the current condition code to
[const always].

[para]

The command returns the empty string as its result.

[call [cmd ::grammar::me::cpu::gasm::/CloseLoop]]

This command marks the [term anchor] as the last instruction in a loop
body, by creating the attribute [const LOOP].

[para]

The command returns the empty string as its result.


[list_end]

[section {BUGS, IDEAS, FEEDBACK}]

This document, and the package it describes, will undoubtedly contain
bugs and other problems.

Please report such in the category [emph grammar_me] of the
[uri {http://sourceforge.net/tracker/?group_id=12883} {Tcllib SF Trackers}].

Please also report any ideas for enhancements you may have for either
package and/or documentation.

[manpage_end]