Syntax.doc revision 605:98cb887364d3
#
# Copyright 1997-1998 Sun Microsystems, Inc.  All Rights Reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License version 2 only, as
# published by the Free Software Foundation.
#
# This code is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
# version 2 for more details (a copy is included in the LICENSE file that
# accompanied this code).
#
# You should have received a copy of the GNU General Public License version
# 2 along with this work; if not, write to the Free Software Foundation,
# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
#
# Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
# CA 95054 USA or visit www.sun.com if you need additional information or
# have any questions.
#
#

JavaSoft HotSpot Architecture Description Language Syntax Specification

Version 0.4 - September 19, 1997

A. Introduction

This document specifies the syntax and associated semantics for the JavaSoft
HotSpot Architecture Description Language.  This language is used to describe
the architecture of a processor, and is the input to the ADL Compiler.  The
ADL Compiler compiles an ADL file into code which is incorporated into the
Optimizing Just In Time Compiler (OJIT) to generate efficient and correct code
for the target architecture.  The ADL describes three basic types of
architectural features.  It describes the instruction set (and associated
operands) of the target architecture.  It describes the register set of the
target architecture along with relevant information for the register allocator.
Finally, it describes the architecture's pipeline for scheduling purposes.
The ADL is used to create an architecture description file for a target
architecture.  The architecture description file along with some additional
target specific oracles, written in C++, represent the principal effort in
porting the OJIT to a new target architecture.


B. Example Syntax

	1. Instruction/Operand Syntax for Matching and Encoding

// Create a cost attribute for all operands, and specify the default value
op_attrib  op_cost(10);

// Create a cost attribute for all instructions, and specify a default value
ins_attrib ins_cost(100);

// example operand form
operand x_reg(REG_NUM rnum)
%{
	constraint(IS_RCLASS(rnum, RC_X_REG));

	match(VREG) %{ rnum = new_VR(); %} // block after rule is constructor

	encode %{ return rnum; %}          // encode rules are required

	// this operand has no op_cost entry because it uses the default value
%}

// example instruction form
instruct add_accum_reg(x_reg dst, reg src)
%{
	match(SET dst (PLUS dst src)); // no block = use default constructor

	encode			       // rule is body of a C++ function
	%{
	    return pentium_encode_add_accum_reg(rnum);
	%}

	ins_cost(200);	// this instruction is more costly than the default
%}

	2. Register Set Description Syntax for Allocation

reg_def AX(SOE, 1); // declare register AX, mark it save on entry with index 1
reg_def BX(SOC);    // declare register BX, and mark it save on call

reg_class X_REG(AX, BX); // form a matcher register class of X_REG
                         // these are used for constraints, etc.

alloc_class class1(AX, BX); // form an allocation class of registers
                            // used by the register allocator for separate
                            // allocation of target register classes

	3. Pipeline Syntax for Scheduling


C. Keywords

1. Matching/Encoding
	a. instruct    - indicates a machine instruction entry
	b. operand     - indicates a machine operand entry
	c. opclass     - indicates a class of machine operands
	d. source      - indicates a block of C++ source code
	e. op_attrib   - indicates an optional attribute for all operands
	f. ins_attrib  - indicates an optional attribute for all instructions
	g. match       - indicates a matching rule for an operand/instruction
	h. encode      - indicates an encoding rule for an operand/instruction
	i. predicate   - indicates a predicate for matching operand/instruction
       *j. constraint  - indicates a constraint on the value of an operand
       *k. effect      - describes the dataflow effect of an operand which
			 is not part of a match rule
       *l. expand      - indicates a single instruction for matching which
			 expands to multiple instructions for output
       *m. rewrite     - indicates a single instruction for matching which
			 gets rewritten to one or more instructions after
			 allocation depending upon the registers allocated
       *n. format      - indicates a format rule for an operand/instruction
	o. construct   - indicates a default constructor for an operand

	[NOTE: * indicates a feature which will not be in first release ]

2. Register
       *a. register    - indicates an architecture register definition section
       *b. reg_def     - indicates a register declaration
       *c. reg_class   - indicates a class (list) of registers for matching
       *d. alloc_class - indicates a class (list) of registers for allocation

3. Pipeline
       *a. pipeline    - indicates an architecture pipeline definition section
       *b. resource    - indicates a pipeline resource used by instructions
       *c. pipe_desc   - indicates the relevant stages of the pipeline
       *d. pipe_class  - indicates a description of the pipeline behavior
			 for a group of instructions
       *e. ins_pipe    - indicates what pipeline class an instruction is in


D. Delimiters

	1. Comments
		a. /* ... */  (like C code)
		b. // ... EOL (like C++ code)
		c. Comments must be preceded by whitespace

	2. Blocks
		a. %{ ... %} (% just distinguishes from C++ syntax)
		b. There must be whitespace before and after block delimiters

	3. Terminators
		a. ;   (standard statement terminator)
		b. %}  (block terminator)
		c. EOF (file terminator)

	4. Each statement must start on a separate line

	5. Identifiers cannot contain: (){}%;,"/\
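
The identifier rule in item 5 can be checked mechanically. The following is
a minimal C++ sketch, not part of the ADL Compiler; the function name is
hypothetical, and rejecting empty names and embedded whitespace is an
assumption of the sketch rather than something this document states.

```cpp
#include <cassert>
#include <string>

// Returns true if `name` contains none of the characters that item D.5
// forbids in an ADL identifier.
// Assumption of this sketch: empty names and whitespace are also rejected.
inline bool is_valid_adl_identifier(const std::string& name) {
    static const std::string forbidden = "(){}%;,\"/\\";
    if (name.empty()) return false;
    for (char c : name) {
        if (forbidden.find(c) != std::string::npos) return false;
        if (c == ' ' || c == '\t' || c == '\n' || c == '\r') return false;
    }
    return true;
}
```

With this sketch, a name such as add_accum_reg is accepted, while x%reg and
a/b are rejected.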

E. Instruction Form: instruct instr1(oper1 dst, oper2 src) %{ ... %}

	1.  Identifier (scope of all instruction names is global in ADL file)
	2.  Operands
		a. Specified in argument style: (<type> <name>, ...)
		b. Type must be the name of an Operand Form
		c. Name is a locally scoped name, used for substitution, etc.
	3.  Match Rule: match(SET dst (PLUS dst src));
		a. Parenthesized Inorder Binary Tree: [lft = left; rgh = right]
		   (root root->lft (root->rgh (root->rgh->lft root->rgh->rgh)))
		b. Interior nodes in tree are operators (nodes) in abstract IR
		c. Leaves are operands from instruction operand list
		d. Assignment operation and destination, which are implicit
		   in the abstract IR, must be specified in the match rule.
	4.  Encode Rule: encode %{ return CONST; %}
		a. Block form must contain C++ code which constitutes the
		   body of a C++ function which takes no arguments, and
		   returns an integer.
		b. Local names (operand names) can be used as substitution
		   symbols in the code.
	5.  Attribute (ins_attrib): ins_cost(37);
		a. Identifier (must be name defined as instruction attribute)
		b. Argument must be a constant value or a C++ expression which
		   evaluates to a constant at compile time.
       *6.  Effect: effect(src, OP_KILL);
		a. Arguments must be the name of an operand and one of the
		   pre-defined effect type symbols:
		   OP_DEF, OP_USE, OP_KILL, OP_USE_DEF, OP_DEF_USE, OP_USE_KILL
       *7.  Expand:
		a. Parameters for the new instructions must be the name of
		   an expand rule temporary operand or must match the local
		   operand name in both the instruction being expanded and
		   the new instruction being generated.

		instruct convI2B( xRegI dst, eRegI src ) %{
		   match(Set dst (Conv2B src));

		   expand %{
		     eFlagsReg cr;
		     loadZero(dst);
		     testI_zero(cr,src);
		     set_nz(dst,cr);
		   %}
		%}
		// Move zero into a register without setting flags
		instruct loadZero(eRegI dst) %{
		  effect( DEF dst );
		  format %{ "MOV    $dst,0" %}
		  opcode(0xB8); // +rd
		  ins_encode( LdI0(dst) );
		%}

       *8.  Rewrite
	9.  Format: format(add_X $src, $dst); | format %{ ... %}
		a. Argument form takes a text string, possibly containing
		   some substitution symbols, which will be printed out
		   to the assembly language file.
		b. The block form takes valid C++ code which forms the body
		   of a function which takes no arguments, and returns a
		   pointer to a string to print out to the assembly file.

		Mentions of a literal register r in a or b must be of
		the form r_enc or r_num. The form r_enc refers to the
		encoding of the register in an instruction. The form
		r_num refers to the number of the register. While
		r_num is unique, two different registers may have the
		same r_enc, as, for example, an integer and a floating
		point register.
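
The match rule notation of item 3 writes each interior node as an operator
followed by at most two children, all inside one pair of parentheses. As an
illustration only, the C++ sketch below reads such text into a tree. The
MatchNode, MatchTreeReader, and read_match_tree names are hypothetical and
do not describe how the ADL Compiler itself parses match rules.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One node of a match tree: an IR operator (interior) or an operand (leaf).
struct MatchNode {
    std::string name;
    std::vector<std::unique_ptr<MatchNode>> children;  // empty for a leaf
};

// Hypothetical recursive-descent reader for "(OP child child)" text.
class MatchTreeReader {
public:
    explicit MatchTreeReader(std::string text) : text_(std::move(text)) {}

    std::unique_ptr<MatchNode> read() {
        std::unique_ptr<MatchNode> root = read_node();
        skip_space();
        if (pos_ != text_.size()) throw std::runtime_error("trailing input");
        return root;
    }

private:
    bool at_end() const { return pos_ >= text_.size(); }

    void skip_space() {
        while (!at_end() && std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }

    // Reads a run of characters up to whitespace or a parenthesis.
    std::string read_name() {
        skip_space();
        const std::size_t start = pos_;
        while (!at_end() && text_[pos_] != '(' && text_[pos_] != ')' &&
               !std::isspace(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
        if (start == pos_) throw std::runtime_error("expected a name");
        return text_.substr(start, pos_ - start);
    }

    std::unique_ptr<MatchNode> read_node() {
        skip_space();
        std::unique_ptr<MatchNode> node(new MatchNode());
        if (!at_end() && text_[pos_] == '(') {
            ++pos_;                        // consume '('
            node->name = read_name();      // operator comes first
            skip_space();
            while (!at_end() && text_[pos_] != ')') {
                node->children.push_back(read_node());
                skip_space();
            }
            if (at_end()) throw std::runtime_error("missing ')'");
            ++pos_;                        // consume ')'
            if (node->children.size() > 2) {
                throw std::runtime_error("match trees are binary");
            }
        } else {
            node->name = read_name();      // bare operand name
        }
        return node;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

inline std::unique_ptr<MatchNode> read_match_tree(const std::string& text) {
    return MatchTreeReader(text).read();
}
```

Reading "(SET dst (PLUS dst src))" with this sketch gives a SET root whose
left child is the leaf dst and whose right child is a PLUS node with the
leaves dst and src.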


F. Operand Form: operand x_reg(REG_T rall) %{ ... %}
	1.  Identifier (scope of all operand names is global in ADL file)
	2.  Components
		a. Specified in argument style: (<type> <name>, ...)
		b. Type must be a predefined Component Type
		c. Name is a locally scoped name, used for substitution, etc.
	3.  Match: (VREG)
		a. Parenthesized Inorder Binary Tree: [lft = left; rgh = right]
		   (root root->lft (root->rgh (root->rgh->lft root->rgh->rgh)))
		b. Interior nodes in tree are operators (nodes) in abstract IR
		c. Leaves are components from operand component list
		d. Block following tree is the body of a C++ function taking
		   no arguments and returning no value, which assigns values
		   to the components of the operand at match time.
	4.  Encode: encode %{ return CONST; %}
		a. Block form must contain C++ code which constitutes the
		   body of a C++ function which takes no arguments, and
		   returns an integer.
		b. Local names (operand names) can be used as substitution
		   symbols in the code.
	5.  Attribute (op_attrib): op_cost(5);
		a. Identifier (must be name defined as operand attribute)
		b. Argument must be a constant value or a C++ expression which
		   evaluates to a constant at compile time.
	6.  Predicate: predicate(0 <= src && src < 256);
		a. Argument must be a valid C++ expression which evaluates
		   to either TRUE or FALSE at run time.
       *7.  Constraint: constraint(IS_RCLASS(dst, RC_X_CLASS));
		a. Arguments must contain only predefined constraint
		   functions on values defined in the AD file.
		b. Multiple arguments can be chained together logically
		   with "&&".
	8.  Construct: construct %{ ... %}
		a. Block must be a valid C++ function body which takes no
		   arguments, and returns no values.
		b. Purpose of block is to assign values to the elements
		   of an operand which is constructed outside the matching
		   process.
		c. This block is logically identical to the constructor
		   block in a match rule.
	9.  Format: format(add_X $src, $dst); | format %{ ... %}
		a. Argument form takes a text string, possibly containing
		   some substitution symbols, which will be printed out
		   to the assembly language file.
		b. The block form takes valid C++ code which forms the body
		   of a function which takes no arguments, and returns a
		   pointer to a string to print out to the assembly file.

		Mentions of a literal register r in a or b must be of
		the form r_enc or r_num. The form r_enc refers to the
		encoding of the register in an instruction. The form
		r_num refers to the number of the register. While
		r_num is unique, two different registers may have the
		same r_enc, as, for example, an integer and a floating
		point register.
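
Because a predicate argument (item 6) is an ordinary C++ expression, a range
test has to be written as two comparisons joined with "&&". C++ groups a
chained comparison such as 0 <= src < 256 as (0 <= src) < 256, and the
inner result is 0 or 1, so the whole expression is always true. The sketch
below shows the difference; both function names are hypothetical.

```cpp
#include <cassert>

// What a chained "0 <= src < 256" means to a C++ compiler: the inner
// comparison is evaluated first and yields 0 or 1, and that value is then
// compared with 256. The cast only makes the grouping explicit.
inline bool chained_range_check(int src) {
    return static_cast<int>(0 <= src) < 256;
}

// The intended test: src lies in the half-open range [0, 256).
inline bool explicit_range_check(int src) {
    return 0 <= src && src < 256;
}
```

For src = 1000 the chained form returns true, while the explicit form
returns false.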

G. Operand Class Form: opclass memory( direct, indirect, ind_offset);

H. Attribute Form (keywords ins_attrib & op_attrib): ins_attrib ins_cost(10);
	1. Identifier (scope of all attribute names is global in ADL file)
	2. Argument must be a valid C++ expression which evaluates to a
	   constant at compile time, and specifies the default value of
	   this attribute if attribute definition is not included in an
	   operand/instruction.

I. Source Form: source %{ ... %}
	1. Source Block
		a. All source blocks are delimited by "%{" and "%}".
		b. All source blocks are copied verbatim into the
		   C++ output file, and must be valid C++ code.

		   Mentions of a literal register r in this code must
		   be of the form r_enc or r_num. The form r_enc
		   refers to the encoding of the register in an
		   instruction. The form r_num refers to the number of
		   the register. While r_num is unique, two different
		   registers may have the same r_enc, as, for example,
		   an integer and a floating point register.
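
The difference between r_num and r_enc can be pictured with a small table.
The C++ sketch below is illustrative only: the RegisterInfo type, the
register names, and every number and encoding value in it are made up for
the example and are not taken from any real architecture description.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical record: `num` plays the role of r_num, `enc` of r_enc.
struct RegisterInfo {
    std::string name;
    int num;  // unique across all registers
    int enc;  // bit pattern used in an instruction; may repeat
};

// Made-up example: an integer and a floating point register share encoding 0.
inline std::vector<RegisterInfo> example_registers() {
    std::vector<RegisterInfo> regs;
    regs.push_back(RegisterInfo{"AX", 0, 0});
    regs.push_back(RegisterInfo{"BX", 1, 3});
    regs.push_back(RegisterInfo{"F0", 8, 0});
    return regs;
}

// True if no two registers share a number.
inline bool numbers_are_unique(const std::vector<RegisterInfo>& regs) {
    std::set<int> seen;
    for (const RegisterInfo& r : regs) {
        if (!seen.insert(r.num).second) return false;
    }
    return true;
}

// Names of all registers that use the given encoding.
inline std::vector<std::string> names_with_encoding(
        const std::vector<RegisterInfo>& regs, int enc) {
    std::vector<std::string> names;
    for (const RegisterInfo& r : regs) {
        if (r.enc == enc) names.push_back(r.name);
    }
    return names;
}
```

In this made-up table every num is distinct, but encoding 0 maps back to
both AX and F0, which is why code that needs to identify a register
unambiguously would use the number rather than the encoding.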


J. *Register Form: register %{ ... %}
	1. Block contains architecture specific information for allocation
	2. Reg_def: reg_def reg_AX(1);
		a. Identifier is name by which register will be referenced
		   throughout the rest of the AD, and the allocator and
		   back-end.
		b. Argument is the Save on Entry index (where 0 means that
		   the register is Save on Call).  This is used by the
		   frame management routines for generating register saves
		   and restores.
	3. Reg_class: reg_class x_regs(reg_AX, reg_BX, reg_CX, reg_DX);
		a. Identifier is the name of the class used throughout the
		   instruction selector.
		b. Arguments are a list of register names in the class.
	4. Alloc_class: alloc_class x_alloc(reg_AX, reg_BX, reg_CX, reg_DX);
		a. Identifier is the name of the class used throughout the
		   register allocator.
		b. Arguments are a list of register names in the class.
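
The Save on Entry index of item 2 doubles as a flag: 0 marks a Save on Call
register, and any other value marks a Save on Entry register. The C++
sketch below models that reading. The RegDef type and both functions are
hypothetical, and ordering the saved registers by ascending index is an
assumption of the sketch, not something this document specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one reg_def entry.
struct RegDef {
    std::string name;
    int soe_index;  // 0 means Save on Call
};

inline bool is_save_on_call(const RegDef& reg) {
    return reg.soe_index == 0;
}

// Names of the Save on Entry registers, ordered by their index.
// Assumption: a save/restore sequence would follow ascending index order.
inline std::vector<std::string> save_on_entry_order(std::vector<RegDef> regs) {
    regs.erase(std::remove_if(regs.begin(), regs.end(), is_save_on_call),
               regs.end());
    std::sort(regs.begin(), regs.end(),
              [](const RegDef& a, const RegDef& b) {
                  return a.soe_index < b.soe_index;
              });
    std::vector<std::string> names;
    for (const RegDef& r : regs) names.push_back(r.name);
    return names;
}
```

Given reg_CX with index 2, reg_BX with index 0, and reg_AX with index 1,
the sketch treats reg_BX as Save on Call and lists reg_AX, then reg_CX.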


K. *Pipeline Form: pipeline %{ ... %}
	1. Block contains architecture specific information for scheduling
	2. Resource: resource(ialu1);
		a. Argument is the name of the resource.
	3. Pipe_desc: pipe_desc(Address, Access, Read, Execute);
		a. Arguments are names of relevant phases of the pipeline.
		b. If ALL instructions behave identically in a pipeline
		   phase, it does not need to be specified. (This is typically
		   true for pre-fetch, fetch, and decode pipeline stages.)
		c. There must be one name per cycle consumed in the
		   pipeline, even if there is no instruction which has
		   significant behavior in that stage (for instance, extra
		   stages inserted for load and store instructions which
		   are just cycles which pass waiting for the completion
		   of the memory operation).
	4. Pipe_class: pipe_class pipe_normal(dagen; ; membus; ialu1, ialu2);
		a. Identifier names the class for use in ins_pipe statements
		b. Arguments are a list of stages, separated by ;'s which
		   contain comma separated lists of resource names.
		c. There must be an entry for each stage defined in the
		   pipe_desc statement, (even if it is empty as in the example)
		   and entries are associated with stages by the order of
		   stage declarations in the pipe_desc statement.
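
The pipe_class argument rules of item 4 can be exercised with a short
reader: split on ';' to get one entry per stage, split each entry on ','
to get resource names, and require as many entries as pipe_desc declares
stages. The C++ sketch below does this. All function names are
hypothetical, and it is not the ADL Compiler's own parser.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Removes leading and trailing blanks, tabs, and newlines.
inline std::string trim(const std::string& s) {
    const char* blanks = " \t\r\n";
    const std::size_t first = s.find_first_not_of(blanks);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}

// Splits on `sep`, trimming each field and keeping empty fields.
inline std::vector<std::string> split_fields(const std::string& s, char sep) {
    std::vector<std::string> fields;
    std::string current;
    for (char c : s) {
        if (c == sep) {
            fields.push_back(trim(current));
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    fields.push_back(trim(current));
    return fields;
}

// Turns "dagen; ; membus; ialu1, ialu2" into one resource list per stage.
// Throws if the number of entries differs from the pipe_desc stage count.
inline std::vector<std::vector<std::string>> read_pipe_class_stages(
        const std::string& arguments, std::size_t stage_count) {
    const std::vector<std::string> entries = split_fields(arguments, ';');
    if (entries.size() != stage_count) {
        throw std::invalid_argument("one entry is required per pipe_desc stage");
    }
    std::vector<std::vector<std::string>> stages;
    for (const std::string& entry : entries) {
        if (entry.empty()) {
            stages.push_back(std::vector<std::string>());  // empty stage
        } else {
            stages.push_back(split_fields(entry, ','));
        }
    }
    return stages;
}
```

For the pipe_normal example and the four stages Address, Access, Read, and
Execute, this yields dagen for Address, nothing for Access, membus for
Read, and ialu1 and ialu2 for Execute.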