1\DOC add_rule
2
3\BLTYPE
4add_rule :
5  {term_name : string, fixity : fixity,
6   pp_elements: term_grammar.pp_element list,
7   paren_style : term_grammar.ParenStyle,
8   block_style : term_grammar.PhraseBlockStyle *
9                 term_grammar.block_info}  -> unit
10\ELTYPE
11
12\SYNOPSIS
13Adds a parsing/printing rule to the global grammar.
14
15\KEYWORDS
16parsing
17
18\LIBRARY
19Parse
20
21\DESCRIBE
22The function {add_rule} is a fundamental method for adding parsing
23(and thus printing) rules to the global term grammar that sits behind
24the term-parsing function {Parse.Term}, and the pretty-printer
25installed for terms. It is used for everything except the addition of
26list-forms, for which refer to the entry for {add_listform}.
27
28There are five components in the record argument to {add_rule}.  The
29{term_name} component is the name of the term (whether a constant or a
30variable) that will be generated at the head of the function
31application.  Thus, the {term_name} component when specifying parsing
32for conditional expressions is {COND}.
33
34The following values (all in structure {Parse}) are useful for
35constructing fixity values:
36{
37   val LEFT       : HOLgrammars.associativity
38   val RIGHT      : HOLgrammars.associativity
39   val NONASSOC   : HOLgrammars.associativity
40
41   val Binder     : fixity
42   val Closefix   : fixity
43   val Infixl     : int -> fixity
44   val Infixr     : int -> fixity
45   val Infix      : HOLgrammars.associativity * int -> fixity
46   val Prefix     : int -> fixity
47   val Suffix     : int -> fixity
48}
49The {Binder} fixity is for binders such as universal and existential
50quantifiers ({!} and {?}).  Binders can actually be seen as (true)
51prefixes (should {`!x. p /\ q`} be parsed as {`(!x. p) /\ q`} or as
52{`!x. (p /\ q)`}?), but the {add_rule} interface only allows binders to
53be added at the one level (the weakest in the grammar).  Further, when
54binders are added using this interface, all elements of the record
55apart from the {term_name} are ignored, so the name of the binder must
56be the same as the string that is parsed and printed  (but see also
57restricted quantifiers: {associate_restriction}).
58
59The remaining fixities all cause {add_rule} to pay due heed to the
60{pp_elements} (``parsing/printing elements'') component of the
61record.  As far as parsing is concerned, the only important elements
62are {TOK} and {TM} values, of the following types:
63{
64   val TM  : term_grammar.pp_element
65   val TOK : string -> term_grammar.pp_element
66}
67The {TM} value corresponds to a ``hole'' where a sub-term is
68possible.  The {TOK} value corresponds to a piece of concrete syntax,
69a string that is required when parsing, and which will appear when
70printing.   The sequence of {pp_elements} specified in the record
71passed to {add_rule} specifies the ``kernel'' syntax of an operator in the
72grammar.  The ``kernel'' of a rule is extended (or not) by additional
73sub-terms depending on the fixity type, thus:
74{
75   Closefix    :      [Kernel]     (* no external arguments *)
76   Prefix      :      [Kernel] _   (* an argument to the right *)
77   Suffix      :    _ [Kernel]     (* an argument to the left *)
78   Infix       :    _ [Kernel] _   (* arguments on both sides *)
79}
80Thus simple infixes, suffixes and prefixes would have singleton
81{pp_element} lists, consisting of just the symbol desired.  More
82complicated mix-fix syntax can be constructed by identifying whether
83or not sub-term arguments exist beyond the kernel of concrete syntax.
84For example, syntax for the evaluation relation of an operational
85semantics ({ _ |- _ --> _ }) is an infix with a kernel delimited by
86{|-} and {-->} tokens.  Syntax for denotation brackets {[| _ |]} is a
87closefix with one internal argument in the kernel.
88
89The remaining sorts of possible {pp_element} values are concerned with
90pretty-printing.  (The basic scheme is implemented on top of a
91standard Oppen-style pretty-printing package.) They are
92{
93   (* where
94        type term_grammar.block_info = PP.break_style * int
95   *)
96   val BreakSpace : (int * int) -> term_grammar.pp_element
97   val HardSpace : int -> term_grammar.pp_element
98
99   val BeginFinalBlock : term_grammar.block_info -> term_grammar.pp_element
100   val EndInitialBlock : term_grammar.block_info -> term_grammar.pp_element
101   val PPBlock : term_grammar.pp_element list * term_grammar.block_info
102                 -> term_grammar.pp_element
103
104   val OnlyIfNecessary : term_grammar.ParenStyle
105   val ParoundName : term_grammar.ParenStyle
106   val ParoundPrec : term_grammar.ParenStyle
107   val Always : term_grammar.ParenStyle
108
109   val AroundEachPhrase : term_grammar.PhraseBlockStyle
110   val AroundSamePrec   : term_grammar.PhraseBlockStyle
111   val AroundSameName   : term_grammar.PhraseBlockStyle
112   val NoPhrasing       : term_grammar.PhraseBlockStyle
113}
114The two spacing values provide ways of specifying
115white-space should be added when terms are printed.  Use of
116{HardSpace n} results in {n} spaces being added to the term whatever
117the context.  On the other hand, {BreakSpace(m,n)} results in a break
118of width {m} spaces unless this makes the current line too wide, in
119which case a line-break will occur, and the next line will be indented
120an extra {n} spaces.
121
122For example, the {add_infix} function (q.v.) is implemented in terms
123of {add_rule} in such a way that a single token infix {s}, has a
124{pp_element} list of
125{
126   [HardSpace 1, TOK s, BreakSpace(1,0)]
127}
128
129This results in chains of infixes (such as those that occur
130with conjunctions) that break so as to leave the infix on the right
131hand side of the line.  Under this constraint, printing can't break so
132as to put the infix symbol on the start of a line, because that would
133imply that the {HardSpace} had in fact been broken.  (Consequently, if
134a change to this behaviour is desired, there is no global way of
135effecting it, but one can do it on an infix-by-infix basis by deleting
136the given rule (see, for example, {remove_termtok}) and then ``putting
137it back'' with different pretty-printing constraints.)
138
139The {PPBlock} function allows the specification of nested blocks
140(blocks in the Oppen pretty-printing sense) within the list of
141{pp_element}s.  Because there are sub-terms in all but the {Closefix}
142fixities that occur beyond the scope of the {pp_element} list, the
143{BeginFinalBlock} and {EndInitialBlock} functions can also be used to
144indicate the boundary of blocks whose outer extent is the term beyond
145the kernel represented by the {pp_element} list.  There is an example
146of this below.
147
148The possible {ParenStyle} values describe when parentheses should be
149added to terms.  The {OnlyIfNecessary} value will cause parentheses to
150be added only when required to disambiguate syntax.  The {ParoundName}
151will cause parentheses to be added if necessary, or where the head
152symbol has the given {term_name} and where this term is not the
153argument of a function with the same head name.  This style of
154parenthesisation is used with tuples, for example.  The {ParoundPrec}
155value is similar, but causes parentheses to be added when the term is
156the argument to a function with a different precedence level.
157Finally, the {Always} value causes parentheses always to be added.
158
159The {PhraseBlockStyle} values describe when pretty-printing blocks
160involving this term should be entered.  The {AroundEachPhrase} style
161causes a pretty-printing block to be created around each term.  This
162is not appropriate for operators such as conjunction however, where
163all of the arguments to the conjunctions in a list are more pleasingly
164thought of as being at the same level.  This effect is gained by
165specifying either {AroundSamePrec} or {AroundSameName}.  The former
166will cause the creation of a new block for the phrase if it is at a
167different precedence level from its parent, while the latter creates
168the block if the parent name is not the same.  The former is
169appropriate for {+} and {-} which are at the same precedence level,
170while the latter is appropriate for {/\}.  Finally, the {NoPhrasing}
171style causes there to be no block at all around terms controlled by
172this rule.  The intention in using such a style is to have block
173structure controlled by the level above.
174
175\FAILURE
176This function will fail if the {pp_element} list does not have {TOK}
177values at the beginning and the end of the list, or if there are two
178adjacent {TM} values in the list.  It will fail if the rule specifies
179a fixity with a precedence, and if that precedence level in the
180grammar is already taken by rules with a different sort of fixity.
181
182\EXAMPLE
183There are two conditional expression syntaxes defined in the theory
184{bool}.  The first is the traditional HOL88/90 syntax.  Because the
185syntax involves ``dangling'' terms to the left and right, it is
186an infix (and one of very weak precedence at that).
187{
188   val _ = add_rule{term_name = "COND",
189                    fixity = Infix (HOLgrammars.RIGHT, 3),
190                    pp_elements = [HardSpace 1, TOK "=>",
191                                   BreakSpace(1,0), TM,
192                                   BreakSpace(1,0), TOK "|",
193                                   HardSpace 1],
194                    paren_style = OnlyIfNecessary,
195                    block_style = (AroundEachPhrase,
196                                   (PP.INCONSISTENT, 0))};
197}
198The second rule added uses the more familiar {if-then-else}
199syntax.  Here there is only a ``dangling'' term to the right of the
200construction, so this rule's fixity is of type {Prefix}. (If the
201rule was made a {Closefix}, strings such as {`if P then Q else R`}
202would still parse, but so too would {`if P then Q else`}.)  This
203example also illustrates the use of blocks within rules to improve
204pretty-printing.
205{
206   val _ = add_rule{term_name = "COND", fixity = Prefix 70,
207                    pp_elements = [PPBlock([TOK "if", BreakSpace(1,2),
208                                            TM, BreakSpace(1,0),
209                                            TOK "then"], (PP.CONSISTENT, 0)),
210                                   BreakSpace(1,2), TM, BreakSpace(1,0),
211                                   BeginFinalBlock(PP.CONSISTENT, 2),
212                                   TOK "else", BreakSpace(1,0)],
213                    paren_style = OnlyIfNecessary,
214                    block_style = (AroundEachPhrase,
215                                   (PP.INCONSISTENT, 0))};
216}
217Note that the above form is not that actually used in the
218system.  As written, it allows for pretty-printing some expressions
219as:
220{
221   if P then
222      <very long term> else Q
223}
224because the {block_style} is {INCONSISTENT}.
225
226The pretty-printer prefers later rules over earlier rules by
227default (though this choice can be changed with {prefer_form_with_tok}
228(q.v.)), so conditional expressions print using the {if-then-else}
229syntax rather than the {_ => _ | _} syntax.
230
231\USES
232For making pretty concrete syntax possible.
233
234\COMMENTS
235Because adding new rules to the grammar may result in precedence
236conflicts in the operator-precedence matrix, it is as well with
237interactive use to test the {Term} parser immediately after adding a
238new rule, as it is only with this call that the precedence matrix is
239built.
240
241As with other functions in the {Parse} structure, there is a companion
242{temp_add_rule} function, which has the same effect on the global
243grammar, but which does not cause this effect to persist when the
244current theory is exported.
245
246An Isabelle-style concrete syntax for specifying rules would probably
247be desirable as it would conceal the complexity of the above from most
248users.
249
250\SEEALSO
251Parse.add_listform, Parse.add_infix, Parse.prefer_form_with_tok, Parse.remove_rules_for_term.
252\ENDDOC
253