1(*:maxLineLen=78:*) 2 3theory Syntax 4imports Base 5begin 6 7chapter \<open>Concrete syntax and type-checking\<close> 8 9text \<open> 10 Pure \<open>\<lambda>\<close>-calculus as introduced in \chref{ch:logic} is an adequate 11 foundation for logical languages --- in the tradition of \<^emph>\<open>higher-order 12 abstract syntax\<close> --- but end-users require additional means for reading and 13 printing of terms and types. This important add-on outside the logical core 14 is called \<^emph>\<open>inner syntax\<close> in Isabelle jargon, as opposed to the \<^emph>\<open>outer 15 syntax\<close> of the theory and proof language @{cite "isabelle-isar-ref"}. 16 17 For example, according to @{cite church40} quantifiers are represented as 18 higher-order constants \<open>All :: ('a \<Rightarrow> bool) \<Rightarrow> bool\<close> such that \<open>All (\<lambda>x::'a. B 19 x)\<close> faithfully represents the idea that is displayed in Isabelle as \<open>\<forall>x::'a. 20 B x\<close> via @{keyword "binder"} notation. Moreover, type-inference in the style 21 of Hindley-Milner @{cite hindleymilner} (and extensions) enables users to 22 write \<open>\<forall>x. B x\<close> concisely, when the type \<open>'a\<close> is already clear from the 23 context.\<^footnote>\<open>Type-inference taken to the extreme can easily confuse users. 24 Beginners often stumble over unexpectedly general types inferred by the 25 system.\<close> 26 27 \<^medskip> 28 The main inner syntax operations are \<^emph>\<open>read\<close> for parsing together with 29 type-checking, and \<^emph>\<open>pretty\<close> for formatted output. See also 30 \secref{sec:read-print}. 31 32 Furthermore, the input and output syntax layers are sub-divided into 33 separate phases for \<^emph>\<open>concrete syntax\<close> versus \<^emph>\<open>abstract syntax\<close>, see also 34 \secref{sec:parse-unparse} and \secref{sec:term-check}, respectively. This 35 results in the following decomposition of the main operations: 36 37 \<^item> \<open>read = parse; check\<close> 38 39 \<^item> \<open>pretty = uncheck; unparse\<close> 40 41 For example, some specification package might thus intercept syntax 42 processing at a well-defined stage after \<open>parse\<close>, to a augment the resulting 43 pre-term before full type-reconstruction is performed by \<open>check\<close>. Note that 44 the formal status of bound variables, versus free variables, versus 45 constants must not be changed between these phases. 46 47 \<^medskip> 48 In general, \<open>check\<close> and \<open>uncheck\<close> operate simultaneously on a list of terms. 49 This is particular important for type-checking, to reconstruct types for 50 several terms of the same context and scope. In contrast, \<open>parse\<close> and 51 \<open>unparse\<close> operate separately on single terms. 52 53 There are analogous operations to read and print types, with the same 54 sub-division into phases. 55\<close> 56 57 58section \<open>Reading and pretty printing \label{sec:read-print}\<close> 59 60text \<open> 61 Read and print operations are roughly dual to each other, such that for the 62 user \<open>s' = pretty (read s)\<close> looks similar to the original source text \<open>s\<close>, 63 but the details depend on many side-conditions. There are also explicit 64 options to control the removal of type information in the output. The 65 default configuration routinely looses information, so \<open>t' = read (pretty 66 t)\<close> might fail, or produce a differently typed term, or a completely 67 different term in the face of syntactic overloading. 68\<close> 69 70text %mlref \<open> 71 \begin{mldecls} 72 @{index_ML Syntax.read_typs: "Proof.context -> string list -> typ list"} \\ 73 @{index_ML Syntax.read_terms: "Proof.context -> string list -> term list"} \\ 74 @{index_ML Syntax.read_props: "Proof.context -> string list -> term list"} \\[0.5ex] 75 @{index_ML Syntax.read_typ: "Proof.context -> string -> typ"} \\ 76 @{index_ML Syntax.read_term: "Proof.context -> string -> term"} \\ 77 @{index_ML Syntax.read_prop: "Proof.context -> string -> term"} \\[0.5ex] 78 @{index_ML Syntax.pretty_typ: "Proof.context -> typ -> Pretty.T"} \\ 79 @{index_ML Syntax.pretty_term: "Proof.context -> term -> Pretty.T"} \\ 80 @{index_ML Syntax.string_of_typ: "Proof.context -> typ -> string"} \\ 81 @{index_ML Syntax.string_of_term: "Proof.context -> term -> string"} \\ 82 \end{mldecls} 83 84 \<^descr> @{ML Syntax.read_typs}~\<open>ctxt strs\<close> parses and checks a simultaneous list 85 of source strings as types of the logic. 86 87 \<^descr> @{ML Syntax.read_terms}~\<open>ctxt strs\<close> parses and checks a simultaneous list 88 of source strings as terms of the logic. Type-reconstruction puts all parsed 89 terms into the same scope: types of free variables ultimately need to 90 coincide. 91 92 If particular type-constraints are required for some of the arguments, the 93 read operations needs to be split into its parse and check phases. Then it 94 is possible to use @{ML Type.constraint} on the intermediate pre-terms 95 (\secref{sec:term-check}). 96 97 \<^descr> @{ML Syntax.read_props}~\<open>ctxt strs\<close> parses and checks a simultaneous list 98 of source strings as terms of the logic, with an implicit type-constraint 99 for each argument to enforce type @{typ prop}; this also affects the inner 100 syntax for parsing. The remaining type-reconstruction works as for @{ML 101 Syntax.read_terms}. 102 103 \<^descr> @{ML Syntax.read_typ}, @{ML Syntax.read_term}, @{ML Syntax.read_prop} are 104 like the simultaneous versions, but operate on a single argument only. This 105 convenient shorthand is adequate in situations where a single item in its 106 own scope is processed. Do not use @{ML "map o Syntax.read_term"} where @{ML 107 Syntax.read_terms} is actually intended! 108 109 \<^descr> @{ML Syntax.pretty_typ}~\<open>ctxt T\<close> and @{ML Syntax.pretty_term}~\<open>ctxt t\<close> 110 uncheck and pretty-print the given type or term, respectively. Although the 111 uncheck phase acts on a simultaneous list as well, this is rarely used in 112 practice, so only the singleton case is provided as combined pretty 113 operation. There is no distinction of term vs.\ proposition. 114 115 \<^descr> @{ML Syntax.string_of_typ} and @{ML Syntax.string_of_term} are convenient 116 compositions of @{ML Syntax.pretty_typ} and @{ML Syntax.pretty_term} with 117 @{ML Pretty.string_of} for output. The result may be concatenated with other 118 strings, as long as there is no further formatting and line-breaking 119 involved. 120 121 122 @{ML Syntax.read_term}, @{ML Syntax.read_prop}, and @{ML 123 Syntax.string_of_term} are the most important operations in practice. 124 125 \<^medskip> 126 Note that the string values that are passed in and out are annotated by the 127 system, to carry further markup that is relevant for the Prover IDE @{cite 128 "isabelle-jedit"}. User code should neither compose its own input strings, 129 nor try to analyze the output strings. Conceptually this is an abstract 130 datatype, encoded as concrete string for historical reasons. 131 132 The standard way to provide the required position markup for input works via 133 the outer syntax parser wrapper @{ML Parse.inner_syntax}, which is already 134 part of @{ML Parse.typ}, @{ML Parse.term}, @{ML Parse.prop}. So a string 135 obtained from one of the latter may be directly passed to the corresponding 136 read operation: this yields PIDE markup of the input and precise positions 137 for warning and error messages. 138\<close> 139 140 141section \<open>Parsing and unparsing \label{sec:parse-unparse}\<close> 142 143text \<open> 144 Parsing and unparsing converts between actual source text and a certain 145 \<^emph>\<open>pre-term\<close> format, where all bindings and scopes are already resolved 146 faithfully. Thus the names of free variables or constants are determined in 147 the sense of the logical context, but type information might be still 148 missing. Pre-terms support an explicit language of \<^emph>\<open>type constraints\<close> that 149 may be augmented by user code to guide the later \<^emph>\<open>check\<close> phase. 150 151 Actual parsing is based on traditional lexical analysis and Earley parsing 152 for arbitrary context-free grammars. The user can specify the grammar 153 declaratively via mixfix annotations. Moreover, there are \<^emph>\<open>syntax 154 translations\<close> that can be augmented by the user, either declaratively via 155 @{command translations} or programmatically via @{command 156 parse_translation}, @{command print_translation} @{cite 157 "isabelle-isar-ref"}. The final scope-resolution is performed by the system, 158 according to name spaces for types, term variables and constants determined 159 by the context. 160\<close> 161 162text %mlref \<open> 163 \begin{mldecls} 164 @{index_ML Syntax.parse_typ: "Proof.context -> string -> typ"} \\ 165 @{index_ML Syntax.parse_term: "Proof.context -> string -> term"} \\ 166 @{index_ML Syntax.parse_prop: "Proof.context -> string -> term"} \\[0.5ex] 167 @{index_ML Syntax.unparse_typ: "Proof.context -> typ -> Pretty.T"} \\ 168 @{index_ML Syntax.unparse_term: "Proof.context -> term -> Pretty.T"} \\ 169 \end{mldecls} 170 171 \<^descr> @{ML Syntax.parse_typ}~\<open>ctxt str\<close> parses a source string as pre-type that 172 is ready to be used with subsequent check operations. 173 174 \<^descr> @{ML Syntax.parse_term}~\<open>ctxt str\<close> parses a source string as pre-term that 175 is ready to be used with subsequent check operations. 176 177 \<^descr> @{ML Syntax.parse_prop}~\<open>ctxt str\<close> parses a source string as pre-term that 178 is ready to be used with subsequent check operations. The inner syntax 179 category is @{typ prop} and a suitable type-constraint is included to ensure 180 that this information is observed in subsequent type reconstruction. 181 182 \<^descr> @{ML Syntax.unparse_typ}~\<open>ctxt T\<close> unparses a type after uncheck 183 operations, to turn it into a pretty tree. 184 185 \<^descr> @{ML Syntax.unparse_term}~\<open>ctxt T\<close> unparses a term after uncheck 186 operations, to turn it into a pretty tree. There is no distinction for 187 propositions here. 188 189 190 These operations always operate on a single item; use the combinator @{ML 191 map} to apply them to a list. 192\<close> 193 194 195section \<open>Checking and unchecking \label{sec:term-check}\<close> 196 197text \<open> 198 These operations define the transition from pre-terms and fully-annotated 199 terms in the sense of the logical core (\chref{ch:logic}). 200 201 The \<^emph>\<open>check\<close> phase is meant to subsume a variety of mechanisms in the manner 202 of ``type-inference'' or ``type-reconstruction'' or ``type-improvement'', 203 not just type-checking in the narrow sense. The \<^emph>\<open>uncheck\<close> phase is roughly 204 dual, it prunes type-information before pretty printing. 205 206 A typical add-on for the check/uncheck syntax layer is the @{command 207 abbreviation} mechanism @{cite "isabelle-isar-ref"}. Here the user specifies 208 syntactic definitions that are managed by the system as polymorphic \<open>let\<close> 209 bindings. These are expanded during the \<open>check\<close> phase, and contracted during 210 the \<open>uncheck\<close> phase, without affecting the type-assignment of the given 211 terms. 212 213 \<^medskip> 214 The precise meaning of type checking depends on the context --- additional 215 check/uncheck modules might be defined in user space. 216 217 For example, the @{command class} command defines a context where \<open>check\<close> 218 treats certain type instances of overloaded constants according to the 219 ``dictionary construction'' of its logical foundation. This involves ``type 220 improvement'' (specialization of slightly too general types) and replacement 221 by certain locale parameters. See also @{cite "Haftmann-Wenzel:2009"}. 222\<close> 223 224text %mlref \<open> 225 \begin{mldecls} 226 @{index_ML Syntax.check_typs: "Proof.context -> typ list -> typ list"} \\ 227 @{index_ML Syntax.check_terms: "Proof.context -> term list -> term list"} \\ 228 @{index_ML Syntax.check_props: "Proof.context -> term list -> term list"} \\[0.5ex] 229 @{index_ML Syntax.uncheck_typs: "Proof.context -> typ list -> typ list"} \\ 230 @{index_ML Syntax.uncheck_terms: "Proof.context -> term list -> term list"} \\ 231 \end{mldecls} 232 233 \<^descr> @{ML Syntax.check_typs}~\<open>ctxt Ts\<close> checks a simultaneous list of pre-types 234 as types of the logic. Typically, this involves normalization of type 235 synonyms. 236 237 \<^descr> @{ML Syntax.check_terms}~\<open>ctxt ts\<close> checks a simultaneous list of pre-terms 238 as terms of the logic. Typically, this involves type-inference and 239 normalization term abbreviations. The types within the given terms are 240 treated in the same way as for @{ML Syntax.check_typs}. 241 242 Applications sometimes need to check several types and terms together. The 243 standard approach uses @{ML Logic.mk_type} to embed the language of types 244 into that of terms; all arguments are appended into one list of terms that 245 is checked; afterwards the type arguments are recovered with @{ML 246 Logic.dest_type}. 247 248 \<^descr> @{ML Syntax.check_props}~\<open>ctxt ts\<close> checks a simultaneous list of pre-terms 249 as terms of the logic, such that all terms are constrained by type @{typ 250 prop}. The remaining check operation works as @{ML Syntax.check_terms} 251 above. 252 253 \<^descr> @{ML Syntax.uncheck_typs}~\<open>ctxt Ts\<close> unchecks a simultaneous list of types 254 of the logic, in preparation of pretty printing. 255 256 \<^descr> @{ML Syntax.uncheck_terms}~\<open>ctxt ts\<close> unchecks a simultaneous list of terms 257 of the logic, in preparation of pretty printing. There is no distinction for 258 propositions here. 259 260 261 These operations always operate simultaneously on a list; use the combinator 262 @{ML singleton} to apply them to a single item. 263\<close> 264 265end 266