1\chapter{Functional Programming in HOL}
2
3This chapter describes how to write
4functional programs in HOL and how to verify them.  However, 
5most of the constructs and
6proof procedures introduced are general and recur in any specification
7or verification task.  We really should speak of functional
8\emph{modelling} rather than functional \emph{programming}: 
9our primary aim is not
10to write programs but to design abstract models of systems.  HOL is
11a specification language that goes well beyond what can be expressed as a
12program. However, for the time being we concentrate on the computable.
13
14If you are a purist functional programmer, please note that all functions
15in HOL must be total:
16they must terminate for all inputs.  Lazy data structures are not
17directly available.
18
19\section{An Introductory Theory}
20\label{sec:intro-theory}
21
22Functional programming needs datatypes and functions. Both of them can be
23defined in a theory with a syntax reminiscent of languages like ML or
24Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.
25We will now examine it line by line.
26
27\begin{figure}[htbp]
28\begin{ttbox}\makeatother
29\input{ToyList1.txt}\end{ttbox}
30\caption{A Theory of Lists}
31\label{fig:ToyList}
32\end{figure}
33
34\index{*ToyList example|(}
35{\makeatother\medskip\input{ToyList.tex}}
36
37The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The
38concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs}
39constitutes the complete theory \texttt{ToyList} and should reside in file
40\texttt{ToyList.thy}.
41% It is good practice to present all declarations and
42%definitions at the beginning of a theory to facilitate browsing.%
43\index{*ToyList example|)}
44
45\begin{figure}[htbp]
46\begin{ttbox}\makeatother
47\input{ToyList2.txt}\end{ttbox}
48\caption{Proofs about Lists}
49\label{fig:ToyList-proofs}
50\end{figure}
51
52\subsubsection*{Review}
53
54This is the end of our toy proof. It should have familiarized you with
55\begin{itemize}
56\item the standard theorem proving procedure:
57state a goal (lemma or theorem); proceed with proof until a separate lemma is
58required; prove that lemma; come back to the original goal.
59\item a specific procedure that works well for functional programs:
60induction followed by all-out simplification via \isa{auto}.
61\item a basic repertoire of proof commands.
62\end{itemize}
63
64\begin{warn}
65It is tempting to think that all lemmas should have the \isa{simp} attribute
66just because this was the case in the example above. However, in that example
67all lemmas were equations, and the right-hand side was simpler than the
68left-hand side --- an ideal situation for simplification purposes. Unless
69this is clearly the case, novices should refrain from awarding a lemma the
70\isa{simp} attribute, which has a global effect. Instead, lemmas can be
71applied locally where they are needed, which is discussed in the following
72chapter.
73\end{warn}
74
75\section{Some Helpful Commands}
76\label{sec:commands-and-hints}
77
78This section discusses a few basic commands for manipulating the proof state
79and can be skipped by casual readers.
80
81There are two kinds of commands used during a proof: the actual proof
82commands and auxiliary commands for examining the proof state and controlling
83the display. Simple proof commands are of the form
84\commdx{apply}(\textit{method}), where \textit{method} is typically 
85\isa{induct_tac} or \isa{auto}.  All such theorem proving operations
86are referred to as \bfindex{methods}, and further ones are
87introduced throughout the tutorial.  Unless stated otherwise, you may
88assume that a method attacks merely the first subgoal. An exception is
89\isa{auto}, which tries to solve all subgoals.
90
91The most useful auxiliary commands are as follows:
92\begin{description}
93\item[Modifying the order of subgoals:]
94\commdx{defer} moves the first subgoal to the end and
95\commdx{prefer}~$n$ moves subgoal $n$ to the front.
96\item[Printing theorems:]
97  \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$
98  prints the named theorems.
99\item[Reading terms and types:] \commdx{term}
100  \textit{string} reads, type-checks and prints the given string as a term in
101  the current context; the inferred type is output as well.
102  \commdx{typ} \textit{string} reads and prints the given
103  string as a type in the current context.
104\end{description}
105Further commands are found in the Isabelle/Isar Reference
106Manual~\cite{isabelle-isar-ref}.
107
108\begin{pgnote}
109Clicking on the \pgmenu{State} button redisplays the current proof state.
110This is helpful in case commands like \isacommand{thm} have overwritten it.
111\end{pgnote}
112
113We now examine Isabelle's functional programming constructs systematically,
114starting with inductive datatypes.
115
116
117\section{Datatypes}
118\label{sec:datatype}
119
120\index{datatypes|(}%
121Inductive datatypes are part of almost every non-trivial application of HOL.
122First we take another look at an important example, the datatype of
123lists, before we turn to datatypes in general. The section closes with a
124case study.
125
126
127\subsection{Lists}
128
129\index{*list (type)}%
130Lists are one of the essential datatypes in computing.  We expect that you
131are already familiar with their basic operations.
132Theory \isa{ToyList} is only a small fragment of HOL's predefined theory
133\thydx{List}\footnote{\url{https://isabelle.in.tum.de/library/HOL/List.html}}.
134The latter contains many further operations. For example, the functions
135\cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first
136element and the remainder of a list. (However, pattern matching is usually
137preferable to \isa{hd} and \isa{tl}.)  
138Also available are higher-order functions like \isa{map} and \isa{filter}.
139Theory \isa{List} also contains
140more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates
141$x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we
142always use HOL's predefined lists by building on theory \isa{Main}.
143\begin{warn}
144Looking ahead to sets and quanifiers in Part II:
145The best way to express that some element \isa{x} is in a list \isa{xs} is
146\isa{x $\in$ set xs}, where \isa{set} is a function that turns a list into the
147set of its elements.
148By the same device you can also write bounded quantifiers like
149\isa{$\forall$x $\in$ set xs} or embed lists in other set expressions.
150\end{warn}
151
152
153\subsection{The General Format}
154\label{sec:general-datatype}
155
156The general HOL \isacommand{datatype} definition is of the form
157\[
158\isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
159C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
160C@m~\tau@{m1}~\dots~\tau@{mk@m}
161\]
162where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct
163constructor names and $\tau@{ij}$ are types; it is customary to capitalize
164the first letter in constructor names. There are a number of
165restrictions (such as that the type should not be empty) detailed
166elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
167
168Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and
169\isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically
170during proofs by simplification.  The same is true for the equations in
171primitive recursive function definitions.
172
173Every\footnote{Except for advanced datatypes where the recursion involves
174``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$
175comes equipped with a \isa{size} function from $t$ into the natural numbers
176(see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\
177\isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}.  In general,
178\cdx{size} returns
179\begin{itemize}
180\item zero for all constructors that do not have an argument of type $t$,
181\item one plus the sum of the sizes of all arguments of type~$t$,
182for all other constructors.
183\end{itemize}
184Note that because
185\isa{size} is defined on every datatype, it is overloaded; on lists
186\isa{size} is also called \sdx{length}, which is not overloaded.
187Isabelle will always show \isa{size} on lists as \isa{length}.
188
189
190\subsection{Primitive Recursion}
191
192\index{recursion!primitive}%
193Functions on datatypes are usually defined by recursion. In fact, most of the
194time they are defined by what is called \textbf{primitive recursion} over some
195datatype $t$. This means that the recursion equations must be of the form
196\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
197such that $C$ is a constructor of $t$ and all recursive calls of
198$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
199Isabelle immediately sees that $f$ terminates because one (fixed!) argument
200becomes smaller with every recursive call. There must be at most one equation
201for each constructor.  Their order is immaterial.
202A more general method for defining total recursive functions is introduced in
203{\S}\ref{sec:fun}.
204
205\begin{exercise}\label{ex:Tree}
206\input{Tree.tex}%
207\end{exercise}
208
209\input{case_exprs.tex}
210
211\input{Ifexpr.tex}
212\index{datatypes|)}
213
214
215\section{Some Basic Types}
216
217This section introduces the types of natural numbers and ordered pairs.  Also
218described is type \isa{option}, which is useful for modelling exceptional
219cases. 
220
221\subsection{Natural Numbers}
222\label{sec:nat}\index{natural numbers}%
223\index{linear arithmetic|(}
224
225\input{fakenat.tex}\medskip
226\input{natsum.tex}
227
228\index{linear arithmetic|)}
229
230
231\subsection{Pairs}
232\input{pairs2.tex}
233
234\subsection{Datatype {\tt\slshape option}}
235\label{sec:option}
236\input{Option2.tex}
237
238\section{Definitions}
239\label{sec:Definitions}
240
241A definition is simply an abbreviation, i.e.\ a new name for an existing
242construction. In particular, definitions cannot be recursive. Isabelle offers
243definitions on the level of types and terms. Those on the type level are
244called \textbf{type synonyms}; those on the term level are simply called 
245definitions.
246
247
248\subsection{Type Synonyms}
249
250\index{type synonyms}%
251Type synonyms are similar to those found in ML\@. They are created by a 
252\commdx{type\protect\_synonym} command:
253
254\medskip
255\input{types.tex}
256
257\input{prime_def.tex}
258
259
260\section{The Definitional Approach}
261\label{sec:definitional}
262
263\index{Definitional Approach}%
264As we pointed out at the beginning of the chapter, asserting arbitrary
265axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order
266to avoid this danger, we advocate the definitional rather than
267the axiomatic approach: introduce new concepts by definitions. However,  Isabelle/HOL seems to
268support many richer definitional constructs, such as
269\isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each
270\isacommand{primrec} function definition is turned into a proper
271(nonrecursive!) definition from which the user-supplied recursion equations are
272automatically proved.  This process is
273hidden from the user, who does not have to understand the details.  Other commands described
274later, like \isacommand{fun} and \isacommand{inductive}, work similarly.  
275This strict adherence to the definitional approach reduces the risk of 
276soundness errors.
277
278\chapter{More Functional Programming}
279
280The purpose of this chapter is to deepen your understanding of the
281concepts encountered so far and to introduce advanced forms of datatypes and
282recursive functions. The first two sections give a structured presentation of
283theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss
284important heuristics for induction ({\S}\ref{sec:InductionHeuristics}).  You can
285skip them if you are not planning to perform proofs yourself.
286We then present a case
287study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced
288datatypes, including those involving function spaces, are covered in
289{\S}\ref{sec:advanced-datatypes}; it closes with another case study, search
290trees (``tries'').  Finally we introduce \isacommand{fun}, a general
291form of recursive function definition that goes well beyond 
292\isacommand{primrec} ({\S}\ref{sec:fun}).
293
294
295\section{Simplification}
296\label{sec:Simplification}
297\index{simplification|(}
298
299So far we have proved our theorems by \isa{auto}, which simplifies
300all subgoals. In fact, \isa{auto} can do much more than that. 
301To go beyond toy examples, you
302need to understand the ingredients of \isa{auto}.  This section covers the
303method that \isa{auto} always applies first, simplification.
304
305Simplification is one of the central theorem proving tools in Isabelle and
306many other systems. The tool itself is called the \textbf{simplifier}. 
307This section introduces the many features of the simplifier
308and is required reading if you intend to perform proofs.  Later on,
309{\S}\ref{sec:simplification-II} explains some more advanced features and a
310little bit of how the simplifier works. The serious student should read that
311section as well, in particular to understand why the simplifier did
312something unexpected.
313
314\subsection{What is Simplification?}
315
316In its most basic form, simplification means repeated application of
317equations from left to right. For example, taking the rules for \isa{\at}
318and applying them to the term \isa{[0,1] \at\ []} results in a sequence of
319simplification steps:
320\begin{ttbox}\makeatother
321(0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
322\end{ttbox}
323This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the
324equations are referred to as \bfindex{rewrite rules}.
325``Rewriting'' is more honest than ``simplification'' because the terms do not
326necessarily become simpler in the process.
327
328The simplifier proves arithmetic goals as described in
329{\S}\ref{sec:nat} above.  Arithmetic expressions are simplified using built-in
330procedures that go beyond mere rewrite rules.  New simplification procedures
331can be coded and installed, but they are definitely not a matter for this
332tutorial. 
333
334\input{simp.tex}
335
336\index{simplification|)}
337
338\input{Itrev.tex}
339\begin{exercise}
340\input{Plus.tex}%
341\end{exercise}
342\begin{exercise}
343\input{Tree2.tex}%
344\end{exercise}
345
346\input{CodeGen.tex}
347
348
349\section{Advanced Datatypes}
350\label{sec:advanced-datatypes}
351\index{datatype@\isacommand {datatype} (command)|(}
352\index{primrec@\isacommand {primrec} (command)|(}
353%|)
354
355This section presents advanced forms of datatypes: mutual and nested
356recursion.  A series of examples will culminate in a treatment of the trie
357data structure.
358
359
360\subsection{Mutual Recursion}
361\label{sec:datatype-mut-rec}
362
363\input{ABexpr.tex}
364
365\subsection{Nested Recursion}
366\label{sec:nested-datatype}
367
368{\makeatother\input{Nested.tex}}
369
370
371\subsection{The Limits of Nested Recursion}
372\label{sec:nested-fun-datatype}
373
374How far can we push nested recursion? By the unfolding argument above, we can
375reduce nested to mutual recursion provided the nested recursion only involves
376previously defined datatypes. This does not include functions:
377\begin{isabelle}
378\isacommand{datatype} t = C "t \isasymRightarrow\ bool"
379\end{isabelle}
380This declaration is a real can of worms.
381In HOL it must be ruled out because it requires a type
382\isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the
383same cardinality --- an impossibility. For the same reason it is not possible
384to allow recursion involving the type \isa{t set}, which is isomorphic to
385\isa{t \isasymFun\ bool}.
386
387Fortunately, a limited form of recursion
388involving function spaces is permitted: the recursive type may occur on the
389right of a function arrow, but never on the left. Hence the above can of worms
390is ruled out but the following example of a potentially 
391\index{infinitely branching trees}%
392infinitely branching tree is accepted:
393\smallskip
394
395\input{Fundata.tex}
396
397If you need nested recursion on the left of a function arrow, there are
398alternatives to pure HOL\@.  In the Logic for Computable Functions 
399(\rmindex{LCF}), types like
400\begin{isabelle}
401\isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"
402\end{isabelle}
403do indeed make sense~\cite{paulson87}.  Note the different arrow,
404\isa{\isasymrightarrow} instead of \isa{\isasymRightarrow},
405expressing the type of \emph{continuous} functions. 
406There is even a version of LCF on top of HOL,
407called \rmindex{HOLCF}~\cite{MuellerNvOS99}.
408\index{datatype@\isacommand {datatype} (command)|)}
409\index{primrec@\isacommand {primrec} (command)|)}
410
411
412\subsection{Case Study: Tries}
413\label{sec:Trie}
414
415\index{tries|(}%
416Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
417indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
418trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
419``cat''.  When searching a string in a trie, the letters of the string are
420examined sequentially. Each letter determines which subtrie to search next.
421In this case study we model tries as a datatype, define a lookup and an
422update function, and prove that they behave as expected.
423
424\begin{figure}[htbp]
425\begin{center}
426\unitlength1mm
427\begin{picture}(60,30)
428\put( 5, 0){\makebox(0,0)[b]{l}}
429\put(25, 0){\makebox(0,0)[b]{e}}
430\put(35, 0){\makebox(0,0)[b]{n}}
431\put(45, 0){\makebox(0,0)[b]{r}}
432\put(55, 0){\makebox(0,0)[b]{t}}
433%
434\put( 5, 9){\line(0,-1){5}}
435\put(25, 9){\line(0,-1){5}}
436\put(44, 9){\line(-3,-2){9}}
437\put(45, 9){\line(0,-1){5}}
438\put(46, 9){\line(3,-2){9}}
439%
440\put( 5,10){\makebox(0,0)[b]{l}}
441\put(15,10){\makebox(0,0)[b]{n}}
442\put(25,10){\makebox(0,0)[b]{p}}
443\put(45,10){\makebox(0,0)[b]{a}}
444%
445\put(14,19){\line(-3,-2){9}}
446\put(15,19){\line(0,-1){5}}
447\put(16,19){\line(3,-2){9}}
448\put(45,19){\line(0,-1){5}}
449%
450\put(15,20){\makebox(0,0)[b]{a}}
451\put(45,20){\makebox(0,0)[b]{c}}
452%
453\put(30,30){\line(-3,-2){13}}
454\put(30,30){\line(3,-2){13}}
455\end{picture}
456\end{center}
457\caption{A Sample Trie}
458\label{fig:trie}
459\end{figure}
460
461Proper tries associate some value with each string. Since the
462information is stored only in the final node associated with the string, many
463nodes do not carry any value. This distinction is modeled with the help
464of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).
465\input{Trie.tex}
466\index{tries|)}
467
468\section{Total Recursive Functions: \isacommand{fun}}
469\label{sec:fun}
470\index{fun@\isacommand {fun} (command)|(}\index{functions!total|(}
471
472Although many total functions have a natural primitive recursive definition,
473this is not always the case. Arbitrary total recursive functions can be
474defined by means of \isacommand{fun}: you can use full pattern matching,
475recursion need not involve datatypes, and termination is proved by showing
476that the arguments of all recursive calls are smaller in a suitable sense.
477In this section we restrict ourselves to functions where Isabelle can prove
478termination automatically. More advanced function definitions, including user
479supplied termination proofs, nested recursion and partiality, are discussed
480in a separate tutorial~\cite{isabelle-function}.
481
482\input{fun0.tex}
483
484\index{fun@\isacommand {fun} (command)|)}\index{functions!total|)}
485