1\chapter{Functional Programming in HOL} 2 3This chapter describes how to write 4functional programs in HOL and how to verify them. However, 5most of the constructs and 6proof procedures introduced are general and recur in any specification 7or verification task. We really should speak of functional 8\emph{modelling} rather than functional \emph{programming}: 9our primary aim is not 10to write programs but to design abstract models of systems. HOL is 11a specification language that goes well beyond what can be expressed as a 12program. However, for the time being we concentrate on the computable. 13 14If you are a purist functional programmer, please note that all functions 15in HOL must be total: 16they must terminate for all inputs. Lazy data structures are not 17directly available. 18 19\section{An Introductory Theory} 20\label{sec:intro-theory} 21 22Functional programming needs datatypes and functions. Both of them can be 23defined in a theory with a syntax reminiscent of languages like ML or 24Haskell. As an example consider the theory in figure~\ref{fig:ToyList}. 25We will now examine it line by line. 26 27\begin{figure}[htbp] 28\begin{ttbox}\makeatother 29\input{ToyList1.txt}\end{ttbox} 30\caption{A Theory of Lists} 31\label{fig:ToyList} 32\end{figure} 33 34\index{*ToyList example|(} 35{\makeatother\medskip\input{ToyList.tex}} 36 37The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The 38concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs} 39constitutes the complete theory \texttt{ToyList} and should reside in file 40\texttt{ToyList.thy}. 41% It is good practice to present all declarations and 42%definitions at the beginning of a theory to facilitate browsing.% 43\index{*ToyList example|)} 44 45\begin{figure}[htbp] 46\begin{ttbox}\makeatother 47\input{ToyList2.txt}\end{ttbox} 48\caption{Proofs about Lists} 49\label{fig:ToyList-proofs} 50\end{figure} 51 52\subsubsection*{Review} 53 54This is the end of our toy proof. It should have familiarized you with 55\begin{itemize} 56\item the standard theorem proving procedure: 57state a goal (lemma or theorem); proceed with proof until a separate lemma is 58required; prove that lemma; come back to the original goal. 59\item a specific procedure that works well for functional programs: 60induction followed by all-out simplification via \isa{auto}. 61\item a basic repertoire of proof commands. 62\end{itemize} 63 64\begin{warn} 65It is tempting to think that all lemmas should have the \isa{simp} attribute 66just because this was the case in the example above. However, in that example 67all lemmas were equations, and the right-hand side was simpler than the 68left-hand side --- an ideal situation for simplification purposes. Unless 69this is clearly the case, novices should refrain from awarding a lemma the 70\isa{simp} attribute, which has a global effect. Instead, lemmas can be 71applied locally where they are needed, which is discussed in the following 72chapter. 73\end{warn} 74 75\section{Some Helpful Commands} 76\label{sec:commands-and-hints} 77 78This section discusses a few basic commands for manipulating the proof state 79and can be skipped by casual readers. 80 81There are two kinds of commands used during a proof: the actual proof 82commands and auxiliary commands for examining the proof state and controlling 83the display. Simple proof commands are of the form 84\commdx{apply}(\textit{method}), where \textit{method} is typically 85\isa{induct_tac} or \isa{auto}. All such theorem proving operations 86are referred to as \bfindex{methods}, and further ones are 87introduced throughout the tutorial. Unless stated otherwise, you may 88assume that a method attacks merely the first subgoal. An exception is 89\isa{auto}, which tries to solve all subgoals. 90 91The most useful auxiliary commands are as follows: 92\begin{description} 93\item[Modifying the order of subgoals:] 94\commdx{defer} moves the first subgoal to the end and 95\commdx{prefer}~$n$ moves subgoal $n$ to the front. 96\item[Printing theorems:] 97 \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$ 98 prints the named theorems. 99\item[Reading terms and types:] \commdx{term} 100 \textit{string} reads, type-checks and prints the given string as a term in 101 the current context; the inferred type is output as well. 102 \commdx{typ} \textit{string} reads and prints the given 103 string as a type in the current context. 104\end{description} 105Further commands are found in the Isabelle/Isar Reference 106Manual~\cite{isabelle-isar-ref}. 107 108\begin{pgnote} 109Clicking on the \pgmenu{State} button redisplays the current proof state. 110This is helpful in case commands like \isacommand{thm} have overwritten it. 111\end{pgnote} 112 113We now examine Isabelle's functional programming constructs systematically, 114starting with inductive datatypes. 115 116 117\section{Datatypes} 118\label{sec:datatype} 119 120\index{datatypes|(}% 121Inductive datatypes are part of almost every non-trivial application of HOL. 122First we take another look at an important example, the datatype of 123lists, before we turn to datatypes in general. The section closes with a 124case study. 125 126 127\subsection{Lists} 128 129\index{*list (type)}% 130Lists are one of the essential datatypes in computing. We expect that you 131are already familiar with their basic operations. 132Theory \isa{ToyList} is only a small fragment of HOL's predefined theory 133\thydx{List}\footnote{\url{https://isabelle.in.tum.de/library/HOL/List.html}}. 134The latter contains many further operations. For example, the functions 135\cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first 136element and the remainder of a list. (However, pattern matching is usually 137preferable to \isa{hd} and \isa{tl}.) 138Also available are higher-order functions like \isa{map} and \isa{filter}. 139Theory \isa{List} also contains 140more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates 141$x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we 142always use HOL's predefined lists by building on theory \isa{Main}. 143\begin{warn} 144Looking ahead to sets and quanifiers in Part II: 145The best way to express that some element \isa{x} is in a list \isa{xs} is 146\isa{x $\in$ set xs}, where \isa{set} is a function that turns a list into the 147set of its elements. 148By the same device you can also write bounded quantifiers like 149\isa{$\forall$x $\in$ set xs} or embed lists in other set expressions. 150\end{warn} 151 152 153\subsection{The General Format} 154\label{sec:general-datatype} 155 156The general HOL \isacommand{datatype} definition is of the form 157\[ 158\isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~ 159C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~ 160C@m~\tau@{m1}~\dots~\tau@{mk@m} 161\] 162where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct 163constructor names and $\tau@{ij}$ are types; it is customary to capitalize 164the first letter in constructor names. There are a number of 165restrictions (such as that the type should not be empty) detailed 166elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them. 167 168Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and 169\isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically 170during proofs by simplification. The same is true for the equations in 171primitive recursive function definitions. 172 173Every\footnote{Except for advanced datatypes where the recursion involves 174``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$ 175comes equipped with a \isa{size} function from $t$ into the natural numbers 176(see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\ 177\isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}. In general, 178\cdx{size} returns 179\begin{itemize} 180\item zero for all constructors that do not have an argument of type $t$, 181\item one plus the sum of the sizes of all arguments of type~$t$, 182for all other constructors. 183\end{itemize} 184Note that because 185\isa{size} is defined on every datatype, it is overloaded; on lists 186\isa{size} is also called \sdx{length}, which is not overloaded. 187Isabelle will always show \isa{size} on lists as \isa{length}. 188 189 190\subsection{Primitive Recursion} 191 192\index{recursion!primitive}% 193Functions on datatypes are usually defined by recursion. In fact, most of the 194time they are defined by what is called \textbf{primitive recursion} over some 195datatype $t$. This means that the recursion equations must be of the form 196\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \] 197such that $C$ is a constructor of $t$ and all recursive calls of 198$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus 199Isabelle immediately sees that $f$ terminates because one (fixed!) argument 200becomes smaller with every recursive call. There must be at most one equation 201for each constructor. Their order is immaterial. 202A more general method for defining total recursive functions is introduced in 203{\S}\ref{sec:fun}. 204 205\begin{exercise}\label{ex:Tree} 206\input{Tree.tex}% 207\end{exercise} 208 209\input{case_exprs.tex} 210 211\input{Ifexpr.tex} 212\index{datatypes|)} 213 214 215\section{Some Basic Types} 216 217This section introduces the types of natural numbers and ordered pairs. Also 218described is type \isa{option}, which is useful for modelling exceptional 219cases. 220 221\subsection{Natural Numbers} 222\label{sec:nat}\index{natural numbers}% 223\index{linear arithmetic|(} 224 225\input{fakenat.tex}\medskip 226\input{natsum.tex} 227 228\index{linear arithmetic|)} 229 230 231\subsection{Pairs} 232\input{pairs2.tex} 233 234\subsection{Datatype {\tt\slshape option}} 235\label{sec:option} 236\input{Option2.tex} 237 238\section{Definitions} 239\label{sec:Definitions} 240 241A definition is simply an abbreviation, i.e.\ a new name for an existing 242construction. In particular, definitions cannot be recursive. Isabelle offers 243definitions on the level of types and terms. Those on the type level are 244called \textbf{type synonyms}; those on the term level are simply called 245definitions. 246 247 248\subsection{Type Synonyms} 249 250\index{type synonyms}% 251Type synonyms are similar to those found in ML\@. They are created by a 252\commdx{type\protect\_synonym} command: 253 254\medskip 255\input{types.tex} 256 257\input{prime_def.tex} 258 259 260\section{The Definitional Approach} 261\label{sec:definitional} 262 263\index{Definitional Approach}% 264As we pointed out at the beginning of the chapter, asserting arbitrary 265axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order 266to avoid this danger, we advocate the definitional rather than 267the axiomatic approach: introduce new concepts by definitions. However, Isabelle/HOL seems to 268support many richer definitional constructs, such as 269\isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each 270\isacommand{primrec} function definition is turned into a proper 271(nonrecursive!) definition from which the user-supplied recursion equations are 272automatically proved. This process is 273hidden from the user, who does not have to understand the details. Other commands described 274later, like \isacommand{fun} and \isacommand{inductive}, work similarly. 275This strict adherence to the definitional approach reduces the risk of 276soundness errors. 277 278\chapter{More Functional Programming} 279 280The purpose of this chapter is to deepen your understanding of the 281concepts encountered so far and to introduce advanced forms of datatypes and 282recursive functions. The first two sections give a structured presentation of 283theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss 284important heuristics for induction ({\S}\ref{sec:InductionHeuristics}). You can 285skip them if you are not planning to perform proofs yourself. 286We then present a case 287study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced 288datatypes, including those involving function spaces, are covered in 289{\S}\ref{sec:advanced-datatypes}; it closes with another case study, search 290trees (``tries''). Finally we introduce \isacommand{fun}, a general 291form of recursive function definition that goes well beyond 292\isacommand{primrec} ({\S}\ref{sec:fun}). 293 294 295\section{Simplification} 296\label{sec:Simplification} 297\index{simplification|(} 298 299So far we have proved our theorems by \isa{auto}, which simplifies 300all subgoals. In fact, \isa{auto} can do much more than that. 301To go beyond toy examples, you 302need to understand the ingredients of \isa{auto}. This section covers the 303method that \isa{auto} always applies first, simplification. 304 305Simplification is one of the central theorem proving tools in Isabelle and 306many other systems. The tool itself is called the \textbf{simplifier}. 307This section introduces the many features of the simplifier 308and is required reading if you intend to perform proofs. Later on, 309{\S}\ref{sec:simplification-II} explains some more advanced features and a 310little bit of how the simplifier works. The serious student should read that 311section as well, in particular to understand why the simplifier did 312something unexpected. 313 314\subsection{What is Simplification?} 315 316In its most basic form, simplification means repeated application of 317equations from left to right. For example, taking the rules for \isa{\at} 318and applying them to the term \isa{[0,1] \at\ []} results in a sequence of 319simplification steps: 320\begin{ttbox}\makeatother 321(0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[] 322\end{ttbox} 323This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the 324equations are referred to as \bfindex{rewrite rules}. 325``Rewriting'' is more honest than ``simplification'' because the terms do not 326necessarily become simpler in the process. 327 328The simplifier proves arithmetic goals as described in 329{\S}\ref{sec:nat} above. Arithmetic expressions are simplified using built-in 330procedures that go beyond mere rewrite rules. New simplification procedures 331can be coded and installed, but they are definitely not a matter for this 332tutorial. 333 334\input{simp.tex} 335 336\index{simplification|)} 337 338\input{Itrev.tex} 339\begin{exercise} 340\input{Plus.tex}% 341\end{exercise} 342\begin{exercise} 343\input{Tree2.tex}% 344\end{exercise} 345 346\input{CodeGen.tex} 347 348 349\section{Advanced Datatypes} 350\label{sec:advanced-datatypes} 351\index{datatype@\isacommand {datatype} (command)|(} 352\index{primrec@\isacommand {primrec} (command)|(} 353%|) 354 355This section presents advanced forms of datatypes: mutual and nested 356recursion. A series of examples will culminate in a treatment of the trie 357data structure. 358 359 360\subsection{Mutual Recursion} 361\label{sec:datatype-mut-rec} 362 363\input{ABexpr.tex} 364 365\subsection{Nested Recursion} 366\label{sec:nested-datatype} 367 368{\makeatother\input{Nested.tex}} 369 370 371\subsection{The Limits of Nested Recursion} 372\label{sec:nested-fun-datatype} 373 374How far can we push nested recursion? By the unfolding argument above, we can 375reduce nested to mutual recursion provided the nested recursion only involves 376previously defined datatypes. This does not include functions: 377\begin{isabelle} 378\isacommand{datatype} t = C "t \isasymRightarrow\ bool" 379\end{isabelle} 380This declaration is a real can of worms. 381In HOL it must be ruled out because it requires a type 382\isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the 383same cardinality --- an impossibility. For the same reason it is not possible 384to allow recursion involving the type \isa{t set}, which is isomorphic to 385\isa{t \isasymFun\ bool}. 386 387Fortunately, a limited form of recursion 388involving function spaces is permitted: the recursive type may occur on the 389right of a function arrow, but never on the left. Hence the above can of worms 390is ruled out but the following example of a potentially 391\index{infinitely branching trees}% 392infinitely branching tree is accepted: 393\smallskip 394 395\input{Fundata.tex} 396 397If you need nested recursion on the left of a function arrow, there are 398alternatives to pure HOL\@. In the Logic for Computable Functions 399(\rmindex{LCF}), types like 400\begin{isabelle} 401\isacommand{datatype} lam = C "lam \isasymrightarrow\ lam" 402\end{isabelle} 403do indeed make sense~\cite{paulson87}. Note the different arrow, 404\isa{\isasymrightarrow} instead of \isa{\isasymRightarrow}, 405expressing the type of \emph{continuous} functions. 406There is even a version of LCF on top of HOL, 407called \rmindex{HOLCF}~\cite{MuellerNvOS99}. 408\index{datatype@\isacommand {datatype} (command)|)} 409\index{primrec@\isacommand {primrec} (command)|)} 410 411 412\subsection{Case Study: Tries} 413\label{sec:Trie} 414 415\index{tries|(}% 416Tries are a classic search tree data structure~\cite{Knuth3-75} for fast 417indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a 418trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and 419``cat''. When searching a string in a trie, the letters of the string are 420examined sequentially. Each letter determines which subtrie to search next. 421In this case study we model tries as a datatype, define a lookup and an 422update function, and prove that they behave as expected. 423 424\begin{figure}[htbp] 425\begin{center} 426\unitlength1mm 427\begin{picture}(60,30) 428\put( 5, 0){\makebox(0,0)[b]{l}} 429\put(25, 0){\makebox(0,0)[b]{e}} 430\put(35, 0){\makebox(0,0)[b]{n}} 431\put(45, 0){\makebox(0,0)[b]{r}} 432\put(55, 0){\makebox(0,0)[b]{t}} 433% 434\put( 5, 9){\line(0,-1){5}} 435\put(25, 9){\line(0,-1){5}} 436\put(44, 9){\line(-3,-2){9}} 437\put(45, 9){\line(0,-1){5}} 438\put(46, 9){\line(3,-2){9}} 439% 440\put( 5,10){\makebox(0,0)[b]{l}} 441\put(15,10){\makebox(0,0)[b]{n}} 442\put(25,10){\makebox(0,0)[b]{p}} 443\put(45,10){\makebox(0,0)[b]{a}} 444% 445\put(14,19){\line(-3,-2){9}} 446\put(15,19){\line(0,-1){5}} 447\put(16,19){\line(3,-2){9}} 448\put(45,19){\line(0,-1){5}} 449% 450\put(15,20){\makebox(0,0)[b]{a}} 451\put(45,20){\makebox(0,0)[b]{c}} 452% 453\put(30,30){\line(-3,-2){13}} 454\put(30,30){\line(3,-2){13}} 455\end{picture} 456\end{center} 457\caption{A Sample Trie} 458\label{fig:trie} 459\end{figure} 460 461Proper tries associate some value with each string. Since the 462information is stored only in the final node associated with the string, many 463nodes do not carry any value. This distinction is modeled with the help 464of the predefined datatype \isa{option} (see {\S}\ref{sec:option}). 465\input{Trie.tex} 466\index{tries|)} 467 468\section{Total Recursive Functions: \isacommand{fun}} 469\label{sec:fun} 470\index{fun@\isacommand {fun} (command)|(}\index{functions!total|(} 471 472Although many total functions have a natural primitive recursive definition, 473this is not always the case. Arbitrary total recursive functions can be 474defined by means of \isacommand{fun}: you can use full pattern matching, 475recursion need not involve datatypes, and termination is proved by showing 476that the arguments of all recursive calls are smaller in a suitable sense. 477In this section we restrict ourselves to functions where Isabelle can prove 478termination automatically. More advanced function definitions, including user 479supplied termination proofs, nested recursion and partiality, are discussed 480in a separate tutorial~\cite{isabelle-function}. 481 482\input{fun0.tex} 483 484\index{fun@\isacommand {fun} (command)|)}\index{functions!total|)} 485