% binomial.tex % % The Binomial Theorem proven in HOL % \setlength{\unitlength}{1mm} % \vskip10mm % \begin{center}\LARGE\bf % The Binomial Theorem proven in HOL. % \end{center} % \vskip10mm \def\obj#1{\mbox{\tt#1}} % --------------------------------------------------------------------- % Parameters customising the document; set by whoever \input's it % --------------------------------------------------------------------- % \self{} is one word denoting this document, such as "article" or "chapter" % \path{} is the path denoting the case-study directory % % Typical examples: % \def\self{article} % \def\path{\verb%Training/studies/binomial%} % --------------------------------------------------------------------- \section{Pascal's Triangle and the Binomial Theorem} Pascal's Triangle\footnote{ According to Knuth \cite{knuth73}, the triangle gets its name from its appearance in Blaise Pascal's {\em Trait\'e du triangle arithm\'etique} in 1653, although the binomial coefficients were known long before then. The earliest surviving description is from the tenth century, in Hal\={a}yudha's commentary on the Hindu classic, the Chandah-S\^{u}tra. The Binomial Theorem itself was first reported in 1676 by Isaac Newton. } is the infinite triangle of numbers which starts like this: \[\begin{array}{ccccccc} 1 \\ 1 & 1 \\ 1 & 2 & 1 \\ 1 & 3 & 3 & 1 \\ 1 & 4 & 6 & 4 & 1 \\ 1 & 5 & 10 & 10 & 5 & 1 \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \end{array}\] The construction is as follows. The numbers at the edges are $1$s. Each number in the interior is the sum of the number immediately above it and the number to its left in the previous row. The numbers in Pascal's Triangle are called Binomial Coefficients. The number in the $n$th row and $k$th column (where $0 \leq k \leq n$, and the topmost row and the leftmost column are numbered $0$) is written $n \choose k$ and pronounced ``$n$ choose $k$''. 
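The construction just described is easy to express as a program. The following Python fragment is an illustration only (it plays no part in the \HOL{} development): it builds the triangle row by row, exactly as in the description above.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's Triangle."""
    rows = []
    row = [1]
    for _ in range(count):
        rows.append(row)
        # Edges are 1s; each interior entry is the sum of the entry
        # immediately above it and the entry to its left in the previous row.
        row = [1] + [row[i - 1] + row[i] for i in range(1, len(row))] + [1]
    return rows

assert pascal_rows(6)[5] == [1, 5, 10, 10, 5, 1]
```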
The coefficients are pronounced this way because $n \choose k$ turns out to be the number of ways to choose $k$ things out of $n$ things, but that is another story (see for instance Knuth's book \cite{knuth73}). A simple form of the Binomial Theorem \cite{maclane67} \cite{mostow63} is the following equation, where $a$ and $b$ are integers and $n$ is any natural number, \[ (a + b)^n = \sum_{k=0}^n {n \choose k} a^k b^{n-k} \] The rest of this \self{} describes how the Binomial Theorem can be proven in \HOL{}. In fact, a more general theorem is proven, where $a$ and $b$ are elements of an arbitrary commutative ring, but the form displayed here can be derived from the general result. The motivation for the proof of the Binomial Theorem in \HOL{} is tutorial. % ---to be a medium sized worked example whose subject matter is % more widely known than any specific piece of hardware or software. The proof illustrates specific ways in \HOL{} to represent mathematical notations and manage theorems proven about general structures such as monoids, groups and rings. The following files accompany this \self{} in directory \path{}: \begin{description} \item[{\tt mk\_BINOMIAL.ml}] The \ML{} program containing the definitions, lemmas and theorems\\ needed to prove the Binomial Theorem in \HOL{}. \item [{\tt BINOMIAL.th}] The theory file which is generated by \verb@mk_BINOMIAL.ml@. \item [{\tt BINOMIAL.ml}] An \ML{} program which loads the theory \verb@BINOMIAL@ and declares a few derived rules and one tactic used in the proof. The Binomial Theorem is bound to the \ML{} identifier \verb@BINOMIAL@. \end{description} % The aim of this \self{} is to introduce the small amount of algebra and % mathematical notation needed to state and prove the Binomial Theorem, show % how this is rendered in \HOL{}, and outline the structure of the proof % contained in \verb@mk_BINOMIAL.ml@. The \self{} is meant to be intelligible without examination of any of the accompanying files. 
To avoid clutter not every detail is spelt out. Often definitions and theorems are indicated in the form displayed by \HOL{}, rather than as \ML{} source text. Specific details of the \ML{} definitions or tactics can be found in the file \verb@mk_BINOMIAL.ml@. The way to work with this case study is first to study this \self{} to see the structure of the proof, and then to work interactively with \HOL{}. Start \HOL{} and say: \begin{session} \begin{verbatim} #loadt `BINOMIAL`;; ... File BINOMIAL loaded () : void #BINOMIAL;; |- !plus times. RING(plus,times) ==> (!a b n. POWER times n(plus a b) = SIGMA(plus,0,SUC n)(BINTERM plus times a b n)) \end{verbatim} \end{session} (This is the first display of an interaction with \HOL{}. The difference between lines input to and lines output from \HOL{} is that only the former are preceded by the \HOL{} prompt ``\verb@#@''. A line of ellipsis ``\verb@...@'' stands for output from \HOL{} which has been omitted from the display.) The first input line in the display above loads in all the definitions, theorems, derived rules and the tactic defined for the proof. The second input line asks \HOL{} to print theorem \verb@BINOMIAL@. The constants used in \verb@BINOMIAL@ are explained later in the \self{}. The \HOL{} session is now set up either to make use of \verb@BINOMIAL@ by specialising it to a specific ring, or to study \verb@mk_BINOMIAL.ml@ by using the subgoal package to work through proofs contained in it. The remainder of the \self{} follows the plan: \begin{description} \item[Section~\ref{BinomialCoefficients}] shows how to define the number $n \choose k$ in \HOL{} as the term $\obj{CHOOSE}\,n\,k$, where $\obj{CHOOSE}$ is a function defined by primitive recursion. \item[Section~\ref{MonoidsGroupsRings}] shows how to define elementary algebraic concepts like associativity, commutativity, monoid, group and ring in \HOL{}, and how to apply theorems proved in the general case to particular situations. 
\item[Section~\ref{PowersReductionsRangesSums}] shows how to define the iterated
operations of raising a term to a power and reducing (summing) a list of terms.
These operations are part of the everyday mathematical language used to state
the Binomial Theorem, but they are not built into \HOL{} and so need to be
defined. This section and the previous two together describe enough definitions
in \HOL{} to state the Binomial Theorem.
\item[Section~\ref{BinomialTheorem}] shows how the Binomial Theorem is proven
in \HOL{}. The proof is by induction, and depends on three main lemmas.
\item[Section~\ref{BinomialTheoremForIntegers}] outlines how to apply the
general theorem to a particular case: the ring of integers.
\item[Appendix~\ref{PrincipalLemmas}] states the principal lemmas used to prove
the theorem. Some of these are {\em ad hoc} but others could be reused
elsewhere.
\end{description}
% ----------------------------------------------------------------------------
\section{The Binomial Coefficients}
\label{BinomialCoefficients}
The definition of the binomial coefficients as the numbers in Pascal's Triangle
is formalised in three equations:
\begin{eqnarray*}
n \choose 0 &=& 1 \\
0 \choose {k+1} &=& 0 \\
{n+1} \choose {k+1} &=& {{n} \choose {k+1}} + {{n} \choose {k}}
\end{eqnarray*}
(These actually define $n \choose k$ to be $0$ if $k>n$, but this is consistent
with Pascal's Triangle: think of the spaces to the right of the triangle as
filled with $0$s.) The aim is to define a constant \verb@CHOOSE@ in \HOL{} such
that for all numbers $n$, $k$ in the type {\tt num}, $\verb@CHOOSE@\,n\,k$
denotes the number $n \choose k$. Unfortunately these three equations cannot be
used directly as the definition of \verb@CHOOSE@ in \HOL{} because they are not
in the form of a base case (when $n=0$) together with an inductive case (when
$n>0$).
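Read as a computation, the three equations give a recursive program for the coefficients. The following Python sketch is an illustration outside \HOL{}; \verb@math.comb@ serves only as an independent reference implementation.

```python
from math import comb

def choose(n, k):
    """The three defining equations of CHOOSE, written as a recursion."""
    if k == 0:
        return 1                  # n choose 0 = 1
    if n == 0:
        return 0                  # 0 choose (k+1) = 0
    # (n+1) choose (k+1) = n choose (k+1) + n choose k  (Pascal's rule)
    return choose(n - 1, k) + choose(n - 1, k - 1)

assert choose(5, 2) == 10
assert choose(3, 7) == 0          # k > n gives 0, as noted above
assert all(choose(n, k) == comb(n, k) for n in range(8) for k in range(n + 1))
```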
To work towards the definition, consider a term that specifies a base case and
an inductive case intended to be equivalent to the three equations above:
\begin{session}
\begin{verbatim}
let base_and_inductive: term =
  "(CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
   (CHOOSE (SUC n) k =
      ((0 = k) => 1 | (CHOOSE n (k-1)) + (CHOOSE n k)))";;
\end{verbatim}
\end{session}
The theory \verb@prim_rec@ contains a primitive recursion theorem which says
that any base case and inductive case together identify a unique function,
\verb@fn@:
\begin{session}
\begin{verbatim}
#num_Axiom;;
|- !e f. ?! fn. (fn 0 = e) /\ (!n. fn (SUC n) = f (fn n) n)
\end{verbatim}
\end{session}
Given the theorem \verb@num_Axiom@, the built-in function
\verb@prove_rec_fn_exists@ can prove that there is a function that satisfies
the property specified by the term \verb@base_and_inductive@:
\begin{session}
\begin{verbatim}
#let RAW_CHOOSE_DEF = prove_rec_fn_exists num_Axiom base_and_inductive;;
RAW_CHOOSE_DEF =
|- ?CHOOSE.
    (!k. CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
    (!n k. CHOOSE(SUC n)k =
       ((0 = k) => 1 | (CHOOSE n(k - 1)) + (CHOOSE n k)))
\end{verbatim}
\end{session}
These equations would not do as the only definition of \verb@CHOOSE@ due to the
presence of the conditional operator, which makes rewriting difficult. But
having obtained the theorem \verb@RAW_CHOOSE_DEF@ it is easy to prove there is
a function which satisfies the original three equations:
\begin{session}
\begin{verbatim}
CHOOSE_DEF_LEMMA =
|- ?CHOOSE.
    (!n. CHOOSE n 0 = 1) /\
    (!k. CHOOSE 0(SUC k) = 0) /\
    (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Now the constant \verb@CHOOSE@ can be specified as originally desired:
\begin{session}
\begin{verbatim}
#let CHOOSE_DEF =
#  new_specification `CHOOSE_DEF` [`constant`,`CHOOSE`] CHOOSE_DEF_LEMMA;;
CHOOSE_DEF =
|- (!n. CHOOSE n 0 = 1) /\
   (!k. CHOOSE 0(SUC k) = 0) /\
   (!n k.
CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k)) \end{verbatim} \end{session} Two elementary properties of the binomial coefficients can now be proven by induction: \begin{session} \begin{verbatim} CHOOSE_LESS = |- !n k. n < k ==> (CHOOSE n k = 0) CHOOSE_SAME = |- !n. CHOOSE n n = 1 \end{verbatim} \end{session} % ---------------------------------------------------------------------------- \section{Monoids, Groups and Commutative Rings} \label{MonoidsGroupsRings} %A simple way to define algebraic structures in \HOL{}. \subsection{Associativity and Commutativity} An {\em algebraic structure} is a set equipped with some operators that obey some laws. An elementary example of such a structure is the pair $(A,+)$, where $A$ is a set and $+$ is a binary operator on $A$ that obeys the laws of associativity, \[ a + (b + c) = (a + b) + c \] and commutativity, \[ a + b = b + a \] How might these laws be represented in \HOL{}? The simplest scheme is to deal with structures of the form $(\sigma, +)$, where $\sigma$ is a type of the \HOL{} logic, and $+$ is a \HOL{} term of type $\sigma\to\sigma\to\sigma$. Notice that the set in the structure is represented as a \HOL{} type: see the theory \verb@group@ presented in Elsa Gunter's case study of Modular Arithmetic for a more sophisticated approach where the set in a structure can be a subset of a \HOL{} type. The two laws can be defined as two constants in \HOL{}: \begin{session} \begin{verbatim} #let ASSOCIATIVE = # new_definition (`ASSOCIATIVE`, # "!plus: *->*->*. ASSOCIATIVE plus = # (!a b c. plus a (plus b c) = plus (plus a b) c)");; ... #let COMMUTATIVE = # new_definition (`COMMUTATIVE`, # "!plus: *->*->*. COMMUTATIVE plus = # (!a b. plus a b = plus b a)");; \end{verbatim} \end{session} By instantiating the type variable \verb@*@ and specialising the variable \verb@plus@ these constants are applicable to any particular structure of the form $(\sigma, +: \sigma\to\sigma\to\sigma)$. 
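In programming terms, \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@ are simply predicates on a binary operator. The following Python analogue is only a sketch: the \HOL{} definitions quantify over every element of a type, whereas a program can only test a finite sample of values.

```python
# Finite sample standing in for the whole type * (an approximation only;
# the HOL definitions quantify over all elements).
SAMPLE = range(-3, 4)

def ASSOCIATIVE(plus):
    return all(plus(a, plus(b, c)) == plus(plus(a, b), c)
               for a in SAMPLE for b in SAMPLE for c in SAMPLE)

def COMMUTATIVE(plus):
    return all(plus(a, b) == plus(b, a) for a in SAMPLE for b in SAMPLE)

# Addition satisfies both laws; subtraction satisfies neither.
assert ASSOCIATIVE(lambda a, b: a + b) and COMMUTATIVE(lambda a, b: a + b)
assert not ASSOCIATIVE(lambda a, b: a - b) and not COMMUTATIVE(lambda a, b: a - b)
```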
A simple example to illustrate this methodology is a permutation theorem about
an arbitrary associative commutative operator (which is called \verb@PERM@ in
file \verb@mk_BINOMIAL.ml@). It is worthwhile going through the proof in detail
because it illustrates how a difficulty in making use of assumptions like
\verb@ASSOCIATIVE plus@ and \verb@COMMUTATIVE plus@ can be solved.
\begin{session}
\begin{verbatim}
#g "!plus: *->*->*.
#    ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
#    !a b c. plus (plus a b) c = plus b (plus a c)";;
\end{verbatim}
\end{session}
The first step is to strip off the quantifiers, and break up the implication:
\begin{session}
\begin{verbatim}
#e (REPEAT STRIP_TAC);;
OK..
"plus(plus a b)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
The next step is to rewrite \verb@plus a b@ to \verb@plus b a@ on the left-hand
side, using the assumption \verb@COMMUTATIVE plus@. If the theorem were about a
specific operator, such as \verb@$+: num->num->num@ from the theory of numbers
in \HOL{}, the rewriting could be done with \verb@REWRITE_TAC@ and the specific
commutativity theorem, such as \verb@ADD_SYM@:
\[ \mbox{\tt |- !m n. m + n = n + m}. \]
But in the general case all that is available is the definition of
\verb@COMMUTATIVE plus@ which cannot be used directly with \verb@REWRITE_TAC@
to swap the terms \verb@a@ and \verb@b@. One approach is first to prove the
following equation from the definition \verb@COMMUTATIVE plus@,
\[ \mbox{\tt COMMUTATIVE plus |- !a b. plus a b = plus b a}, \]
and then to use the equation with \verb@REWRITE_TAC@ in the same way as one
might use \verb@ADD_SYM@. This is a valid tactic because the assumption
\verb@COMMUTATIVE plus@ is already present in the assumption list.
A little forward proof achieves the first step, the derivation of the equation:
\begin{session}
\begin{verbatim}
#top_print print_all_thm;;
- : (thm -> void)

#let forwards = fst(EQ_IMP_RULE (SPEC "plus: *->*->*" COMMUTATIVE));;
forwards = |- COMMUTATIVE plus ==> (!a b. plus a b = plus b a)

#let eqn = UNDISCH forwards;;
eqn = COMMUTATIVE plus |- !a b. plus a b = plus b a
\end{verbatim}
\end{session}
This forward proof is a special case of a derived rule of the logic,
\[ \Gamma\turn \uquant{x} t_1 = t_2 \over \Gamma, t_1[t'/x] \turn t_2[t'/x] \]
which can be coded as the \ML{} function \verb@SPEC_EQ@:
\begin{session}
\begin{verbatim}
#let SPEC_EQ v th = UNDISCH(fst(EQ_IMP_RULE (SPEC v th)));;
SPEC_EQ = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
Now the equation \verb@eqn@ can be used to swap the terms \verb@a@ and \verb@b@:
\begin{session}
\begin{verbatim}
#e(SUBST1_TAC (SPECL ["a:*";"b:*"] eqn));;
OK..
"plus(plus b a)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
Now that the terms appear on both sides in the same order but with different
grouping, associativity can be used to make the grouping on both sides the
same. Again the definition of \verb@ASSOCIATIVE@ cannot be used directly, but
the derived rule \verb@SPEC_EQ@ obtains from the definition an equation
analogous to \verb@eqn@ which can be used:
\begin{session}
\begin{verbatim}
#e(REWRITE_TAC [SPEC_EQ "plus: *->*->*" ASSOCIATIVE]);;
OK..
goal proved
...
|- !plus.
    ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
    (!a b c. plus(plus a b)c = plus b(plus a c))
\end{verbatim}
\end{session}
The point of this example was to illustrate how to prove theorems about an
arbitrary operator whose behaviour is specified by general predicates like
\verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. The main difficulty is in deriving
equations for use with rewriting or substitution.
The solution used in the example and extensively in the proof of the Binomial
Theorem is to derive equations using the derived rule \verb@SPEC_EQ@. Another
derived rule, \verb@SPEC_IMP@, is used extensively and is a variant of
\verb@SPEC_EQ@:
\[ \Gamma\turn \uquant{x} t_1 \Rightarrow t_2 \over \Gamma, t_1[t'/x] \turn t_2[t'/x] \]
\begin{session}
\begin{verbatim}
#let SPEC_IMP v th = UNDISCH(SPEC v th);;
SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
The following two rules are versions of \verb@SPEC_EQ@ and \verb@SPEC_IMP@
generalised to take lists of variables rather than just one:
\begin{session}
\begin{verbatim}
SPECL_EQ = - : (term list -> thm -> thm)
SPECL_IMP = - : (term list -> thm -> thm)
\end{verbatim}
\end{session}
The final new derived rule used in the proof of the Binomial Theorem is
\verb@STRIP_SPEC_IMP@, a variant of \verb@SPEC_IMP@, which splits up a
conjunction in the antecedent into separate assumptions:
\[ \Gamma\turn \uquant{x} (t_1 \wedge \cdots \wedge t_n) \Rightarrow t_{n+1}
\over \Gamma, t_1[t'/x], \ldots, t_n[t'/x] \turn t_{n+1}[t'/x] \]
\begin{session}
\begin{verbatim}
#let STRIP_SPEC_IMP v th =
#  UNDISCH_ALL(SPEC v (CONV_RULE (ONCE_DEPTH_CONV ANTE_CONJ_CONV) th));;
STRIP_SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
There are other methods for proving theorems which are conditional on
predicates such as \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. One is to rewrite
with the definition before assuming the predicate (see, for instance, the proof
of \verb@RIGHT_CANCELLATION@ in \verb@mk_BINOMIAL.ml@). This method is fine for
small proofs, but in larger proofs it presents the problem of how to specify a
particular assumption on the assumption list. Another method is to put the
predicates on the assumption list, and use resolution to extract the body of
the definition (see the proof of \verb@UNIQUE_LEFT_INV@).
When the assumption list is not short this is rather a blunt instrument, in the
sense that resolution can find many more theorems than the one desired, and is
also rather computationally costly when compared to carefully constructing a
theorem with \verb@SPEC_EQ@ and its variants.
\subsection{Monoids}
A {\em monoid} is a structure $(M,+)$ where $M$ is a set and $+$ is an
associative binary operator on $M$, such that there is an {\em identity
element}, $u \in M$, which is a left and right identity of $+$, that is, it
satisfies $u+a = a+u = a$ for all $a \in M$. When the operator is written as
$+$, the structure $(M,+)$ is called an {\em additive monoid} and the identity
element is written as $0$; otherwise, if the operator is written as $\times$,
the structure $(M,\times)$ is called a {\em multiplicative monoid} and the
identity element is written as $1$.
\begin{session}
\begin{verbatim}
#let RAW_MONOID =
#  new_definition (`RAW_MONOID`,
#   "!plus: *->*->*. MONOID plus =
#      ASSOCIATIVE plus /\
#      (?u. (!a. plus a u = a) /\ (!a. plus u a = a))");;
\end{verbatim}
\end{session}
This definition is inconvenient to manipulate as it stands, because of the
presence of the existential quantifier. However it is possible via Hilbert's
$\epsilon$-operator to specify a function \verb@Id@ such that \verb@Id plus@
stands for an identity element of \verb@plus@, if one exists. In fact it is
easy to prove that the identity element in a monoid is unique. The first step
is to prove that there exists a function \verb@Id@ with the property just
described:
\begin{session}
\begin{verbatim}
ID_LEMMA =
|- ?Id.
    !plus.
      (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
      (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The proof of \verb@ID_LEMMA@ uses Hilbert's $\epsilon$-operator to construct a
witness for the outer existential quantifier,
\begin{verbatim}
\plus.@u:*.(!a:*. (plus u a = a)) /\ (!a.
(plus a u = a)) \end{verbatim} Given \verb@ID_LEMMA@, the constant \verb@Id@ is specified as follows: \begin{session} \begin{verbatim} #let ID = new_specification `Id` [`constant`, `Id`] ID_LEMMA;; ID = |- !plus. (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==> (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a) \end{verbatim} \end{session} The condition that \verb@Id plus@ is a left and right identity is expressed by the predicates \verb@LEFT_ID@ and \verb@RIGHT_ID@, respectively: \begin{session} \begin{verbatim} LEFT_ID = |- !plus. LEFT_ID plus = (!a. plus(Id plus)a = a) RIGHT_ID = |- !plus. RIGHT_ID plus = (!a. plus a(Id plus) = a) \end{verbatim} \end{session} Given these abbreviations it is easy to prove in \HOL{} that if a left and right identity element exists, then it is unique: \begin{session} \begin{verbatim} UNIQUE_LEFT_ID = |- !plus. LEFT_ID plus /\ RIGHT_ID plus ==> (!l. (!a. plus l a = a) ==> (Id plus = l)) UNIQUE_RIGHT_ID = |- !plus. LEFT_ID plus /\ RIGHT_ID plus ==> (!r. (!a. plus a r = a) ==> (Id plus = r)) \end{verbatim} \end{session} The use of these last two theorems is to help identify the element \verb@Id plus@ for some specific \verb@plus@, such as \verb@$+@ from the theory of \verb@arithmetic@. The abbreviations \verb@LEFT_ID@ and \verb@RIGHT_ID@ admit a new characterisation of the predicate \verb@MONOID@, which is easier to manipulate in \HOL{} than \verb@RAW_MONOID@: \begin{session} \begin{verbatim} MONOID = |- !plus. MONOID plus = LEFT_ID plus /\ RIGHT_ID plus /\ ASSOCIATIVE plus \end{verbatim} \end{session} \subsection{Groups} A {\em group} is a monoid in which every element is invertible. \begin{session} \begin{verbatim} RAW_GROUP = |- !plus. Group plus = MONOID plus /\ (!a. ?b. 
(plus b a = Id plus) /\ (plus a b = Id plus)) \end{verbatim} \end{session} An alternative characterisation of \verb@Group@\footnote{The mixed case identifier \texttt{Group} is used rather than \texttt{GROUP}, because the latter is already defined in Elsa Gunter's theory \texttt{group}, a parent of her theory \texttt{integer}, which needs to coexist with the present theory so that integers can be used as an example ring (Section~\ref{BinomialTheoremForIntegers}).} which makes the existential quantification implicit can be had by specifying a function \verb@Inv@ so that, if \verb@Group plus@ holds, then the element \verb@Inv plus a@ is an inverse of \verb@a@. The function \verb@Inv@ is specified in a way analogous to the specification of \verb@Id@. First a lemma is proven which says there does exist a function \verb@Inv@ with the property desired: \begin{session} \begin{verbatim} INV_LEMMA = |- ?Inv. !plus. (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus)) ==> (!a. (plus(Inv plus a)a = Id plus) /\ (plus a(Inv plus a) = Id plus)) \end{verbatim} \end{session} Given this lemma, which is proven by an easy tactic almost the same as the one for \verb@ID_LEMMA@, the constant \verb@Inv@ can be specified: \begin{session} \begin{verbatim} #let INV = # new_specification `Inv` [`constant`, `Inv`] INV_LEMMA;; \end{verbatim} \end{session} The predicates which say that \verb@Inv@ produces left and right inverses are defined as new constants: \begin{session} \begin{verbatim} LEFT_INV = |- !plus. LEFT_INV plus = (!a. plus(Inv plus a)a = Id plus) RIGHT_INV = |- !plus. RIGHT_INV plus = (!a. 
plus a(Inv plus a) = Id plus)
\end{verbatim}
\end{session}
Finally, the alternative characterisation of \verb@Group@ can be proven:
\begin{session}
\begin{verbatim}
GROUP =
|- Group plus = MONOID plus /\ LEFT_INV plus /\ RIGHT_INV plus
\end{verbatim}
\end{session}
The theory of groups has been developed only as far as needed to prove the
Binomial Theorem; Appendix~\ref{PrincipalLemmas} shows the one lemma,
\verb@RIGHT_CANCELLATION@, specifically about groups.
\subsection{Rings}
A {\em ring} is a structure $(R,+,\times)$ such that $(R,+)$ is a group,
$(R,\times)$ is a monoid, {\em addition}, $+$, is commutative, and {\em
multiplication}, $\times$, distributes (on both sides) over addition. A {\em
commutative ring} is a ring in which multiplication is commutative. The word
{\em ring} is used in the remainder of this \self{} and in the \HOL{} code to
mean a commutative ring:
\begin{session}
\begin{verbatim}
RING =
|- !plus times.
    RING(plus,times) =
      Group plus /\
      COMMUTATIVE plus /\
      MONOID times /\
      COMMUTATIVE times /\
      LEFT_DISTRIB(plus,times) /\
      RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
\verb@LEFT_DISTRIB@ and \verb@RIGHT_DISTRIB@ are new constants defined by the
theorems:
\begin{session}
\begin{verbatim}
LEFT_DISTRIB =
|- !plus times.
    LEFT_DISTRIB(plus,times) =
      (!a b c. times a(plus b c) = plus(times a b)(times a c))

RIGHT_DISTRIB =
|- !plus times.
    RIGHT_DISTRIB(plus,times) =
      (!a b c. times(plus a b)c = plus(times a c)(times b c))
\end{verbatim}
\end{session}
Notice that the definition of \verb@RING@ abbreviates a conjunction of
predicates, some of which are atomic, in the sense that they state some basic
property of \verb@plus@ or \verb@times@ or both, and some of which are
compound, in the sense that they abbreviate a further conjunction of
properties. The definition is a tree of predicates, with atomic ones like
\verb@LEFT_DISTRIB@ and \verb@ASSOCIATIVE@ at the leaves, and compound ones
like \verb@Group@ and \verb@MONOID@ at the branches.
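As a concrete illustration of this tree of predicates, the following Python sketch (outside \HOL{}, and only over a finite sample) checks the ring laws for the commutative ring of integers modulo $5$; since multiplication is commutative here, checking one distributive law suffices.

```python
# Illustrative sketch (not the HOL definitions): the ring predicates,
# checked exhaustively over the finite commutative ring Z mod 5.
M = 5
ELEMS = list(range(M))
plus  = lambda a, b: (a + b) % M
times = lambda a, b: (a * b) % M

def associative(op):
    return all(op(a, op(b, c)) == op(op(a, b), c)
               for a in ELEMS for b in ELEMS for c in ELEMS)

def commutative(op):
    return all(op(a, b) == op(b, a) for a in ELEMS for b in ELEMS)

def monoid(op, identity):
    return associative(op) and all(op(identity, a) == a == op(a, identity)
                                   for a in ELEMS)

def group(op, identity, inv):
    return monoid(op, identity) and \
           all(op(inv(a), a) == identity == op(a, inv(a)) for a in ELEMS)

def left_distrib():
    return all(times(a, plus(b, c)) == plus(times(a, b), times(a, c))
               for a in ELEMS for b in ELEMS for c in ELEMS)

def RING():
    # Mirrors the conjunction in the HOL definition of RING; the right
    # distributive law follows from commutativity of times in this example.
    return (group(plus, 0, lambda a: (-a) % M) and commutative(plus)
            and monoid(times, 1) and commutative(times) and left_distrib())

assert RING()
```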
Many theorems conditional on \verb@RING(plus,times)@ need to make use of both atomic and compound predicates contained in the tree. To make access to all these predicates easy, the following lemma is proven, which makes explicit all the predicates in the tree: \begin{session} \begin{verbatim} RING_LEMMA = |- RING(plus,times) ==> RING(plus,times) /\ Group plus /\ MONOID plus /\ LEFT_ID plus /\ RIGHT_ID plus /\ ASSOCIATIVE plus /\ LEFT_INV plus /\ RIGHT_INV plus /\ COMMUTATIVE plus /\ MONOID times /\ LEFT_ID times /\ RIGHT_ID times /\ ASSOCIATIVE times /\ COMMUTATIVE times /\ LEFT_DISTRIB(plus,times) /\ RIGHT_DISTRIB(plus,times) \end{verbatim} \end{session} Goals of the form, \[ \mbox{\tt !plus times. RING(plus,times) ==> ...} \] are consistently tackled with the following tactic, which uses \verb@RING_LEMMA@ to add all the atomic and compound predicates in the tree to the assumption list: \begin{session} \begin{verbatim} #let RING_TAC = # GEN_TAC THEN GEN_TAC THEN # DISCH_THEN (\th. STRIP_ASSUME_TAC (MP RING_LEMMA th));; \end{verbatim} \end{session} The one lemma about rings needed here, \verb@RING_0@, is stated in Appendix~\ref{PrincipalLemmas}. % ---------------------------------------------------------------------------- \section{Powers, Reductions, Ranges and Sums} \label{PowersReductionsRangesSums} %Translation of mathematical definitions such as sums of series into HOL The aim of this section is to explain how the mathematical notation $\sum_{k=0}^n {n \choose k} a^k b^{n-k}$ can be rendered in \HOL{}. If this is written out more explicitly than necessary for human consumption it looks like this: \[ \sum_{k=0}^n {n \choose k} \cdot (a^k \times b^{n-k}) \] where $\times$ is multiplication in the ring, and where exponentiation and $\cdot$ stand for the scalar powers of multiplication and addition, respectively. 
Since binomial coefficients were treated in Section~\ref{BinomialCoefficients}, what remain to be defined in this section are scalar powers and the $\Sigma$-notation. The latter is tackled in three steps: reduction of a list of monoid elements; generation of lists comprising subranges of natural numbers, $[n,n+1,\ldots,n+k-1]$; and finally the reduction of a list of ring elements indexed by a subrange of the natural numbers. \subsection{Scalar powers in a monoid} If $(M,+)$ and $(M,\times)$ are additive and multiplicative monoids respectively, the scalar powers of addition and multiplication are defined informally by the following equations, where $n>0$: \begin{eqnarray*} n \cdot a &=& \overbrace{a + \cdots + a}^{n} \\ a^n &=& \overbrace{a \times \cdots \times a}^{n} \end{eqnarray*} When $n=0$, $n \cdot a = 0$ and $a^n = 1$. Scalar power is defined in \HOL{} via the constant \verb@POWER@ which is defined by primitive recursion: \begin{session} \begin{verbatim} POWER = |- (!plus a. POWER plus 0 a = Id plus) /\ (!plus n a. POWER plus(SUC n)a = plus a(POWER plus n a)) \end{verbatim} \end{session} Three lemmas about \verb@POWER@ are stated in Appendix~\ref{PrincipalLemmas}. \subsection{Reduction in a monoid} This section makes use of the standard \HOL{} theory of lists, in particular the constants \verb@NIL@, \verb@CONS@, \verb@APPEND@ and \verb@MAP@, definition by primitive recursion and proof by induction. The theory of lists is described in \DESCRIPTION. If $(M,+)$ is a monoid, then the reduction of a list $[a_1,\ldots,a_n]$, where each $a_i$ is an element of $M$, is the sum, \[ a_1 + \cdots + a_n, \] or the monoid's identity element if $n=0$. Reduction is represented in \HOL{} by the constant \verb@REDUCE@, with type, \begin{verbatim} REDUCE : (*->*->*) -> (*)list -> * \end{verbatim} and which is defined by primitive recursion on lists on its second argument: \begin{session} \begin{verbatim} #let REDUCE = # new_list_rec_definition (`REDUCE`, # "(!plus. 
(REDUCE plus NIL) = (Id plus):*) /\
#    (!plus (a:*) as. REDUCE plus (CONS a as) = plus a (REDUCE plus as))");;
REDUCE =
|- (!plus. REDUCE plus[] = Id plus) /\
   (!plus a as. REDUCE plus(CONS a as) = plus a(REDUCE plus as))
\end{verbatim}
\end{session}
Three lemmas about \verb@REDUCE@ are stated in Appendix~\ref{PrincipalLemmas}.
\subsection{Subranges of the natural numbers}
The constant \verb@RANGE@, which is a curried function of two numbers, is
defined by primitive recursion so that,
\[ \verb@RANGE@\,m\,n = [m, m+1, \ldots, m+n-1] \]
This definition means that the subrange is the first $n$ numbers starting at
$m$. In particular, when $n=0$ the list is empty.
Appendix~\ref{PrincipalLemmas} states two simple properties of \verb@RANGE@.

\verb@RANGE@ generates a subrange given its least element and length; an
alternative method is to generate the subrange given its two endpoints, via a
constant \verb@INTERVAL@ which satisfies the equation:
\[ \verb@INTERVAL@\,m\,n = \left\{\begin{array}{ll}
[m, m+1, \ldots, n] & \mbox{if $m \leq n$} \\
{}[] & \mbox{otherwise}
\end{array}\right. \]
For comparison, the two methods are definable in \HOL{} as follows:
\begin{session}
\begin{verbatim}
#let RANGE =
#  new_recursive_definition false num_Axiom `RANGE`
#   "(RANGE m 0 = NIL) /\
#    (RANGE m (SUC n) = CONS m (RANGE (SUC m) n))";;
#
#let INTERVAL =
#  new_recursive_definition false num_Axiom `INTERVAL`
#   "(INTERVAL m 0 =
#       (m > 0) => NIL | [0]) /\
#    (INTERVAL m (SUC n) =
#       (m > (SUC n)) => NIL | APPEND (INTERVAL m n) ([SUC n]))";;
\end{verbatim}
\end{session}
\subsection{$\Sigma$-notation}
The standard $\Sigma$-notation is defined informally by the equation,
\[ \sum_{i=m}^n a_i = \left\{\begin{array}{ll}
a_m + a_{m+1} + \cdots + a_n & \mbox{if $m \leq n$} \\
0 & \mbox{otherwise}
\end{array}\right. \]
Therefore the standard $\Sigma$-notation is naturally definable via a subrange
generated by \verb@INTERVAL@.
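For readers who find executable definitions helpful, here is a Python transcription of the two recursions (an illustration only; it follows the \HOL{} definitions above step by step):

```python
def RANGE(m, n):
    # RANGE m n = [m, m+1, ..., m+n-1]; empty when n = 0
    return [] if n == 0 else [m] + RANGE(m + 1, n - 1)

def INTERVAL(m, n):
    # INTERVAL m n = [m, ..., n] if m <= n, else []
    # (mirrors the conditional in the HOL primitive recursion)
    if n == 0:
        return [] if m > 0 else [0]
    return [] if m > n else INTERVAL(m, n - 1) + [n]

assert RANGE(3, 4) == [3, 4, 5, 6]
assert INTERVAL(2, 5) == [2, 3, 4, 5]
assert INTERVAL(5, 2) == []
```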
An alternative notation, naturally definable via \verb@RANGE@, can be defined
informally as follows:
\[ \sum_{i=m}^{(n)} a_i = \left\{\begin{array}{ll}
a_m + a_{m+1} + \cdots + a_{m+n-1} & \mbox{if $n > 0$} \\
0 & \mbox{otherwise}
\end{array}\right. \]
The brackets around the $n$ above the $\Sigma$ distinguish this notation from
the standard one. This second notation is used in the \HOL{} proof of the
Binomial Theorem. The difference between the two is just that the standard
notation is definable via \verb@INTERVAL@, and the second via \verb@RANGE@. The
second notation was chosen entirely because the constant \verb@RANGE@ is easier
to manipulate in \HOL{} than \verb@INTERVAL@, because the latter has a more
complex definition. The disadvantage of this choice is that the proof of the
theorem uses a non-standard notation, but the difference is purely notational:
if desired, the Binomial Theorem stated in the standard notation could be
trivially deduced from its non-standard statement. The non-standard
$\Sigma$-notation is defined in \HOL{} as the following abbreviation:
\begin{session}
\begin{verbatim}
#let SIGMA =
#  new_definition (`SIGMA`,
#   "SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n)): *");;
\end{verbatim}
\end{session}
Note that an indexed term $a_i$ is represented in \HOL{} as a function \verb@f@
of type \verb@num -> *@. Appendix~\ref{PrincipalLemmas} states several
properties of \verb@SIGMA@; their proofs make use of properties of \verb@MAP@
from the theory \verb@list@, and also the lemmas about \verb@REDUCE@ and
\verb@RANGE@. The functions \verb@REDUCE@ and \verb@RANGE@ are needed here only
to define \verb@SIGMA@, and in fact could be omitted if \verb@SIGMA@ were to be
defined directly using primitive recursion. It is worth going to the trouble of
defining \verb@REDUCE@ and \verb@RANGE@, however, because they are likely to be
useful in other situations.
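The abbreviation can be mirrored in Python (an illustration only; one deviation from \HOL{} is that \verb@Id plus@ is derived from \verb@plus@ there, whereas the identity element is passed explicitly here):

```python
from functools import reduce

def RANGE(m, n):
    # RANGE m n = [m, m+1, ..., m+n-1]
    return [] if n == 0 else [m] + RANGE(m + 1, n - 1)

def REDUCE(plus, identity, xs):
    # Reduction in a monoid; `identity` plays the role of Id plus.
    return reduce(plus, xs, identity)

def SIGMA(plus, identity, m, n, f):
    # SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n))
    return REDUCE(plus, identity, [f(i) for i in RANGE(m, n)])

# sum_{i=2}^{(3)} i^2 = 4 + 9 + 16 = 29, in the additive monoid of integers
assert SIGMA(lambda a, b: a + b, 0, 2, 3, lambda i: i * i) == 29
```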
% ----------------------------------------------------------------------------
\section{The Binomial Theorem for a Commutative Ring}
\label{BinomialTheorem}
%Division of a medium sized proof into manageable lemmas.

This section outlines the proof.  For full details see the tactics in
\verb@mk_BINOMIAL.ml@.

The statement and proof of the Binomial Theorem use a new \HOL{} constant,
\verb@BINTERM@, to abbreviate indexed terms of the form
$\lambda k.\,{n \choose k} \cdot a^{n-k} \times b^k$.  \verb@BINTERM@ is
defined as a curried function as follows:
\begin{session}
\begin{verbatim}
#let BINTERM_DEF =
#    new_definition (`BINTERM_DEF`,
#    "BINTERM (plus: *->*->*) (times: *->*->*) a b n k =
#        POWER plus (CHOOSE n k)
#            (times (POWER times (n-k) a) (POWER times k b))");;
\end{verbatim}
\end{session}
This abbreviation lets the Binomial Theorem be stated and proven in \HOL{} as
follows:
\begin{session}
\begin{verbatim}
BINOMIAL = 
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
In outline, the proof proceeds as follows.  The following equation is proven
by induction on $n$:
\begin{eqnarray*}
(a+b)^n &=& \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
When $n=0$ both sides reduce to the multiplicative identity element, $1$, of
the ring.
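For the ring of integers the statement can be checked numerically.  The
Python sketch below is our own illustration, not \HOL{} code: it reads
\verb@POWER plus (CHOOSE n k)@ applied to $a^{n-k} \times b^k$ as
$n$-choose-$k$ repeated additions, i.e.\ an ordinary integer product with the
binomial coefficient, and reads the non-standard
$\sum_{k=0}^{(n+1)}$ as $n+1$ terms starting at $k=0$:

```python
from math import comb

def binterm(a, b, n, k):
    """Integer reading of BINTERM plus times a b n k:
    C(n,k) copies of a^(n-k) * b^k added together."""
    return comb(n, k) * a ** (n - k) * b ** k

def sigma(m, n, f):
    """Non-standard SIGMA notation: n terms starting at index m."""
    return sum(f(k) for k in range(m, m + n))

# Spot-check the theorem (a+b)^n = SIGMA(0, n+1) (BINTERM a b n)
# on a few integer cases, including negative and zero arguments.
for a, b, n in [(2, 3, 5), (-1, 4, 6), (7, -2, 0)]:
    assert (a + b) ** n == sigma(0, n + 1, lambda k: binterm(a, b, n, k))
print("binomial identity holds for the sampled integer cases")
```

Such a check is of course no substitute for the \HOL{} proof---it samples a
few integers, while the theorem holds for every commutative ring---but it is
a quick way to confirm that the \verb@SIGMA@/\verb@BINTERM@ rendering says
what one expects.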
For the inductive step, the equation is assumed for $n$, and the following
goal must be proven:
\begin{eqnarray*}
(a+b)^{n+1} &=& \sum_{k=0}^{(n+2)} {{n+1} \choose k} a^{n+1-k} b^k
\end{eqnarray*}
Three lemmas are used in the inductive step:
\[\begin{array}{rcll}
\displaystyle
\sum_{k=1}^{(n)} {{n+1} \choose k} a^{n+1-k} b^k
&=&
\displaystyle
a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
& \mbox{({\tt LEMMA\_1})} \\
\displaystyle
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
&=&
\displaystyle
a^{n+1} + a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
& \mbox{({\tt LEMMA\_2})} \\
\displaystyle
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
&=&
\displaystyle
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k + b^{n+1}
& \mbox{({\tt LEMMA\_3})}
\end{array}\]
\verb@LEMMA_1@ is the key to the inductive step.  The three stages of the
inductive step are indicated (a), (b) and (c) in \ML{} comments.

Step (a) expands the right-hand side of the goal with \verb@LEMMA_1@:
\begin{eqnarray*}
\mbox{rhs} &=&
a^{n+1} +
a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k +
b^{n+1}
\end{eqnarray*}
Step (b) expands the left-hand side with the induction hypothesis:
\begin{eqnarray*}
\mbox{lhs} &=& (a+b)\sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
Finally, in step (c) the expansions of the two sides are shown to be the
same, using distributivity of $\times$ over $+$, associativity of $+$, and
\verb@LEMMA_2@ and \verb@LEMMA_3@:
\begin{eqnarray*}
\mbox{lhs} &=&
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k +
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k \\
&=&
a^{n+1} +
a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k +
b^{n+1} \\
&=& \mbox{rhs}
\end{eqnarray*}
%
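At bottom, \verb@LEMMA_1@ merges its two right-hand sums using Pascal's rule,
${{n+1} \choose k} = {n \choose k} + {n \choose {k-1}}$, after $a$ and $b$
have been multiplied through.  The rule itself is easy to sanity-check in
Python (an illustration only; in the \HOL{} development the corresponding
fact about \verb@CHOOSE@ is proven, not assumed):

```python
from math import comb

# Pascal's rule: each interior entry of Pascal's Triangle is the sum of the
# entry immediately above it and the entry to its left in the previous row.
for n in range(0, 20):
    for k in range(1, n + 1):
        assert comb(n + 1, k) == comb(n, k) + comb(n, k - 1)
print("Pascal's rule verified for n < 20")
```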
% ----------------------------------------------------------------------------
\section{The Binomial Theorem for Integers}
\label{BinomialTheoremForIntegers}

The previous sections have dealt with the material in theory \verb@BINOMIAL@,
which proves the Binomial Theorem for an arbitrary ring.  A natural next step
is to take some specific commutative ring and prove a Binomial Theorem for
it, as an instance of the general case.  To see how to do so for Elsa
Gunter's theory of integers, take a look at the file
{\tt mk\_BINOMIAL\_integer.ml}, contained in directory \path{}.  When loaded
into \HOL{}, this \ML{} file proves that the integers are a ring---by quoting
laws contained in the theory \verb@integer@---and then derives a Binomial
Theorem for the integers---by Modus Ponens from the general theorem
\verb@BINOMIAL@.