% binomial.tex
% The Binomial Theorem proven in HOL

\setlength{\unitlength}{1mm}
% \vskip10mm
% \begin{center}\LARGE\bf
% The Binomial Theorem proven in HOL.
% \end{center}
% \vskip10mm

\def\obj#1{\mbox{\tt#1}}

% ---------------------------------------------------------------------
% Parameters customising the document; set by whoever \input's it
% ---------------------------------------------------------------------
% \self{} is one word denoting this document, such as "article" or "chapter"
% \path{} is the path denoting the case-study directory
%
% Typical examples:
% \def\self{article}
% \def\path{\verb%Training/studies/binomial%}
% ---------------------------------------------------------------------

\section{Pascal's Triangle and the Binomial Theorem}

Pascal's Triangle\footnote{
According to Knuth \cite{knuth73}, the triangle gets its name from
its appearance in Blaise Pascal's {\em Trait\'e du triangle arithm\'etique}
in 1653, although the binomial coefficients were known long before then.
The earliest surviving description is from the tenth century,
in Hal\={a}yudha's commentary on the Hindu classic, the Chandah-S\^{u}tra.
The Binomial Theorem itself was first reported in 1676 by Isaac Newton.
}
is the infinite triangle of numbers which starts like this:
\[\begin{array}{ccccccc}
1 \\
1 & 1 \\
1 & 2 & 1 \\
1 & 3 & 3 & 1 \\
1 & 4 & 6 & 4 & 1 \\
1 & 5 & 10 & 10 & 5 & 1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\end{array}\]
The construction is as follows. The numbers at the edges
are $1$s. Each number in the interior is the sum of the number immediately
above it and the number to its left in the previous row.

The numbers in Pascal's Triangle are called Binomial Coefficients.
The number in the $n$th row and $k$th column (where $0 \leq k \leq n$, and the
topmost row and the leftmost column are numbered $0$) is written
$n \choose k$ and pronounced ``$n$ choose $k$''. The coefficients are
pronounced this way because $n \choose k$ turns out to be the number of
ways to choose $k$ things out of $n$ things, but that is another story
(see for instance Knuth's book \cite{knuth73}).

A simple form of the Binomial Theorem \cite{maclane67} \cite{mostow63}
is the following equation, where $a$ and $b$ are integers and $n$ is any
natural number,
\[
(a + b)^n = \sum_{k=0}^n {n \choose k} a^k b^{n-k}
\]
The rest of this \self{} describes how the Binomial Theorem can be proven
in \HOL{}. In fact, a more general theorem is proven, where $a$ and $b$
are elements of an arbitrary commutative ring, but the form displayed here
can be derived from the general result.

The motivation for the proof of the Binomial Theorem in \HOL{} is
tutorial.
% ---to be a medium sized worked example whose subject matter is
% more widely known than any specific piece of hardware or software.
The proof illustrates specific ways in \HOL{} to represent mathematical
notations and manage theorems proven about general structures such as
monoids, groups and rings. The following files accompany this \self{} in
directory \path{}:
\begin{description}
\item[{\tt mk\_BINOMIAL.ml}]
 The \ML{} program containing the definitions, lemmas and theorems\\
 needed to prove the Binomial Theorem in \HOL{}.
\item
 [{\tt BINOMIAL.th}]
 The theory file which is generated by \verb@mk_BINOMIAL.ml@.
\item
 [{\tt BINOMIAL.ml}]
 An \ML{} program which loads the theory \verb@BINOMIAL@ and declares
 a few derived rules and one tactic used in the proof.
 The Binomial Theorem is bound to the \ML{} identifier \verb@BINOMIAL@.
\end{description}
% The aim of this \self{} is to introduce the small amount of algebra and
% mathematical notation needed to state and prove the Binomial Theorem, show
% how this is rendered in \HOL{}, and outline the structure of the proof
% contained in \verb@mk_BINOMIAL.ml@.
The \self{} is meant to be intelligible
without examination of any of the accompanying files.
To avoid clutter not every detail is spelt out.
Often definitions and theorems are indicated
in the form displayed by \HOL{}, rather than as \ML{} source text. Specific
details of the \ML{} definitions or tactics can be found in the file
\verb@mk_BINOMIAL.ml@.

The way to work with this case study is first to study this \self{} to see
the structure of the proof, and then to work interactively with \HOL{}.
Start \HOL{} and say:
\begin{session}
\begin{verbatim}
#loadt `BINOMIAL`;;
...
File BINOMIAL loaded
() : void

#BINOMIAL;;
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
(This is the first display of an interaction with \HOL{}. The difference
between lines input to and lines output from \HOL{} is that only the former
are preceded by the \HOL{} prompt ``\verb@#@''. A line of ellipsis
``\verb@...@'' stands for output from \HOL{} which has been omitted from the
display.)

The first input line in the display above loads in all the definitions,
theorems, derived rules and the tactic defined for the proof. The second
input line asks \HOL{} to print the theorem \verb@BINOMIAL@. The constants
used in \verb@BINOMIAL@ are explained later in the \self{}. The \HOL{}
session is now set up either to make use of \verb@BINOMIAL@ by specialising
it to a specific ring, or to study \verb@mk_BINOMIAL.ml@ by using the subgoal
package to work through proofs contained in it.
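
Although the constants used in \verb@BINOMIAL@ are only explained in later
sections, their intended readings for the ring of integers can be mirrored
by ordinary functions and the theorem's equation tested on small cases. The
sketch below is illustrative Python, not part of the case study; all the
function names in it are ours, chosen to echo the \HOL{} constants.

```python
# Executable reading of the constants in BINOMIAL, for the ring of
# integers (plus = +, times = *).  Illustrative only: HOL proves the
# equation for every commutative ring, here we merely test it.
from functools import reduce
from math import comb
from operator import add, mul

def power(op, e, n, a):
    # POWER op n a: the n-fold iteration a op (... op (a op e)).
    return reduce(lambda r, _: op(a, r), range(n), e)

def range_list(m, n):
    # RANGE m n = [m, m+1, ..., m+n-1]
    return list(range(m, m + n))

def sigma(op, e, m, n, f):
    # SIGMA(op,m,n) f: reduce op over f mapped along RANGE m n.
    return reduce(op, map(f, range_list(m, n)), e)

def binterm(a, b, n, k):
    # BINTERM plus times a b n k = (CHOOSE n k) . (a^(n-k) x b^k)
    return power(add, 0, comb(n, k),
                 mul(power(mul, 1, n - k, a), power(mul, 1, k, b)))

# POWER times n (plus a b) = SIGMA(plus,0,SUC n)(BINTERM plus times a b n)
for a in range(-3, 4):
    for b in range(-3, 4):
        for n in range(6):
            assert power(mul, 1, n, a + b) == \
                sigma(add, 0, 0, n + 1, lambda k: binterm(a, b, n, k))
```

The identity elements ($0$ for addition, $1$ for multiplication) are passed
explicitly here; in \HOL{} they are recovered from the operator itself, as
described in the section on monoids.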

The remainder of the \self{} follows the plan:
\begin{description}
\item[Section~\ref{BinomialCoefficients}]
 shows how to define the number $n \choose k$ in \HOL{} as the term
 $\obj{CHOOSE}\,n\,k$, where $\obj{CHOOSE}$ is a function defined by
 primitive recursion.
\item[Section~\ref{MonoidsGroupsRings}]
 shows how to define elementary algebraic concepts like associativity,
 commutativity, monoid, group and ring in \HOL{}, and how to apply
 theorems proved in the general case to particular situations.
\item[Section~\ref{PowersReductionsRangesSums}]
 shows how to define the iterated operations of raising a term to a
 power and reducing (summing) a list of terms. These operations are
 part of the everyday mathematical language used to state the Binomial
 Theorem, but they are not built into \HOL{} and so need to be defined.
 This section and the previous two together describe enough definitions
 in \HOL{} to state the Binomial Theorem.
\item[Section~\ref{BinomialTheorem}]
 shows how the Binomial Theorem is proven in \HOL{}.
 The proof is by induction, and depends on three main lemmas.
\item[Section~\ref{BinomialTheoremForIntegers}]
 outlines how to apply the general theorem to a particular case:
 the ring of integers.
\item[Appendix~\ref{PrincipalLemmas}]
 states the principal lemmas used to prove the theorem.
 Some of these are {\em ad hoc} but others could be reused elsewhere.
\end{description}

% ----------------------------------------------------------------------------

\section{The Binomial Coefficients}
\label{BinomialCoefficients}

The definition of the binomial coefficients as the numbers in Pascal's Triangle
is formalised in three equations:
\begin{eqnarray*}
n \choose 0 &=& 1 \\
0 \choose {k+1} &=& 0 \\
{n+1} \choose {k+1} &=& {{n} \choose {k+1}} + {{n} \choose {k}}
\end{eqnarray*}
(These actually define $n \choose k$ to be $0$ if $k>n$, but this is
consistent with Pascal's Triangle: think of the spaces to the right of
the triangle as filled with $0$'s.) The desire is to define a constant
\verb@CHOOSE@ in \HOL{} such that for all numbers $n$, $k$ in the type
{\tt num}, $\verb@CHOOSE@\,n\,k$ denotes the number $n \choose k$.
Unfortunately these three equations cannot be used directly as the definition
of \verb@CHOOSE@ in \HOL{} because they are not in the form of a base case
(when $n=0$) together with an inductive case (when $n>0$).

To work towards the definition, consider a term that specifies a base case
and an inductive case intended to be equivalent to the three equations above:
\begin{session}
\begin{verbatim}
let base_and_inductive: term =
 "(CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
  (CHOOSE (SUC n) k =
    ((0 = k) => 1 | (CHOOSE n (k-1)) + (CHOOSE n k)))";;
\end{verbatim}
\end{session}
The theory \verb@prim_rec@ contains a primitive recursion theorem which
says that any base case and inductive case identify a unique function,
\verb@fn@:
\begin{session}
\begin{verbatim}
#num_Axiom;;
|- !e f. ?! fn.
    (fn 0 = e) /\
    (!n.
      fn (SUC n) = f (fn n) n)
\end{verbatim}
\end{session}
Given the theorem \verb@num_Axiom@, the built-in function
\verb@prove_rec_fn_exists@ can prove that there is a function that satisfies
the property specified by the term \verb@base_and_inductive@:
\begin{session}
\begin{verbatim}
#let RAW_CHOOSE_DEF = prove_rec_fn_exists num_Axiom base_and_inductive;;
RAW_CHOOSE_DEF =
|- ?CHOOSE.
    (!k. CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
    (!n k.
      CHOOSE(SUC n)k = ((0 = k) => 1 | (CHOOSE n(k - 1)) + (CHOOSE n k)))
\end{verbatim}
\end{session}
These equations would not do as the only definition of \verb@CHOOSE@ due
to the presence of the conditional operator, which makes rewriting
difficult. But having obtained the theorem \verb@RAW_CHOOSE_DEF@ it is
easy to prove there is a function which satisfies the original three
equations:
\begin{session}
\begin{verbatim}
CHOOSE_DEF_LEMMA =
|- ?CHOOSE.
    (!n. CHOOSE n 0 = 1) /\
    (!k. CHOOSE 0(SUC k) = 0) /\
    (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Now the constant \verb@CHOOSE@ can be specified as originally desired:
\begin{session}
\begin{verbatim}
#let CHOOSE_DEF =
# new_specification `CHOOSE_DEF` [`constant`,`CHOOSE`] CHOOSE_DEF_LEMMA;;
CHOOSE_DEF =
|- (!n. CHOOSE n 0 = 1) /\
   (!k. CHOOSE 0(SUC k) = 0) /\
   (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Two elementary properties of the binomial coefficients can now be proven
by induction:
\begin{session}
\begin{verbatim}
CHOOSE_LESS = |- !n k. n < k ==> (CHOOSE n k = 0)
CHOOSE_SAME = |- !n.
  CHOOSE n n = 1
\end{verbatim}
\end{session}

% ----------------------------------------------------------------------------

\section{Monoids, Groups and Commutative Rings}
\label{MonoidsGroupsRings}
%A simple way to define algebraic structures in \HOL{}.

\subsection{Associativity and Commutativity}

An {\em algebraic structure} is a set equipped with some operators that
obey some laws. An elementary example of such a structure is the pair
$(A,+)$, where $A$ is a set and $+$ is a binary operator on $A$ that obeys
the laws of associativity,
\[
a + (b + c) = (a + b) + c
\]
and commutativity,
\[
a + b = b + a
\]

How might these laws be represented in \HOL{}? The simplest scheme is
to deal with structures of the form $(\sigma, +)$, where $\sigma$ is a
type of the \HOL{} logic, and $+$ is a \HOL{} term of type
$\sigma\to\sigma\to\sigma$. Notice that the set in the structure is
represented as a \HOL{} type: see the theory \verb@group@ presented in
Elsa Gunter's case study of Modular Arithmetic for a more sophisticated
approach where the set in a structure can be a subset of a \HOL{} type.

The two laws can be defined as two constants in \HOL{}:
\begin{session}
\begin{verbatim}
#let ASSOCIATIVE =
# new_definition (`ASSOCIATIVE`,
#  "!plus: *->*->*. ASSOCIATIVE plus =
#    (!a b c. plus a (plus b c) = plus (plus a b) c)");;
...
#let COMMUTATIVE =
# new_definition (`COMMUTATIVE`,
#  "!plus: *->*->*. COMMUTATIVE plus =
#    (!a b. plus a b = plus b a)");;
\end{verbatim}
\end{session}
By instantiating the type variable \verb@*@ and specialising the variable
\verb@plus@ these constants are applicable to any particular structure
of the form $(\sigma, +: \sigma\to\sigma\to\sigma)$.
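
The idea of a law as a predicate on an operator can be imitated outside the
logic. The following Python sketch (illustrative only; the names are ours)
treats the two laws as higher-order predicates, checked by exhaustive
testing over a finite sample of values rather than proved for all values as
in \HOL{}:

```python
# ASSOCIATIVE and COMMUTATIVE mirrored as higher-order predicates.
# HOL quantifies over every element of a type; here we can only test
# the laws over a finite sample.
from itertools import product
from operator import add, sub

def associative(plus, sample):
    return all(plus(a, plus(b, c)) == plus(plus(a, b), c)
               for a, b, c in product(sample, repeat=3))

def commutative(plus, sample):
    return all(plus(a, b) == plus(b, a)
               for a, b in product(sample, repeat=2))

sample = range(-5, 6)
# Integer addition satisfies both laws; subtraction satisfies neither.
assert associative(add, sample) and commutative(add, sample)
assert not associative(sub, sample) and not commutative(sub, sample)

# The permutation property proven as PERM below also holds, as expected:
# (a + b) + c = b + (a + c).
assert all(add(add(a, b), c) == add(b, add(a, c))
           for a, b, c in product(sample, repeat=3))
```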

A simple example to illustrate this methodology is a permutation theorem
about an arbitrary associative commutative operator (which is called
\verb@PERM@ in file \verb@mk_BINOMIAL.ml@). It is worthwhile going
through the proof in detail because it illustrates how a difficulty in
making use of assumptions like \verb@ASSOCIATIVE plus@ and
\verb@COMMUTATIVE plus@ can be solved.
\begin{session}
\begin{verbatim}
#g "!plus: *->*->*.
#    ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
#    !a b c. plus (plus a b) c = plus b (plus a c)";;
\end{verbatim}
\end{session}
The first step is to strip off the quantifiers, and break up the implication:
\begin{session}
\begin{verbatim}
#e (REPEAT STRIP_TAC);;
OK..
"plus(plus a b)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
The next step is to rewrite \verb@plus a b@ to \verb@plus b a@ on the
left-hand side, from the assumption \verb@COMMUTATIVE plus@. If the theorem
were about a specific operator, such as \verb@$+: num->num->num@ from the
theory of numbers in \HOL{}, the rewriting could be done with
\verb@REWRITE_TAC@ and the specific commutativity theorem, such as
\verb@ADD_SYM@:
\[
\mbox{\tt |- !m n. m + n = n + m}.
\]
But in the general case all that is available is the definition of
\verb@COMMUTATIVE plus@, which cannot be used directly with
\verb@REWRITE_TAC@ to swap the terms \verb@a@ and \verb@b@.
One approach is first to prove the
following equation from the definition \verb@COMMUTATIVE plus@,
\[
\mbox{\tt COMMUTATIVE plus |- !a b. plus a b = plus b a},
\]
and then to use the equation with \verb@REWRITE_TAC@ in the same way as one
might use \verb@ADD_SYM@. This is a valid tactic because the
assumption \verb@COMMUTATIVE plus@ is already present in the assumption
list.
A little forward proof achieves the first step, the derivation of the
equation:
\begin{session}
\begin{verbatim}
#top_print print_all_thm;;
- : (thm -> void)

#let forwards = fst(EQ_IMP_RULE (SPEC "plus: *->*->*" COMMUTATIVE));;
forwards = |- COMMUTATIVE plus ==> (!a b. plus a b = plus b a)

#let eqn = UNDISCH forwards;;
eqn = COMMUTATIVE plus |- !a b. plus a b = plus b a
\end{verbatim}
\end{session}
This forward proof is a special case of a derived
rule of the logic,
\[
\Gamma\turn \uquant{x} t_1 = t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
which can be coded as the \ML{} function \verb@SPEC_EQ@:
\begin{session}
\begin{verbatim}
#let SPEC_EQ v th = UNDISCH(fst(EQ_IMP_RULE (SPEC v th)));;
SPEC_EQ = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
Now the equation \verb@eqn@ can be used to swap the terms
\verb@a@ and \verb@b@:
\begin{session}
\begin{verbatim}
#e(SUBST1_TAC (SPECL ["a:*";"b:*"] eqn));;
OK..
"plus(plus b a)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
Now that the terms appear on both sides in the same order but with different
grouping, associativity can be used to make the grouping on both sides
the same. Again the definition of \verb@ASSOCIATIVE@ cannot be used
directly, but the derived rule \verb@SPEC_EQ@ obtains from the definition
an equation analogous to \verb@eqn@ which can be used:
\begin{session}
\begin{verbatim}
#e(REWRITE_TAC [SPEC_EQ "plus: *->*->*" ASSOCIATIVE]);;
OK..
goal proved
...
|- !plus.
    ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
    (!a b c. plus(plus a b)c = plus b(plus a c))
\end{verbatim}
\end{session}

The point of this example was to illustrate how to prove theorems about
an arbitrary operator whose behaviour is specified by general predicates
like \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@.
The main difficulty is
in deriving equations for use with rewriting or substitution. The solution
used in the example, and extensively in the proof of the Binomial Theorem,
is to derive equations using the derived rule \verb@SPEC_EQ@. Another derived
rule, \verb@SPEC_IMP@, is used extensively and is a variant of \verb@SPEC_EQ@:
\[
\Gamma\turn \uquant{x} t_1 \Rightarrow t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
\begin{session}
\begin{verbatim}
#let SPEC_IMP v th = UNDISCH(SPEC v th);;
SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
The following two rules are versions of \verb@SPEC_EQ@ and \verb@SPEC_IMP@
generalised to take lists of variables rather than just one:
\begin{session}
\begin{verbatim}
SPECL_EQ = - : (term list -> thm -> thm)
SPECL_IMP = - : (term list -> thm -> thm)
\end{verbatim}
\end{session}
The final new derived rule used in the proof of the Binomial Theorem
is \verb@STRIP_SPEC_IMP@, a variant of \verb@SPEC_IMP@, which
splits up a conjunction in the antecedent into separate assumptions:
\[
\Gamma\turn \uquant{x} (t_1 \wedge \cdots \wedge t_n) \Rightarrow t_{n+1}
\over
\Gamma, t_1[t'/x], \ldots, t_n[t'/x] \turn t_{n+1}[t'/x]
\]
\begin{session}
\begin{verbatim}
#let STRIP_SPEC_IMP v th =
# UNDISCH_ALL(SPEC v (CONV_RULE (ONCE_DEPTH_CONV ANTE_CONJ_CONV) th));;
STRIP_SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}

There are other methods for proving theorems which are conditional on
predicates such as \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. One is to
rewrite with the definition before assuming the predicate (see, for instance,
the proof of \verb@RIGHT_CANCELLATION@ in \verb@mk_BINOMIAL.ml@). This
method is fine for small proofs, but in larger proofs it presents the problem
of how to specify a particular assumption on the assumption list.
Another method
is to put the predicates on the assumption list, and use resolution
to extract the body of the definition (see the proof of
\verb@UNIQUE_LEFT_INV@). When the assumption list is not short this is
rather a blunt instrument, in the sense that resolution can find many more
theorems than the one desired, and is also rather computationally costly
when compared to carefully constructing a theorem with \verb@SPEC_EQ@ and
its variants.

\subsection{Monoids}

A {\em monoid} is a structure $(M,+)$, where $M$ is a set and $+$ is an
associative binary operator on $M$, such that there is an {\em identity
element}, $u \in M$, which is a left and right identity of $+$, that is,
it satisfies $u+a = a+u = a$ for all $a \in M$. When the operator is written
as $+$, the structure $(M,+)$ is called an {\em additive monoid} and the
identity element is written as $0$; otherwise if the operator is written
as $\times$, the structure $(M,\times)$ is called a {\em multiplicative
monoid} and the identity element is written as $1$.
\begin{session}
\begin{verbatim}
#let RAW_MONOID =
# new_definition (`RAW_MONOID`,
#  "!plus: *->*->*. MONOID plus =
#    ASSOCIATIVE plus /\
#    (?u. (!a. plus a u = a) /\ (!a. plus u a = a))");;
\end{verbatim}
\end{session}
This definition is inconvenient to manipulate as it stands, because of
the presence of the existential quantifier. However it is possible via
Hilbert's $\epsilon$-operator to specify a function \verb@Id@ such that
\verb@Id plus@ stands for an identity element of \verb@plus@, if such
exists. In fact it is easy to prove that the identity element in a monoid
is unique. The first step is to prove that there exists a function \verb@Id@
with the property just described:
\begin{session}
\begin{verbatim}
ID_LEMMA =
|- ?Id.
    !plus.
      (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
      (!a. plus(Id plus)a = a) /\ (!a.
        plus a(Id plus) = a)
\end{verbatim}
\end{session}
The proof of \verb@ID_LEMMA@ uses Hilbert's $\epsilon$-operator to
construct a witness for the outer existential quantifier:
\begin{verbatim}
   \plus.@u:*.(!a:*. (plus u a = a)) /\ (!a. (plus a u = a))
\end{verbatim}
Given \verb@ID_LEMMA@, the constant \verb@Id@ is specified as follows:
\begin{session}
\begin{verbatim}
#let ID = new_specification `Id` [`constant`, `Id`] ID_LEMMA;;
ID =
|- !plus.
    (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
    (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The condition that \verb@Id plus@ is a left and right identity is expressed
by the predicates \verb@LEFT_ID@ and \verb@RIGHT_ID@, respectively:
\begin{session}
\begin{verbatim}
LEFT_ID = |- !plus. LEFT_ID plus = (!a. plus(Id plus)a = a)
RIGHT_ID = |- !plus. RIGHT_ID plus = (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
Given these abbreviations it is easy to prove in \HOL{} that
if a left and right identity element exists, then it is unique:
\begin{session}
\begin{verbatim}
UNIQUE_LEFT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!l. (!a. plus l a = a) ==> (Id plus = l))

UNIQUE_RIGHT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!r. (!a. plus a r = a) ==> (Id plus = r))
\end{verbatim}
\end{session}
The use of these last two theorems is to help identify the element
\verb@Id plus@ for some specific \verb@plus@, such as \verb@$+@ from the theory
of \verb@arithmetic@. The abbreviations \verb@LEFT_ID@ and \verb@RIGHT_ID@
admit a new characterisation of the predicate \verb@MONOID@, which is easier
to manipulate in \HOL{} than \verb@RAW_MONOID@:
\begin{session}
\begin{verbatim}
MONOID =
|- !plus.
     MONOID plus = LEFT_ID plus /\ RIGHT_ID plus /\ ASSOCIATIVE plus
\end{verbatim}
\end{session}

\subsection{Groups}

A {\em group} is a monoid in which every element is invertible.
\begin{session}
\begin{verbatim}
RAW_GROUP =
|- !plus.
    Group plus =
    MONOID plus /\
    (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus))
\end{verbatim}
\end{session}
An alternative characterisation of \verb@Group@\footnote{The mixed
  case identifier \texttt{Group} is used rather than \texttt{GROUP},
  because the latter is already defined in Elsa Gunter's theory
  \texttt{group}, a parent of her theory \texttt{integer}, which needs
  to coexist with the present theory so that integers can be used as
  an example ring (Section~\ref{BinomialTheoremForIntegers}).} which
makes the existential quantification implicit can be had by specifying
a function \verb@Inv@ so that, if \verb@Group plus@ holds, then the
element \verb@Inv plus a@ is an inverse of \verb@a@. The function
\verb@Inv@ is specified in a way analogous to the specification of
\verb@Id@. First a lemma is proven which says there does exist a
function \verb@Inv@ with the property desired:
\begin{session}
\begin{verbatim}
INV_LEMMA =
|- ?Inv.
    !plus.
      (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus)) ==>
      (!a.
        (plus(Inv plus a)a = Id plus) /\ (plus a(Inv plus a) = Id plus))
\end{verbatim}
\end{session}
Given this lemma, which is proven by an easy tactic almost the same
as the one for \verb@ID_LEMMA@, the constant \verb@Inv@ can be specified:
\begin{session}
\begin{verbatim}
#let INV =
# new_specification `Inv` [`constant`, `Inv`] INV_LEMMA;;
\end{verbatim}
\end{session}
The predicates which say that \verb@Inv@ produces left and right inverses
are defined as new constants:
\begin{session}
\begin{verbatim}
LEFT_INV =
|- !plus. LEFT_INV plus = (!a.
      plus(Inv plus a)a = Id plus)
RIGHT_INV =
|- !plus. RIGHT_INV plus = (!a. plus a(Inv plus a) = Id plus)
\end{verbatim}
\end{session}
Finally, the alternative characterisation of \verb@Group@ can be proven:
\begin{session}
\begin{verbatim}
GROUP = |- Group plus = MONOID plus /\ LEFT_INV plus /\ RIGHT_INV plus
\end{verbatim}
\end{session}
The theory of groups has been developed only as far as needed to prove
the Binomial Theorem; Appendix~\ref{PrincipalLemmas} shows the one lemma,
\verb@RIGHT_CANCELLATION@, specifically about groups.

\subsection{Rings}

A {\em ring} is a structure $(R,+,\times)$ such that the structure $(R,+)$
is a group, the structure $(R,\times)$ is a monoid, {\em addition}, $+$, is
commutative, and {\em multiplication}, $\times$, distributes (on both sides)
over addition. A {\em commutative ring} is a ring in which multiplication
is commutative. The word ring is used in the remainder of this \self{}
and in the \HOL{} code to mean a commutative ring:
\begin{session}
\begin{verbatim}
RING =
|- !plus times.
    RING(plus,times) =
    Group plus /\
    COMMUTATIVE plus /\
    MONOID times /\
    COMMUTATIVE times /\
    LEFT_DISTRIB(plus,times) /\
    RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
\verb@LEFT_DISTRIB@ and \verb@RIGHT_DISTRIB@ are new constants
defined by the theorems:
\begin{session}
\begin{verbatim}
LEFT_DISTRIB =
|- !plus times.
    LEFT_DISTRIB(plus,times) =
    (!a b c. times a(plus b c) = plus(times a b)(times a c))

RIGHT_DISTRIB =
|- !plus times.
    RIGHT_DISTRIB(plus,times) =
    (!a b c.
      times(plus a b)c = plus(times a c)(times b c))
\end{verbatim}
\end{session}
Notice that the definition of \verb@RING@ abbreviates a conjunction of
predicates, some of which are atomic, in the sense that they state some basic
property of \verb@plus@ or \verb@times@ or both, and some of which are
compound, in the sense that they abbreviate a further conjunction of
properties. The definition is a tree of predicates, with atomic ones like
\verb@LEFT_DISTRIB@ and \verb@ASSOCIATIVE@ at the leaves, and compound
ones like \verb@Group@ and \verb@MONOID@ at the branches. Many theorems
conditional on \verb@RING(plus,times)@ need to make use of both atomic
and compound predicates contained in the tree. To make access to all these
predicates easy, the following lemma is proven, which makes explicit all
the predicates in the tree:
\begin{session}
\begin{verbatim}
RING_LEMMA =
|- RING(plus,times) ==>
   RING(plus,times) /\
   Group plus /\
   MONOID plus /\
   LEFT_ID plus /\
   RIGHT_ID plus /\
   ASSOCIATIVE plus /\
   LEFT_INV plus /\
   RIGHT_INV plus /\
   COMMUTATIVE plus /\
   MONOID times /\
   LEFT_ID times /\
   RIGHT_ID times /\
   ASSOCIATIVE times /\
   COMMUTATIVE times /\
   LEFT_DISTRIB(plus,times) /\
   RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
Goals of the form,
\[
\mbox{\tt !plus times. RING(plus,times) ==> ...}
\]
are consistently tackled with the following tactic, which uses \verb@RING_LEMMA@
to add all the atomic and compound predicates in the tree to the assumption
list:
\begin{session}
\begin{verbatim}
#let RING_TAC =
# GEN_TAC THEN GEN_TAC THEN
# DISCH_THEN (\th. STRIP_ASSUME_TAC (MP RING_LEMMA th));;
\end{verbatim}
\end{session}
The one lemma about rings needed here, \verb@RING_0@, is stated in
Appendix~\ref{PrincipalLemmas}.
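
The tree of predicates abbreviated by \verb@RING@ can also be pictured
operationally. The following Python sketch (illustrative only; the names
are ours) mirrors each node of the tree as a check over a finite sample of
integers, so that the sample ring $(\mathbb{Z},+,\times)$ passes every
predicate:

```python
# The predicate tree of RING, mirrored as checks over a finite sample
# of integers.  HOL proves the laws for all elements; we merely test.
from itertools import product
from operator import add, mul

SAMPLE = range(-4, 5)

def left_id(op, e):
    return all(op(e, a) == a for a in SAMPLE)

def right_id(op, e):
    return all(op(a, e) == a for a in SAMPLE)

def associative(op):
    return all(op(a, op(b, c)) == op(op(a, b), c)
               for a, b, c in product(SAMPLE, repeat=3))

def commutative(op):
    return all(op(a, b) == op(b, a) for a, b in product(SAMPLE, repeat=2))

def monoid(op, e):                      # compound node
    return left_id(op, e) and right_id(op, e) and associative(op)

def group(op, e, inv):                  # compound node: monoid + inverses
    return monoid(op, e) and all(op(inv(a), a) == e == op(a, inv(a))
                                 for a in SAMPLE)

def distrib(plus, times):               # LEFT_DISTRIB and RIGHT_DISTRIB
    return all(times(a, plus(b, c)) == plus(times(a, b), times(a, c)) and
               times(plus(a, b), c) == plus(times(a, c), times(b, c))
               for a, b, c in product(SAMPLE, repeat=3))

def ring(plus, plus_e, neg, times, times_e):   # the root of the tree
    return (group(plus, plus_e, neg) and commutative(plus) and
            monoid(times, times_e) and commutative(times) and
            distrib(plus, times))

# (Z, +, x) passes every predicate in the tree.
assert ring(add, 0, lambda a: -a, mul, 1)
```

Note that, as in the \HOL{} definitions, the identity and inverse are data
attached to the operator (here passed explicitly, in \HOL{} recovered via
\verb@Id@ and \verb@Inv@), and the compound predicates are just
conjunctions of the atomic ones.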

% ----------------------------------------------------------------------------

\section{Powers, Reductions, Ranges and Sums}
\label{PowersReductionsRangesSums}
%Translation of mathematical definitions such as sums of series into HOL

The aim of this section is to explain how the mathematical notation
$\sum_{k=0}^n {n \choose k} a^k b^{n-k}$ can be rendered in \HOL{}.
If this is written out more explicitly than necessary for human
consumption it looks like this:
\[
\sum_{k=0}^n {n \choose k} \cdot (a^k \times b^{n-k})
\]
where $\times$ is multiplication in the ring, and where exponentiation
and $\cdot$ stand for the scalar powers of multiplication and addition,
respectively. Since binomial coefficients were treated in
Section~\ref{BinomialCoefficients}, what remain to be defined in this section
are scalar powers and the $\Sigma$-notation. The latter is tackled in
three steps: reduction of a list of monoid elements; generation of lists
comprising subranges of natural numbers, $[n,n+1,\ldots,n+k-1]$; and finally
the reduction of a list of ring elements indexed by a subrange of the natural
numbers.

\subsection{Scalar powers in a monoid}

If $(M,+)$ and $(M,\times)$ are additive and multiplicative
monoids respectively, the scalar powers of addition and multiplication are
defined informally by the following equations, where $n>0$:
\begin{eqnarray*}
n \cdot a &=& \overbrace{a + \cdots + a}^{n} \\
a^n &=& \overbrace{a \times \cdots \times a}^{n}
\end{eqnarray*}
When $n=0$, $n \cdot a = 0$ and $a^n = 1$.

Scalar power is defined in \HOL{} via the constant \verb@POWER@, which
is defined by primitive recursion:
\begin{session}
\begin{verbatim}
POWER =
|- (!plus a. POWER plus 0 a = Id plus) /\
   (!plus n a.
      POWER plus(SUC n)a = plus a(POWER plus n a))
\end{verbatim}
\end{session}
Three lemmas about \verb@POWER@ are stated in Appendix~\ref{PrincipalLemmas}.

\subsection{Reduction in a monoid}

This section makes use of the standard \HOL{} theory of lists, in particular
the constants \verb@NIL@, \verb@CONS@, \verb@APPEND@ and \verb@MAP@,
definition by primitive recursion and proof by induction. The theory of
lists is described in \DESCRIPTION.

If $(M,+)$ is a monoid, then the reduction of a list $[a_1,\ldots,a_n]$,
where each $a_i$ is an element of $M$, is the sum,
\[
a_1 + \cdots + a_n,
\]
or the monoid's identity element if $n=0$.

Reduction is represented in \HOL{} by the constant \verb@REDUCE@,
with type,
\begin{verbatim}
REDUCE : (*->*->*) -> (*)list -> *
\end{verbatim}
which is defined by primitive recursion on its second (list) argument:
\begin{session}
\begin{verbatim}
#let REDUCE =
# new_list_rec_definition (`REDUCE`,
#  "(!plus. (REDUCE plus NIL) = (Id plus):*) /\
#   (!plus (a:*) as. REDUCE plus (CONS a as) = plus a (REDUCE plus as))");;
REDUCE =
|- (!plus. REDUCE plus[] = Id plus) /\
   (!plus a as. REDUCE plus(CONS a as) = plus a(REDUCE plus as))
\end{verbatim}
\end{session}
Three lemmas about \verb@REDUCE@ are stated in Appendix~\ref{PrincipalLemmas}.

\subsection{Subranges of the natural numbers}

The constant \verb@RANGE@, which is a curried function of two numbers,
is defined by primitive recursion so that,
\[
\verb@RANGE@\,m\,n = [m, m+1, \ldots, m+n-1]
\]
This definition means that the subrange is the first $n$ numbers starting
at $m$. In particular, when $n=0$ the list is empty.
Appendix~\ref{PrincipalLemmas} states two simple properties of \verb@RANGE@.

\verb@RANGE@ generates a subrange given its least element and length;
an alternative method is to generate the subrange given its two endpoints,
via a constant \verb@INTERVAL@ which satisfies the equation:
\[
\verb@INTERVAL@\,m\,n =
 \left\{\begin{array}{ll}
  [m, m+1, \ldots, n] & \mbox{if $m \leq n$} \\
  {}[] & \mbox{otherwise}
 \end{array}\right.
\]

For comparison, the two methods are definable in \HOL{} as follows:
\begin{session}
\begin{verbatim}
#let RANGE =
# new_recursive_definition false num_Axiom `RANGE`
#  "(RANGE m 0 = NIL) /\
#   (RANGE m (SUC n) = CONS m (RANGE (SUC m) n))";;
#
#let INTERVAL =
# new_recursive_definition false num_Axiom `INTERVAL`
#  "(INTERVAL m 0 =
#     (m > 0) => NIL | [0]) /\
#   (INTERVAL m (SUC n) =
#     (m > (SUC n)) => NIL | APPEND (INTERVAL m n) ([SUC n]))";;
\end{verbatim}
\end{session}

\subsection{$\Sigma$-notation}

The standard $\Sigma$-notation is defined informally by the equation,
\[
\sum_{i=m}^n a_i =
 \left\{\begin{array}{ll}
  a_m + a_{m+1} + \cdots + a_n & \mbox{if $m \leq n$} \\
  0 & \mbox{otherwise}
 \end{array}\right.
\]
Therefore the standard $\Sigma$-notation is naturally definable via
a subrange generated by \verb@INTERVAL@.

An alternative notation, naturally definable via \verb@RANGE@, can be
defined informally as follows:
\[
\sum_{i=m}^{(n)} a_i =
 \left\{\begin{array}{ll}
  a_m + a_{m+1} + \cdots + a_{m+n-1} & \mbox{if $n > 0$} \\
  0 & \mbox{otherwise}
 \end{array}\right.
\]
The brackets around the $n$ above the $\Sigma$ distinguish this
notation from the standard one.

This second notation is used in the \HOL{} proof of the Binomial Theorem.
The difference between the two is just that the standard notation is
definable via \verb@INTERVAL@, and the second via \verb@RANGE@.
The second
notation was chosen simply because the constant \verb@RANGE@ is easier
to manipulate in \HOL{} than \verb@INTERVAL@, since the latter has a more
complex definition. The disadvantage of this choice is that the proof of
the theorem uses a non-standard notation, but the difference is one of
notation only. If desired, the Binomial Theorem stated in the standard
notation could be trivially deduced from its non-standard statement.

The non-standard $\Sigma$-notation is defined in \HOL{} as the following
abbreviation:
\begin{session}
\begin{verbatim}
#let SIGMA =
#  new_definition (`SIGMA`,
#   "SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n)): *");;
\end{verbatim}
\end{session}
Note that an indexed term $a_i$ is represented in \HOL{} as a function
\verb@f@ of type \verb@num -> *@.

Appendix~\ref{PrincipalLemmas} states several properties of \verb@SIGMA@;
their proofs make use of properties of \verb@MAP@ from the theory
\verb@list@, and also the lemmas about \verb@REDUCE@ and \verb@RANGE@.

The functions \verb@REDUCE@ and \verb@RANGE@ are needed here only to define
\verb@SIGMA@, and in fact could be omitted if \verb@SIGMA@ were to be defined
directly using primitive recursion. It is worth going to the trouble
of defining \verb@REDUCE@ and \verb@RANGE@, however, because they are likely
to be useful in other situations.

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for a Commutative Ring}
\label{BinomialTheorem}
%Division of a medium sized proof into manageable lemmas.

This section outlines the proof. For full details see the tactics in
\verb@mk_BINOMIAL.ml@.

The statement and proof of the Binomial Theorem use a new \HOL{} constant,
\verb@BINTERM@, to abbreviate indexed terms of the form $\lambda k.\,{n
\choose k} \cdot a^{n-k} \times b^k$.
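Over the integers, taking the ring operations to be ordinary addition and multiplication, this indexed term can be computed directly. The Python sketch below is illustrative only (\verb@binterm@ is a hypothetical name, not the \HOL{} constant):

```python
from math import comb

def binterm(a, b, n, k):
    """The k-th binomial term C(n, k) * a^(n-k) * b^k, over the integers."""
    return comb(n, k) * a ** (n - k) * b ** k
```

For instance, \verb@binterm(2, 3, 4, 1)@ evaluates to $4 \cdot 2^3 \cdot 3 = 96$, and summing \verb@binterm(a, b, n, k)@ over $k = 0, \ldots, n$ gives $(a+b)^n$.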
\verb@BINTERM@ is defined as a
curried function as follows:
\begin{session}
\begin{verbatim}
#let BINTERM_DEF =
#  new_definition (`BINTERM_DEF`,
#   "BINTERM (plus: *->*->*) (times: *->*->*) a b n k =
#      POWER plus (CHOOSE n k)
#        (times (POWER times (n-k) a) (POWER times k b))");;
\end{verbatim}
\end{session}
This abbreviation lets the Binomial Theorem be stated and proven in \HOL{} as
follows:
\begin{session}
\begin{verbatim}
BINOMIAL =
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
In outline, the proof proceeds as follows. The following equation is
proven by induction on $n$:
\begin{eqnarray*}
(a+b)^n &=& \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
When $n=0$ both sides reduce to the multiplicative identity element, $1$,
of the ring. For the inductive step, the equation is assumed for $n$, and
the following goal must be proven:
\begin{eqnarray*}
(a+b)^{n+1} &=& \sum_{k=0}^{(n+2)} {{n+1} \choose k} a^{n+1-k} b^k
\end{eqnarray*}
Three lemmas are used in the inductive step:
\[\begin{array}{rcll}
\displaystyle
\sum_{k=1}^{(n)} {{n+1} \choose k} a^{n+1-k} b^k &=&
\displaystyle
  a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
  b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
  & \mbox{({\tt LEMMA\_1})} \\
\displaystyle
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
  a^{n+1} +
  a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
  & \mbox{({\tt LEMMA\_2})} \\
\displaystyle
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
  b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
  + b^{n+1}
  & \mbox{({\tt LEMMA\_3})}
\end{array}\]
\verb@LEMMA_1@ is the key to the inductive step: it applies Pascal's rule
${n+1 \choose k} = {n \choose k} + {n \choose k-1}$, while \verb@LEMMA_2@
and \verb@LEMMA_3@ merely split off the first and last terms of their sums.
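The splitting and Pascal-rule identities behind the three lemmas can be checked numerically over the integers, since $\mathbb{Z}$ is a commutative ring. The Python sketch below is an illustration only (the names \verb@sigma@ and \verb@check_lemmas@ are hypothetical); it uses the non-standard convention that $\sum_{k=m}^{(n)}$ ranges over the $n$ values $k = m, \ldots, m+n-1$:

```python
from math import comb

def sigma(m, n, f):
    """Non-standard SIGMA: the sum of f(k) for the n values k = m, ..., m+n-1."""
    return sum(f(k) for k in range(m, m + n))

def check_lemmas(a, b, n):
    t  = lambda k: comb(n, k) * a ** (n - k) * b ** k           # row-n term
    t1 = lambda k: comb(n + 1, k) * a ** (n + 1 - k) * b ** k   # row-(n+1) term
    # LEMMA_1: Pascal's rule splits the interior of row n+1
    assert sigma(1, n, t1) == a * sigma(1, n, t) + b * sigma(0, n, t)
    # LEMMA_2: split off the k = 0 term of the scaled sum
    assert a * sigma(0, n + 1, t) == a ** (n + 1) + a * sigma(1, n, t)
    # LEMMA_3: split off the k = n term of the scaled sum
    assert b * sigma(0, n + 1, t) == b * sigma(0, n, t) + b ** (n + 1)
```

Running \verb@check_lemmas@ over a range of integer values of $a$, $b$ and $n$ raises no assertion errors.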
The three stages of the
inductive step are indicated (a), (b) and (c) in \ML{} comments. Step
(a) expands the right-hand side of the goal with \verb@LEMMA_1@:
\begin{eqnarray*}
\mbox{rhs} &=&
  a^{n+1} +
  a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
  b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k +
  b^{n+1}
\end{eqnarray*}
Step (b) expands the left-hand side with the induction hypothesis:
\begin{eqnarray*}
\mbox{lhs} &=&
  (a+b)\sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
Finally, in step (c) the expansions of the two sides are shown to be the
same, using distributivity of $\times$ over $+$, associativity of $+$,
and \verb@LEMMA_2@ and \verb@LEMMA_3@:
\begin{eqnarray*}
\mbox{lhs} &=&
  a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
  +
  b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k \\
 &=&
  a^{n+1} +
  a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
  +
  b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
  + b^{n+1} \\
 &=&
  \mbox{rhs}
\end{eqnarray*}

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for Integers}
\label{BinomialTheoremForIntegers}

The previous sections have dealt with the material in theory \verb@BINOMIAL@,
concerning how to prove the Binomial Theorem for an arbitrary commutative
ring. A natural next step is to produce some specific commutative ring, and
prove a Binomial Theorem for it, as an instance of the general case.

To see how to do so for Elsa Gunter's theory of integers, take a look
at the file {\tt mk\_BINOMIAL\_integer.ml}, contained in directory \path{}.
When loaded into \HOL{}, this \ML{} file proves that the integers are a
ring---by quoting laws contained in the theory \verb@integer@---and then
derives a Binomial Theorem for the integers---by Modus Ponens from the
general theorem \verb@BINOMIAL@.
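The integer instance asserts the familiar equation $(a+b)^n = \sum_{k=0}^{n} {n \choose k} a^{n-k} b^k$ for all integers $a$, $b$ and natural numbers $n$. As a concrete illustration (not part of the \HOL{} development; \verb@binomial_rhs@ is a hypothetical name), the instance can be checked numerically in Python:

```python
from math import comb

def binomial_rhs(a, b, n):
    """Right-hand side of the Binomial Theorem, instantiated to the integers."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))
```

For example, \verb@binomial_rhs(-3, 5, 4)@ evaluates to $(-3+5)^4 = 16$, and the same agreement holds for any sample of integer arguments.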