% binomial.tex %
% The Binomial Theorem proven in HOL %

\setlength{\unitlength}{1mm}
% \vskip10mm
% \begin{center}\LARGE\bf
% The Binomial Theorem proven in HOL.
% \end{center}
% \vskip10mm

\def\obj#1{\mbox{\tt#1}}

% ---------------------------------------------------------------------
% Parameters customising the document; set by whoever \input's it
% ---------------------------------------------------------------------
% \self{} is one word denoting this document, such as "article" or "chapter"
% \path{} is the path denoting the case-study directory
%
% Typical examples:
% \def\self{article}
% \def\path{\verb%Training/studies/binomial%}
% ---------------------------------------------------------------------
\section{Pascal's Triangle and the Binomial Theorem}

Pascal's Triangle\footnote{
According to Knuth \cite{knuth73}, the triangle gets its name from
its appearance in Blaise Pascal's {\em Trait\'e du triangle arithm\'etique}
in 1653, although the binomial coefficients were known long before then.
The earliest surviving description is from the tenth century,
in Hal\={a}yudha's commentary on the Hindu classic, the Chandah-S\^{u}tra.
The Binomial Theorem itself was first reported in 1676 by Isaac Newton.
}
is the infinite triangle of numbers which starts like this:
\[\begin{array}{ccccccc}
1  \\
1 & 1 \\
1 & 2 & 1 \\
1 & 3 & 3 & 1 \\
1 & 4 & 6 & 4 & 1 \\
1 & 5 & 10 & 10 & 5 & 1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\end{array}\]
The construction is as follows. The numbers at the edges
are $1$s. Each number in the interior is the sum of the number immediately
above it and the number to its left in the previous row.

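The construction rule translates directly into code; the following Python sketch (purely illustrative, not part of the case-study sources) builds each row of the triangle from the previous one:

```python
def next_row(row):
    # Edges are 1s; each interior entry is the sum of the entry above it
    # and the entry above-left in the previous row.
    return [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]

def triangle(n):
    # The first n rows of Pascal's Triangle, starting from the row [1].
    rows = [[1]]
    while len(rows) < n:
        rows.append(next_row(rows[-1]))
    return rows
```

For example, \verb@triangle(6)@ reproduces the six rows displayed above.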
The numbers in Pascal's Triangle are called Binomial Coefficients.
The number in the $n$th row and $k$th column (where $0 \leq k \leq n$, and the
topmost row and the leftmost column are numbered $0$) is written
$n \choose k$ and pronounced ``$n$ choose $k$''.  The coefficients are
pronounced this way because $n \choose k$ turns out to be the number of
ways to choose $k$ things out of $n$ things, but that is another story
(see for instance Knuth's book \cite{knuth73}).

A simple form of the Binomial Theorem \cite{maclane67} \cite{mostow63}
is the following equation, where $a$ and $b$ are integers and $n$ is any
natural number,
\[
(a + b)^n = \sum_{k=0}^n {n \choose k} a^k b^{n-k}
\]
The rest of this \self{} describes how the Binomial Theorem can be proven
in \HOL{}.  In fact, a more general theorem is proven, where $a$ and $b$
are elements of an arbitrary commutative ring, but the form displayed here
can be derived from the general result.

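Before turning to the formal development, the simple form of the theorem is easy to check numerically; a short Python sketch (illustrative only, independent of the \HOL{} sources):

```python
def choose(n, k):
    # Binomial coefficient via the Pascal's Triangle recurrence; gives 0
    # when k > n, matching the "spaces filled with 0s" convention.
    if k == 0:
        return 1
    if n == 0:
        return 0
    return choose(n - 1, k - 1) + choose(n - 1, k)

def binomial_sum(a, b, n):
    # Right-hand side of the theorem: sum of C(n,k) * a^k * b^(n-k).
    return sum(choose(n, k) * a**k * b**(n - k) for k in range(n + 1))
```

For instance, \verb@binomial_sum(2, 3, 4)@ equals \verb@(2 + 3) ** 4@, and the identity also holds when $a$ or $b$ is negative.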
The motivation for the proof of the Binomial Theorem in \HOL{} is
tutorial.
% ---to be a medium sized worked example whose subject matter is
% more widely known than any specific piece of hardware or software.
The proof illustrates specific ways in \HOL{} to represent mathematical
notations and manage theorems proven about general structures such as
monoids, groups and rings. The following files accompany this \self{} in
directory \path{}:
\begin{description}
\item[{\tt mk\_BINOMIAL.ml}]
    The \ML{} program containing the definitions, lemmas and theorems\\
    needed to prove the Binomial Theorem in \HOL{}.
\item[{\tt BINOMIAL.th}]
    The theory file which is generated by \verb@mk_BINOMIAL.ml@.
\item[{\tt BINOMIAL.ml}]
    An \ML{} program which loads the theory \verb@BINOMIAL@ and declares
    a few derived rules and one tactic used in the proof.
    The Binomial Theorem is bound to the \ML{} identifier \verb@BINOMIAL@.
\end{description}
% The aim of this \self{} is to introduce the small amount of algebra and
% mathematical notation needed to state and prove the Binomial Theorem, show
% how this is rendered in \HOL{}, and outline the structure of the proof
% contained in \verb@mk_BINOMIAL.ml@.
The \self{} is meant to be intelligible
without examination of any of the accompanying files.
To avoid clutter, not every detail is spelt out.
Often definitions and theorems are indicated
in the form displayed by \HOL{}, rather than as \ML{} source text. Specific
details of the \ML{} definitions or tactics can be found in the file
\verb@mk_BINOMIAL.ml@.

The way to work with this case study is first to study this \self{} to see
the structure of the proof, and then to work interactively with \HOL{}.
Start \HOL{} and say:
\begin{session}
\begin{verbatim}
#loadt `BINOMIAL`;;
...
File BINOMIAL loaded
() : void

#BINOMIAL;;
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
(This is the first display of an interaction with \HOL{}.  The difference
between lines input to and lines output from \HOL{} is that only the former
are preceded by the \HOL{} prompt ``\verb@#@''.  A line of ellipsis
``\verb@...@'' stands for output from \HOL{} which has been omitted from the
display.)

The first input line in the display above loads in all the definitions,
theorems, derived rules and the tactic defined for the proof.  The second
input line asks \HOL{} to print theorem \verb@BINOMIAL@.  The constants
used in \verb@BINOMIAL@ are explained later in the \self{}.  The \HOL{}
session is now set up either to make use of \verb@BINOMIAL@ by specialising
it to a specific ring, or to study \verb@mk_BINOMIAL.ml@ by using the subgoal
package to work through proofs contained in it.
The remainder of the \self{} follows the plan:
\begin{description}
\item[Section~\ref{BinomialCoefficients}]
    shows how to define the number $n \choose k$ in \HOL{} as the term
    $\obj{CHOOSE}\,n\,k$, where $\obj{CHOOSE}$ is a function defined by
    primitive recursion.
\item[Section~\ref{MonoidsGroupsRings}]
    shows how to define elementary algebraic concepts like associativity,
    commutativity, monoid, group and ring in \HOL{}, and how to apply
    theorems proved in the general case to particular situations.
\item[Section~\ref{PowersReductionsRangesSums}]
    shows how to define the iterated operations of raising a term to a
    power and reducing (summing) a list of terms. These operations are
    part of the everyday mathematical language used to state the Binomial
    Theorem, but they are not built into \HOL{} and so need to be defined.
    This section and the previous two together describe enough definitions
    in \HOL{} to state the Binomial Theorem.
\item[Section~\ref{BinomialTheorem}]
    shows how the Binomial Theorem is proven in \HOL{}.
    The proof is by induction, and depends on three main lemmas.
\item[Section~\ref{BinomialTheoremForIntegers}]
    outlines how to apply the general theorem to a particular case:
    the ring of integers.
\item[Appendix~\ref{PrincipalLemmas}]
    states the principal lemmas used to prove the theorem.
    Some of these are {\em ad hoc} but others could be reused elsewhere.
\end{description}

% ----------------------------------------------------------------------------

\section{The Binomial Coefficients}
\label{BinomialCoefficients}

The definition of the binomial coefficients as the numbers in Pascal's Triangle
is formalised in three equations:
\begin{eqnarray*}
{n \choose 0} &=& 1 \\
{0 \choose {k+1}} &=& 0 \\
{{n+1} \choose {k+1}} &=& {{n} \choose {k+1}} + {{n} \choose {k}}
\end{eqnarray*}
(These actually define $n \choose k$ to be $0$ if $k>n$, but this is
consistent with Pascal's Triangle: think of the spaces to the right of
the triangle as filled with $0$'s.) The desire is to define a constant
\verb@CHOOSE@ in \HOL{} such that for all numbers $n$, $k$ in the type
{\tt num}, $\verb@CHOOSE@\,n\,k$ denotes the number $n \choose k$.
Unfortunately these three equations cannot be used directly as the definition
of \verb@CHOOSE@ in \HOL{} because they are not in the form of a base case
(when $n=0$) together with an inductive case (when $n>0$).

To work towards the definition, consider a term that specifies a base case
and an inductive case intended to be equivalent to the three equations above:
\begin{session}
\begin{verbatim}
let base_and_inductive: term =
    "(CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
     (CHOOSE (SUC n) k =
         ((0 = k) => 1 | (CHOOSE n (k-1)) + (CHOOSE n k)))";;
\end{verbatim}
\end{session}
The theory \verb@prim_rec@ contains a primitive recursion theorem,
\verb@num_Axiom@, which says that a base case together with an inductive
case identifies a unique function, \verb@fn@:
\begin{session}
\begin{verbatim}
#num_Axiom;;
|- !e f. ?! fn.
    (fn 0 = e) /\
    (!n. fn (SUC n) = f (fn n) n)
\end{verbatim}
\end{session}
Given the theorem \verb@num_Axiom@, the builtin function
\verb@prove_rec_fn_exists@ can prove that there is a function that satisfies
the property specified by the term \verb@base_and_inductive@:
\begin{session}
\begin{verbatim}
#let RAW_CHOOSE_DEF = prove_rec_fn_exists num_Axiom base_and_inductive;;
RAW_CHOOSE_DEF =
|- ?CHOOSE.
    (!k. CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
    (!n k.
      CHOOSE(SUC n)k = ((0 = k) => 1 | (CHOOSE n(k - 1)) + (CHOOSE n k)))
\end{verbatim}
\end{session}
These equations would not do as the only definition of \verb@CHOOSE@ due
to the presence of the conditional operator, which makes rewriting
difficult.  But having obtained the theorem \verb@RAW_CHOOSE_DEF@ it is
easy to prove there is a function which satisfies the original three
equations:
\begin{session}
\begin{verbatim}
CHOOSE_DEF_LEMMA =
|- ?CHOOSE.
    (!n. CHOOSE n 0 = 1) /\
    (!k. CHOOSE 0(SUC k) = 0) /\
    (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Now the constant \verb@CHOOSE@ can be specified as originally desired:
\begin{session}
\begin{verbatim}
#let CHOOSE_DEF =
#    new_specification `CHOOSE_DEF` [`constant`,`CHOOSE`] CHOOSE_DEF_LEMMA;;
CHOOSE_DEF =
|- (!n. CHOOSE n 0 = 1) /\
   (!k. CHOOSE 0(SUC k) = 0) /\
   (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
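That the conditional form really does agree with the three original equations can also be seen computationally; a Python transcription of both forms (the helper names are illustrative only):

```python
def choose_cond(n, k):
    # The base/inductive (conditional) form behind RAW_CHOOSE_DEF.
    if n == 0:
        return 1 if k == 0 else 0
    return 1 if k == 0 else choose_cond(n - 1, k - 1) + choose_cond(n - 1, k)

def choose_eqns(n, k):
    # The three original defining equations, as in CHOOSE_DEF.
    if k == 0:
        return 1
    if n == 0:
        return 0
    return choose_eqns(n - 1, k) + choose_eqns(n - 1, k - 1)
```

An exhaustive check over small arguments confirms the two agree, including when $k > n$.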
Two elementary properties of the binomial coefficients can now be proven
by induction:
\begin{session}
\begin{verbatim}
CHOOSE_LESS = |- !n k. n < k ==> (CHOOSE n k = 0)
CHOOSE_SAME = |- !n. CHOOSE n n = 1
\end{verbatim}
\end{session}
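Both lemmas are easy to test on small instances before attempting the inductions; a quick Python check of the same recurrence (illustrative only):

```python
def choose(n, k):
    # The equations of CHOOSE_DEF, transcribed directly.
    if k == 0:
        return 1
    if n == 0:
        return 0
    return choose(n - 1, k) + choose(n - 1, k - 1)

# CHOOSE_LESS: n < k implies choose(n, k) = 0.
less_ok = all(choose(n, k) == 0 for n in range(8) for k in range(n + 1, 12))
# CHOOSE_SAME: choose(n, n) = 1.
same_ok = all(choose(n, n) == 1 for n in range(8))
```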

% ----------------------------------------------------------------------------

\section{Monoids, Groups and Commutative Rings}
\label{MonoidsGroupsRings}
%A simple way to define algebraic structures in \HOL{}.

\subsection{Associativity and Commutativity}

An {\em algebraic structure} is a set equipped with some operators that
obey some laws.  An elementary example of such a structure is the pair
$(A,+)$, where $A$ is a set and $+$ is a binary operator on $A$ that obeys
the laws of associativity,
\[
a + (b + c) = (a + b) + c
\]
and commutativity,
\[
a + b = b + a
\]

How might these laws be represented in \HOL{}?  The simplest scheme is
to deal with structures of the form $(\sigma, +)$, where $\sigma$ is a
type of the \HOL{} logic, and $+$ is a \HOL{} term of type
$\sigma\to\sigma\to\sigma$.  Notice that the set in the structure is
represented as a \HOL{} type: see the theory \verb@group@ presented in
Elsa Gunter's case study of Modular Arithmetic for a more sophisticated
approach where the set in a structure can be a subset of a \HOL{} type.

The two laws can be defined as two constants in \HOL{}:
\begin{session}
\begin{verbatim}
#let ASSOCIATIVE =
#    new_definition (`ASSOCIATIVE`,
#      "!plus: *->*->*. ASSOCIATIVE plus =
#          (!a b c. plus a (plus b c) = plus (plus a b) c)");;
...
#let COMMUTATIVE =
#    new_definition (`COMMUTATIVE`,
#      "!plus: *->*->*. COMMUTATIVE plus =
#          (!a b. plus a b = plus b a)");;
\end{verbatim}
\end{session}
By instantiating the type variable \verb@*@ and specialising the variable
\verb@plus@ these constants are applicable to any particular structure
of the form $(\sigma, +: \sigma\to\sigma\to\sigma)$.

A simple example to illustrate this methodology is a permutation theorem
about an arbitrary associative commutative operator (which is called
\verb@PERM@ in file \verb@mk_BINOMIAL.ml@).  It is worthwhile going
through the proof in detail because it illustrates how a difficulty in
making use of assumptions like \verb@ASSOCIATIVE plus@ and
\verb@COMMUTATIVE plus@ can be solved.
\begin{session}
\begin{verbatim}
#g "!plus: *->*->*.
#       ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
#           !a b c. plus (plus a b) c = plus b (plus a c)";;
\end{verbatim}
\end{session}
The first step is to strip off the quantifiers, and break up the implication:
\begin{session}
\begin{verbatim}
#e (REPEAT STRIP_TAC);;
OK..
"plus(plus a b)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
The next step is to rewrite \verb@plus a b@ to \verb@plus b a@ on the
left-hand side, from the assumption \verb@COMMUTATIVE plus@. If the theorem
were about a specific operator, such as \verb@$+: num->num->num@ from the
theory of numbers in \HOL{}, the rewriting could be done with
\verb@REWRITE_TAC@ and the specific commutativity theorem, such as
\verb@ADD_SYM@:
\[
\mbox{\tt |- !m n. m + n = n + m}.
\]
But in the general case all that is available is the definition of
\verb@COMMUTATIVE plus@ which cannot be used directly with
\verb@REWRITE_TAC@ to swap the terms \verb@a@ and \verb@b@.
One approach is first to prove the
following equation from the definition \verb@COMMUTATIVE plus@,
\[
\mbox{\tt COMMUTATIVE plus |- !a b. plus a b = plus b a},
\]
and then to use the equation with \verb@REWRITE_TAC@ in the same way as one
might use \verb@ADD_SYM@.  This is a valid tactic because the
assumption \verb@COMMUTATIVE plus@ is already present in the assumption
list.  A little forward proof achieves the first step, the derivation of the
equation:
\begin{session}
\begin{verbatim}
#top_print print_all_thm;;
- : (thm -> void)

#let forwards = fst(EQ_IMP_RULE (SPEC "plus: *->*->*" COMMUTATIVE));;
forwards = |- COMMUTATIVE plus ==> (!a b. plus a b = plus b a)

#let eqn = UNDISCH forwards;;
eqn = COMMUTATIVE plus |- !a b. plus a b = plus b a
\end{verbatim}
\end{session}
This forward proof is a special case of a derived
rule of the logic,
\[
\Gamma\turn \uquant{x} t_1 = t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
which can be coded as the \ML{} function \verb@SPEC_EQ@:
\begin{session}
\begin{verbatim}
#let SPEC_EQ v th = UNDISCH(fst(EQ_IMP_RULE (SPEC v th)));;
SPEC_EQ = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
Now the equation \verb@eqn@ can be used to swap the terms
\verb@a@ and \verb@b@:
\begin{session}
\begin{verbatim}
#e(SUBST1_TAC (SPECL ["a:*";"b:*"] eqn));;
OK..
"plus(plus b a)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
Now that the terms appear on both sides in the same order but with different
grouping, associativity can be used to make the grouping on both sides
the same.  Again the definition of \verb@ASSOCIATIVE@ cannot be used
directly, but the derived rule \verb@SPEC_EQ@ obtains from the definition
an equation analogous to \verb@eqn@ which can be used:
\begin{session}
\begin{verbatim}
#e(REWRITE_TAC [SPEC_EQ "plus: *->*->*" ASSOCIATIVE]);;
OK..
goal proved
...
|- !plus.
    ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
    (!a b c. plus(plus a b)c = plus b(plus a c))
\end{verbatim}
\end{session}

The point of this example was to illustrate how to prove theorems about
an arbitrary operator whose behaviour is specified by general predicates
like \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. The main difficulty is
in deriving equations for use with rewriting or substitution.  The solution
used in the example and extensively in the proof of the Binomial Theorem
is to derive equations using the derived rule \verb@SPEC_EQ@. Another derived
rule used extensively, \verb@SPEC_IMP@, is a variant of \verb@SPEC_EQ@:
\[
\Gamma\turn \uquant{x} t_1 \Rightarrow t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
\begin{session}
\begin{verbatim}
#let SPEC_IMP v th = UNDISCH(SPEC v th);;
SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
The following two rules are versions of \verb@SPEC_EQ@ and \verb@SPEC_IMP@
generalised to take lists of variables rather than just one:
\begin{session}
\begin{verbatim}
SPECL_EQ = - : (term list -> thm -> thm)
SPECL_IMP = - : (term list -> thm -> thm)
\end{verbatim}
\end{session}
The final new derived rule used in the proof of the Binomial Theorem
is \verb@STRIP_SPEC_IMP@, a variant of \verb@SPEC_IMP@, which
splits up a conjunction in the antecedent into separate assumptions:
\[
\Gamma\turn \uquant{x} (t_1 \wedge \cdots \wedge t_n) \Rightarrow t_{n+1}
\over
\Gamma, t_1[t'/x], \ldots, t_n[t'/x] \turn t_{n+1}[t'/x]
\]
\begin{session}
\begin{verbatim}
#let STRIP_SPEC_IMP v th =
#    UNDISCH_ALL(SPEC v (CONV_RULE (ONCE_DEPTH_CONV ANTE_CONJ_CONV) th));;
STRIP_SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}

There are other methods for proving theorems which are conditional on
predicates such as \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@.  One is to
rewrite with the definition before assuming the predicate (see, for instance,
the proof of \verb@RIGHT_CANCELLATION@ in \verb@mk_BINOMIAL.ml@). This
method is fine for small proofs, but in larger proofs it presents the problem
of how to specify a particular assumption on the assumption list.  Another
method is to put the predicates on the assumption list, and use resolution
to extract the body of the definition (see the proof of
\verb@UNIQUE_LEFT_INV@).  When the assumption list is not short this is
rather a blunt instrument, in the sense that resolution can find many more
theorems than the one desired, and is also rather computationally costly
when compared to carefully constructing a theorem with \verb@SPEC_EQ@ and
its variants.
\subsection{Monoids}

A {\em monoid} is a structure $(M,+)$ where $M$ is a set and $+$ is an
associative binary operator on $M$ such that there is an {\em identity
element}, $u \in M$, which is a left and right identity of $+$, that is,
it satisfies $u+a = a+u = a$ for all $a \in M$. When the operator is written
as $+$, the structure $(M,+)$ is called an {\em additive monoid} and the
identity element is written as $0$; otherwise if the operator is written
as $\times$, the structure $(M,\times)$ is called a {\em multiplicative
monoid} and the identity element is written as $1$.
\begin{session}
\begin{verbatim}
#let RAW_MONOID =
#    new_definition (`RAW_MONOID`,
#      "!plus: *->*->*. MONOID plus =
#          ASSOCIATIVE plus /\
#          (?u. (!a. plus a u = a) /\ (!a. plus u a = a))");;
\end{verbatim}
\end{session}
This definition is inconvenient to manipulate as it stands, because of
the presence of the existential quantifier.  However it is possible via
Hilbert's $\epsilon$-operator to specify a function \verb@Id@ such that
\verb@Id plus@ stands for an identity element of \verb@plus@, if such
exists.  In fact it is easy to prove that the identity element in a monoid
is unique.  The first step is to prove that there exists a function \verb@Id@
with the property just described:
\begin{session}
\begin{verbatim}
ID_LEMMA =
|- ?Id.
    !plus.
     (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
     (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The proof of \verb@ID_LEMMA@ uses Hilbert's $\epsilon$-operator to
construct a witness for the outer existential quantifier,
\begin{verbatim}
    \plus.@u:*.(!a:*. (plus u a = a)) /\ (!a. (plus a u = a))
\end{verbatim}
Given \verb@ID_LEMMA@, the constant \verb@Id@ is specified as follows:
\begin{session}
\begin{verbatim}
#let ID = new_specification `Id` [`constant`, `Id`] ID_LEMMA;;
ID =
|- !plus.
    (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
    (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The condition that \verb@Id plus@ is a left and right identity is expressed
by the predicates \verb@LEFT_ID@ and \verb@RIGHT_ID@, respectively:
\begin{session}
\begin{verbatim}
LEFT_ID = |- !plus. LEFT_ID plus = (!a. plus(Id plus)a = a)
RIGHT_ID = |- !plus. RIGHT_ID plus = (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
Given these abbreviations it is easy to prove in \HOL{} that
if a left and right identity element exists, then it is unique:
\begin{session}
\begin{verbatim}
UNIQUE_LEFT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!l. (!a. plus l a = a) ==> (Id plus = l))

UNIQUE_RIGHT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!r. (!a. plus a r = a) ==> (Id plus = r))
\end{verbatim}
\end{session}
The use of these last two theorems is to help identify the element
\verb@Id plus@ for some specific \verb@plus@, such as \verb@$+@ from the theory
of \verb@arithmetic@.  The abbreviations \verb@LEFT_ID@ and \verb@RIGHT_ID@
admit a new characterisation of the predicate \verb@MONOID@, which is easier
to manipulate in \HOL{} than \verb@RAW_MONOID@:
\begin{session}
\begin{verbatim}
MONOID =
|- !plus. MONOID plus = LEFT_ID plus /\ RIGHT_ID plus /\ ASSOCIATIVE plus
\end{verbatim}
\end{session}

\subsection{Groups}

A {\em group} is a monoid in which every element is invertible.
\begin{session}
\begin{verbatim}
RAW_GROUP =
|- !plus.
    Group plus =
    MONOID plus /\
    (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus))
\end{verbatim}
\end{session}
An alternative characterisation of \verb@Group@\footnote{The mixed
  case identifier \texttt{Group} is used rather than \texttt{GROUP},
  because the latter is already defined in Elsa Gunter's theory
  \texttt{group}, a parent of her theory \texttt{integer}, which needs
  to coexist with the present theory so that integers can be used as
  an example ring (Section~\ref{BinomialTheoremForIntegers}).} which
makes the existential quantification implicit can be had by specifying
a function \verb@Inv@ so that, if \verb@Group plus@ holds, then the
element \verb@Inv plus a@ is an inverse of \verb@a@.  The function
\verb@Inv@ is specified in a way analogous to the specification of
\verb@Id@.  First a lemma is proven which says there does exist a
function \verb@Inv@ with the property desired:
\begin{session}
\begin{verbatim}
INV_LEMMA =
|- ?Inv.
    !plus.
     (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus)) ==>
     (!a.
       (plus(Inv plus a)a = Id plus) /\ (plus a(Inv plus a) = Id plus))
\end{verbatim}
\end{session}
Given this lemma, which is proven by an easy tactic almost the same
as the one for \verb@ID_LEMMA@, the constant \verb@Inv@ can be specified:
\begin{session}
\begin{verbatim}
#let INV =
#    new_specification `Inv` [`constant`, `Inv`] INV_LEMMA;;
\end{verbatim}
\end{session}
The predicates which say that \verb@Inv@ produces left and right inverses
are defined as new constants:
\begin{session}
\begin{verbatim}
LEFT_INV =
|- !plus. LEFT_INV plus = (!a. plus(Inv plus a)a = Id plus)
RIGHT_INV =
|- !plus. RIGHT_INV plus = (!a. plus a(Inv plus a) = Id plus)
\end{verbatim}
\end{session}
Finally, the alternative characterisation of \verb@Group@ can be proven:
\begin{session}
\begin{verbatim}
GROUP = |- Group plus = MONOID plus /\ LEFT_INV plus /\ RIGHT_INV plus
\end{verbatim}
\end{session}
The theory of groups has been developed only as far as needed to prove
the Binomial Theorem; Appendix~\ref{PrincipalLemmas} shows the one lemma,
\verb@RIGHT_CANCELLATION@, specifically about groups.

\subsection{Rings}

A {\em ring} is a structure $(R,+,\times)$ such that structure $(R,+)$
is a group, structure $(R,\times)$ is a monoid, {\em addition}, $+$, is
commutative, and {\em multiplication}, $\times$, distributes (on both sides)
over addition.  A {\em commutative ring} is a ring in which multiplication
is commutative.  The word ring is used in the remainder of this \self{}
and in the \HOL{} code to mean a commutative ring:
\begin{session}
\begin{verbatim}
RING =
|- !plus times.
    RING(plus,times) =
    Group plus /\
    COMMUTATIVE plus /\
    MONOID times /\
    COMMUTATIVE times /\
    LEFT_DISTRIB(plus,times) /\
    RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
\verb@LEFT_DISTRIB@ and \verb@RIGHT_DISTRIB@ are new constants
defined by the theorems:
\begin{session}
\begin{verbatim}
LEFT_DISTRIB =
|- !plus times.
    LEFT_DISTRIB(plus,times) =
    (!a b c. times a(plus b c) = plus(times a b)(times a c))

RIGHT_DISTRIB =
|- !plus times.
    RIGHT_DISTRIB(plus,times) =
    (!a b c. times(plus a b)c = plus(times a c)(times b c))
\end{verbatim}
\end{session}
Notice that the definition of \verb@RING@ abbreviates a conjunction of
predicates, some of which are atomic, in the sense that they are some basic
property of \verb@plus@ or \verb@times@ or both, and some are compound,
in the sense that they abbreviate a further conjunction of properties.
The definition is a tree of predicates with atomic ones like
\verb@LEFT_DISTRIB@ and \verb@ASSOCIATIVE@ at the leaves, and compound
ones like \verb@Group@ and \verb@MONOID@ at the branches.  Many theorems
conditional on \verb@RING(plus,times)@ need to make use of both atomic
and compound predicates contained in the tree.  To make access to all these
predicates easy, the following lemma is proven, which makes explicit all
the predicates in the tree:
\begin{session}
\begin{verbatim}
RING_LEMMA =
|- RING(plus,times) ==>
   RING(plus,times) /\
   Group plus /\
   MONOID plus /\
   LEFT_ID plus /\
   RIGHT_ID plus /\
   ASSOCIATIVE plus /\
   LEFT_INV plus /\
   RIGHT_INV plus /\
   COMMUTATIVE plus /\
   MONOID times /\
   LEFT_ID times /\
   RIGHT_ID times /\
   ASSOCIATIVE times /\
   COMMUTATIVE times /\
   LEFT_DISTRIB(plus,times) /\
   RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
Goals of the form,
\[
\mbox{\tt !plus times. RING(plus,times) ==> ...}
\]
are consistently tackled with the following tactic, which uses \verb@RING_LEMMA@
to add all the atomic and compound predicates in the tree to the assumption
list:
\begin{session}
\begin{verbatim}
#let RING_TAC =
#    GEN_TAC THEN GEN_TAC THEN
#    DISCH_THEN (\th. STRIP_ASSUME_TAC (MP RING_LEMMA th));;
\end{verbatim}
\end{session}
The one lemma about rings needed here, \verb@RING_0@, is stated in
Appendix~\ref{PrincipalLemmas}.

% ----------------------------------------------------------------------------

\section{Powers, Reductions, Ranges and Sums}
\label{PowersReductionsRangesSums}
%Translation of mathematical definitions such as sums of series into HOL

The aim of this section is to explain how the mathematical notation
$\sum_{k=0}^n {n \choose k} a^k b^{n-k}$ can be rendered in \HOL{}.
If this is written out more explicitly than necessary for human
consumption it looks like this:
\[
\sum_{k=0}^n {n \choose k} \cdot (a^k \times b^{n-k})
\]
where $\times$ is multiplication in the ring, and where exponentiation
and $\cdot$ stand for the scalar powers of multiplication and addition,
respectively.  Since binomial coefficients were treated in
Section~\ref{BinomialCoefficients}, what remain to be defined in this section
are scalar powers and the $\Sigma$-notation.  The latter is tackled in
three steps: reduction of a list of monoid elements; generation of lists
comprising subranges of natural numbers, $[m,m+1,\ldots,m+n-1]$; and finally
the reduction of a list of ring elements indexed by a subrange of the natural
numbers.

\subsection{Scalar powers in a monoid}

If $(M,+)$ and $(M,\times)$ are additive and multiplicative
monoids respectively, the scalar powers of addition and multiplication are
defined informally by the following equations, where $n>0$:
\begin{eqnarray*}
n \cdot a &=& \overbrace{a + \cdots + a}^{n} \\
a^n &=& \overbrace{a \times \cdots \times a}^{n}
\end{eqnarray*}
When $n=0$, $n \cdot a = 0$ and $a^n = 1$.

Scalar power is represented in \HOL{} by the constant \verb@POWER@, which
is defined by primitive recursion:
\begin{session}
\begin{verbatim}
POWER =
|- (!plus a. POWER plus 0 a = Id plus) /\
   (!plus n a. POWER plus(SUC n)a = plus a(POWER plus n a))
\end{verbatim}
\end{session}
Three lemmas about \verb@POWER@ are stated in Appendix~\ref{PrincipalLemmas}.
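Operationally, \verb@POWER@ is just the $n$-fold application of the monoid operation, with the identity as the $0$th power; a Python model of the recursion (the function names are illustrative):

```python
def power(plus, identity, n, a):
    # POWER plus n a: the identity for n = 0, otherwise
    # plus a (POWER plus (n-1) a), unrolled into a loop.
    result = identity
    for _ in range(n):
        result = plus(a, result)
    return result
```

With multiplication this gives ordinary exponentiation; with addition it gives the scalar multiple $n \cdot a$.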

\subsection{Reduction in a monoid}

This section makes use of the standard \HOL{} theory of lists, in particular
the constants \verb@NIL@, \verb@CONS@, \verb@APPEND@ and \verb@MAP@,
definition by primitive recursion and proof by induction. The theory of
lists is described in \DESCRIPTION.

If $(M,+)$ is a monoid, then the reduction of a list $[a_1,\ldots,a_n]$,
where each $a_i$ is an element of $M$, is the sum,
\[
a_1 + \cdots + a_n,
\]
or the monoid's identity element if $n=0$.

Reduction is represented in \HOL{} by the constant \verb@REDUCE@,
with type,
\begin{verbatim}
REDUCE : (*->*->*) -> (*)list -> *
\end{verbatim}
which is defined by primitive recursion on its second argument:
\begin{session}
\begin{verbatim}
#let REDUCE =
#  new_list_rec_definition (`REDUCE`,
#    "(!plus. (REDUCE plus NIL) = (Id plus):*) /\
#     (!plus (a:*) as. REDUCE plus (CONS a as) = plus a (REDUCE plus as))");;
REDUCE =
|- (!plus. REDUCE plus[] = Id plus) /\
   (!plus a as. REDUCE plus(CONS a as) = plus a(REDUCE plus as))
\end{verbatim}
\end{session}
Three lemmas about \verb@REDUCE@ are stated in Appendix~\ref{PrincipalLemmas}.
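In functional-programming terms \verb@REDUCE@ is a right fold that returns the identity on the empty list; a Python model (illustrative names, not the \HOL{} constant itself):

```python
def reduce_monoid(plus, identity, elements):
    # REDUCE plus [a1; ...; an] = plus a1 (plus a2 (... identity)),
    # so fold from the right, starting from the identity.
    result = identity
    for a in reversed(elements):
        result = plus(a, result)
    return result
```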

\subsection{Subranges of the natural numbers}

The constant \verb@RANGE@, which is a curried function of two numbers,
is defined by primitive recursion so that,
\[
\verb@RANGE@\,m\,n = [m, m+1, \ldots, m+n-1]
\]
This definition means that the subrange is the first $n$ numbers starting
at $m$.  In particular, when $n=0$ the list is empty.
Appendix~\ref{PrincipalLemmas} states two simple properties of \verb@RANGE@.

\verb@RANGE@ generates a subrange given its least element and length;
an alternative method is to generate the subrange given its two endpoints,
via a constant \verb@INTERVAL@ which satisfies the equation:
\[
\verb@INTERVAL@\,m\,n =
    \left\{\begin{array}{ll}
    [m, m+1, \ldots, n] & \mbox{if $m \leq n$} \\
    {}[] & \mbox{otherwise}
    \end{array}\right.
\]
782
783For comparison, the two methods are definable in \HOL{} as follows:
784\begin{session}
785\begin{verbatim}
786#let RANGE =
787#    new_recursive_definition false num_Axiom `RANGE`
788#      "(RANGE m 0 = NIL) /\
789#       (RANGE m (SUC n) = CONS m (RANGE (SUC m) n))";;
790#
791#let INTERVAL =
792#    new_recursive_definition false num_Axiom `INTERVAL`
793#      "(INTERVAL m 0 =
794#          (m > 0) => NIL | [0]) /\
795#      (INTERVAL m (SUC n) =
796#          (m > (SUC n)) => NIL | APPEND (INTERVAL m n) ([SUC n]))";;
797\end{verbatim}
798\end{session}
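Transcribed into Python (an illustrative sketch; the function names are invented here), the two recursions read as follows, and make the comparison concrete: \verb@range_list@ needs no test in its recursive clause, whereas \verb@interval@ must test its endpoints at every step.

```python
def range_list(m, n):
    """RANGE m n: the first n numbers starting at m."""
    if n == 0:
        return []
    return [m] + range_list(m + 1, n - 1)

def interval(m, n):
    """INTERVAL m n: the list [m, ..., n] if m <= n, and [] otherwise."""
    if n == 0:
        return [] if m > 0 else [0]
    if m > n:
        return []
    return interval(m, n - 1) + [n]
```

For example, `range_list(2, 3)` and `interval(2, 4)` both yield `[2, 3, 4]`, while `interval(3, 2)` yields `[]`.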

\subsection{$\Sigma$-notation}

The standard $\Sigma$-notation is defined informally by the equation,
\[
\sum_{i=m}^n a_i =
    \left\{\begin{array}{ll}
    a_m + a_{m+1} + \cdots + a_n & \mbox{if $m \leq n$} \\
    0 & \mbox{otherwise}
    \end{array}\right.
\]
Therefore the standard $\Sigma$-notation is naturally definable via
a subrange generated by \verb@INTERVAL@.

An alternative notation, naturally definable via \verb@RANGE@, can be
defined informally as follows:
\[
\sum_{i=m}^{(n)} a_i =
    \left\{\begin{array}{ll}
    a_m + a_{m+1} + \cdots + a_{m+n-1} & \mbox{if $n > 0$} \\
    0 & \mbox{otherwise}
    \end{array}\right.
\]
The brackets around the $n$ above the $\Sigma$ distinguish this
notation from the standard one.

This second notation is used in the \HOL{} proof of the Binomial Theorem.
The two notations differ only in their definitions: the standard one is
definable via \verb@INTERVAL@, the second via \verb@RANGE@. The second
was chosen because \verb@RANGE@, having the simpler definition, is easier
to manipulate in \HOL{} than \verb@INTERVAL@. The cost of this choice is
that the proof of the theorem uses a non-standard notation, but the
difference is purely notational: if desired, the Binomial Theorem stated
in the standard notation could be deduced trivially from its non-standard
statement.
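The deduction amounts to a change of bounds: for $m \leq n$, $\sum_{i=m}^{n} a_i = \sum_{i=m}^{(n-m+1)} a_i$, since a list with endpoints $m$ and $n$ has $n-m+1$ elements. A quick Python check of this correspondence (the function names here are illustrative):

```python
def sigma_std(m, n, f):
    """Standard notation: f(m) + ... + f(n), and 0 when m > n."""
    return sum(f(i) for i in range(m, n + 1))

def sigma_len(m, n, f):
    """Non-standard notation: the sum of n terms starting at index m."""
    return sum(f(m + i) for i in range(n))

square = lambda i: i * i
assert sigma_std(2, 6, square) == sigma_len(2, 6 - 2 + 1, square)
assert sigma_std(4, 3, square) == 0 == sigma_len(4, 0, square)
```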

The non-standard $\Sigma$-notation is defined in \HOL{} as the following
abbreviation:
\begin{session}
\begin{verbatim}
#let SIGMA =
#    new_definition (`SIGMA`,
#      "SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n)): *");;
\end{verbatim}
\end{session}
Note that an indexed term $a_i$ is represented in \HOL{} as a function
\verb@f@ of type \verb@num -> *@.
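The same composition of reduction, mapping and subrange generation can be sketched in Python. This is only an analogue under the earlier assumptions: the identity element is passed explicitly, and \verb@functools.reduce@ folds from the left, which agrees with \HOL{}'s right-fold for an associative operation.

```python
from functools import reduce

def sigma(plus, m, n, f, identity):
    """SIGMA (plus,m,n) f: reduce the mapped subrange [f(m), ..., f(m+n-1)].

    functools.reduce returns `identity` unchanged on the empty range,
    matching the n = 0 case.
    """
    return reduce(plus, map(f, range(m, m + n)), identity)
```

For example, `sigma(lambda a, b: a + b, 0, 4, lambda i: i * i, 0)` yields `0 + 1 + 4 + 9 = 14`.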

Appendix~\ref{PrincipalLemmas} states several properties of \verb@SIGMA@;
their proofs make use of properties of \verb@MAP@ from the theory
\verb@list@, and also the lemmas about \verb@REDUCE@ and \verb@RANGE@.
The functions \verb@REDUCE@ and \verb@RANGE@ are needed here only to define
\verb@SIGMA@, and in fact could be omitted if \verb@SIGMA@ were to be defined
directly using primitive recursion.  It is worth going to the trouble
of defining \verb@REDUCE@ and \verb@RANGE@, however, because they are likely
to be useful in other situations.

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for a Commutative Ring}
\label{BinomialTheorem}
%Division of a medium sized proof into manageable lemmas.

This section outlines the proof.  For full details see the tactics in
\verb@mk_BINOMIAL.ml@.

The statement and proof of the Binomial Theorem use a new \HOL{} constant,
\verb@BINTERM@, to abbreviate indexed terms of the form $\lambda k.\,{n
\choose k} \cdot a^{n-k} \times b^k$.  \verb@BINTERM@ is defined as a
curried function as follows:
\begin{session}
\begin{verbatim}
#let BINTERM_DEF =
#    new_definition (`BINTERM_DEF`,
#      "BINTERM (plus: *->*->*) (times: *->*->*) a b n k =
#          POWER plus (CHOOSE n k)
#              (times (POWER times (n-k) a) (POWER times k b))");;
\end{verbatim}
\end{session}
This abbreviation lets the Binomial Theorem be stated and proven in \HOL{} as
follows:
\begin{session}
\begin{verbatim}
BINOMIAL =
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
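For intuition, the statement can be spot-checked numerically in the ring of integers, where \verb@POWER times@ is ordinary exponentiation, \verb@POWER plus@ applied to a \verb@CHOOSE@ value is multiplication by a binomial coefficient, and \verb@SIGMA(plus,0,SUC n)@ sums the $n+1$ terms $k = 0, \ldots, n$. The sketch below uses Python's \verb@math.comb@ for \verb@CHOOSE@ and arbitrary test values:

```python
from math import comb

def binterm(a, b, n, k):
    """BINTERM instantiated to the integer ring: (n choose k) * a^(n-k) * b^k."""
    return comb(n, k) * a ** (n - k) * b ** k

a, b, n = 3, -2, 5
lhs = (a + b) ** n                                    # POWER times n (plus a b)
rhs = sum(binterm(a, b, n, k) for k in range(n + 1))  # SIGMA (plus,0,SUC n)
assert lhs == rhs
```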
In outline, the proof proceeds as follows.  The following equation is
proven by induction on $n$:
\begin{eqnarray*}
(a+b)^n &=& \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
When $n=0$ both sides reduce to the multiplicative identity element, $1$,
of the ring. For the inductive step, the equation is assumed for $n$, and
the following goal must be proven:
\begin{eqnarray*}
(a+b)^{n+1} &=& \sum_{k=0}^{(n+2)} {{n+1} \choose k} a^{n+1-k} b^k
\end{eqnarray*}
Three lemmas are used in the inductive step:
\[\begin{array}{rcll}
\displaystyle
\sum_{k=1}^{(n)} {{n+1} \choose k} a^{n+1-k} b^k &=&
\displaystyle
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    & \mbox{({\tt LEMMA\_1})} \\
\displaystyle
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
    & \mbox{({\tt LEMMA\_2})} \\
\displaystyle
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    + b^{n+1}
    & \mbox{({\tt LEMMA\_3})}
\end{array}\]
\verb@LEMMA_1@ is the key to the inductive step, whose three stages are
indicated (a), (b) and (c) in \ML{} comments.  Step (a) expands the
right-hand side of the goal with \verb@LEMMA_1@:
\begin{eqnarray*}
\mbox{rhs} &=&
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k +
    b^{n+1}
\end{eqnarray*}
Step (b) expands the left-hand side with the induction hypothesis:
\begin{eqnarray*}
\mbox{lhs} &=&
    (a+b)\sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
Finally, in step (c) the expansions of the two sides are shown to be the
same, using distributivity of $\times$ over $+$, associativity of $+$,
and \verb@LEMMA_2@ and \verb@LEMMA_3@:
\begin{eqnarray*}
\mbox{lhs} &=&
    a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
    +
    b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k \\
  &=&
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
    +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    + b^{n+1} \\
  &=&
      \mbox{rhs}
\end{eqnarray*}

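The three lemmas can likewise be spot-checked over the integers. In this sketch, \verb@s(lo, n, term)@ plays the role of the non-standard $\Sigma$ ($n$ terms starting at index \verb@lo@), and the values of $a$, $b$ and $n$ are arbitrary test data:

```python
from math import comb

def s(lo, n, term):
    """Non-standard sigma: the sum of n terms starting at index lo."""
    return sum(term(k) for k in range(lo, lo + n))

a, b, n = 2, 3, 6
t = lambda k: comb(n, k) * a ** (n - k) * b ** k           # (n choose k) a^(n-k) b^k
t1 = lambda k: comb(n + 1, k) * a ** (n + 1 - k) * b ** k  # ((n+1) choose k) a^(n+1-k) b^k

assert s(1, n, t1) == a * s(1, n, t) + b * s(0, n, t)       # LEMMA_1
assert a * s(0, n + 1, t) == a ** (n + 1) + a * s(1, n, t)  # LEMMA_2
assert b * s(0, n + 1, t) == b * s(0, n, t) + b ** (n + 1)  # LEMMA_3
```

\verb@LEMMA_1@ is Pascal's rule ${n+1 \choose k} = {n \choose k} + {n \choose k-1}$ applied under the sum; \verb@LEMMA_2@ and \verb@LEMMA_3@ split off the $k=0$ and $k=n$ terms respectively.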

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for Integers}
\label{BinomialTheoremForIntegers}

The previous sections have dealt with the material in the theory
\verb@BINOMIAL@, which proves the Binomial Theorem for an arbitrary
commutative ring.  A natural next step is to take some specific
commutative ring and prove a Binomial Theorem for it, as an instance
of the general case.

To see how to do so for Elsa Gunter's theory of integers, take a look
at the file {\tt mk\_BINOMIAL\_integer.ml}, contained in directory \path{}.
When loaded into \HOL{}, this \ML{} file proves that the integers are a
ring---by quoting laws contained in the theory \verb@integer@---and then
derives a Binomial Theorem for the integers---by Modus Ponens from the
general theorem \verb@BINOMIAL@.
