\chapter{Functional Programming in HOL}

Although on the surface this chapter is mainly concerned with how to write
functional programs in HOL and how to verify them, most of the
constructs and proof procedures introduced are general purpose and recur in
any specification or verification task.

The dedicated functional programmer should be warned: HOL offers only what
could be called {\em total functional programming}. All functions in HOL
must be total; lazy data structures are not directly available. On the
positive side, functions in HOL need not be computable: HOL is a
specification language that goes well beyond what can be expressed as a
program. However, for the time being we concentrate on the computable.

\section{An introductory theory}
\label{sec:intro-theory}

Functional programming needs datatypes and functions. Both of them can be
defined in a theory with a syntax reminiscent of languages like ML or
Haskell. As an example consider the theory in Fig.~\ref{fig:ToyList}.

\begin{figure}[htbp]
\begin{ttbox}\makeatother
\input{ToyList/ToyList.thy}\end{ttbox}
\caption{A theory of lists}
\label{fig:ToyList}
\end{figure}

HOL already has a predefined theory of lists called \texttt{List};
\texttt{ToyList} is merely a small fragment of it chosen as an example. In
contrast to what is recommended in \S\ref{sec:Basic:Theories},
\texttt{ToyList} is not based on \texttt{Main} but on \texttt{Datatype}, a
theory that contains everything required for datatype definitions but does
not have \texttt{List} as a parent, thus avoiding ambiguities caused by
defining lists twice.

The \ttindexbold{datatype} \texttt{list} introduces two constructors
\texttt{Nil} and \texttt{Cons}, the empty list and the operator that adds an
element to the front of a list. For example, the term \texttt{Cons True (Cons
  False Nil)} is a value of type \texttt{bool~list}, namely the list with the
elements \texttt{True} and \texttt{False}. Because this notation becomes
unwieldy very quickly, the datatype declaration is annotated with an
alternative syntax: instead of \texttt{Nil} and \texttt{Cons}~$x$~$xs$ we can
write \index{#@{\tt[]}|bold}\texttt{[]} and
\texttt{$x$~\#~$xs$}\index{#@{\tt\#}|bold}. In fact, this alternative syntax
is the standard syntax. Thus the list \texttt{Cons True (Cons False Nil)}
becomes \texttt{True \# False \# []}. The annotation \ttindexbold{infixr}
means that \texttt{\#} associates to the right, i.e.\ the term \texttt{$x$ \#
  $y$ \# $z$} is read as \texttt{$x$ \# ($y$ \# $z$)} and not as \texttt{($x$
  \# $y$) \# $z$}.

\begin{warn}
  Syntax annotations are a powerful but completely optional feature. You
  could drop them from theory \texttt{ToyList} and go back to the identifiers
  \texttt{Nil} and \texttt{Cons}. However, lists are such a central datatype
  that their syntax is highly customized. We recommend that novices should
  not use syntax annotations in their own theories.
\end{warn}

Next, the functions \texttt{app} and \texttt{rev} are declared. In contrast
to ML, Isabelle insists on explicit declarations of all functions (keyword
\ttindexbold{consts}).  (Apart from the declaration-before-use restriction,
the order of items in a theory file is unconstrained.) Function \texttt{app}
is annotated with concrete syntax too. Instead of the prefix syntax
\texttt{app}~$xs$~$ys$ the infix $xs$~\texttt{\at}~$ys$ becomes the preferred
form.

Both functions are defined recursively. The equations for \texttt{app} and
\texttt{rev} hardly need comments: \texttt{app} appends two lists and
\texttt{rev} reverses a list.  The keyword \ttindex{primrec} indicates that
the recursion is of a particularly primitive kind where each recursive call
peels off a datatype constructor from one of the arguments (see
\S\ref{sec:datatype}).  Thus the recursion always terminates, i.e.\ the
function is \bfindex{total}.
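
The equations themselves appear only in Fig.~\ref{fig:ToyList}. Assuming
they have the usual shape, namely \texttt{[] \at~ys = ys},
\texttt{(x \# xs) \at~ys = x \# (xs \at~ys)}, \texttt{rev [] = []} and
\texttt{rev (x \# xs) = rev xs \at~(x \# [])} (an assumption that is
consistent with the proof states shown in \S\ref{sec:intro-proof}), the
reversal of the two-element list from above can be unfolded by hand. The
following is a sketch of that evaluation, not output produced by Isabelle:
\[
\begin{array}{lcl}
\texttt{rev (True \# False \# [])}
 & = & \texttt{rev (False \# []) \at~(True \# [])} \\
 & = & \texttt{(rev [] \at~(False \# [])) \at~(True \# [])} \\
 & = & \texttt{([] \at~(False \# [])) \at~(True \# [])} \\
 & = & \texttt{(False \# []) \at~(True \# [])} \\
 & = & \texttt{False \# ([] \at~(True \# []))} \\
 & = & \texttt{False \# True \# []}
\end{array}
\]
Each step removes one constructor from the argument of \texttt{rev} or from
the first argument of \texttt{\at}, which is why the evaluation must stop.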

The termination requirement is absolutely essential in HOL, a logic of total
functions. If we were to drop it, inconsistencies could quickly arise: the
``definition'' $f(n) = f(n)+1$ immediately leads to $0 = 1$ by subtracting
$f(n)$ on both sides.
% However, this is a subtle issue that we cannot discuss here further.

\begin{warn}
  As we have indicated, the desire for total functions is not a gratuitously
  imposed restriction but an essential characteristic of HOL. It is only
  because of totality that reasoning in HOL is comparatively easy.  More
  generally, the philosophy in HOL is not to allow arbitrary axioms (such as
  function definitions whose totality has not been proved) because they
  quickly lead to inconsistencies. Instead, fixed constructs for introducing
  types and functions are offered (such as \texttt{datatype} and
  \texttt{primrec}) which are guaranteed to preserve consistency.
\end{warn}

A remark about syntax.  The textual definition of a theory follows a fixed
syntax with keywords like \texttt{datatype} and \texttt{end} (see
Fig.~\ref{fig:keywords} in Appendix~\ref{sec:Appendix} for a full list).
Embedded in this syntax are the types and formulae of HOL, whose syntax is
extensible, e.g.\ by new user-defined infix operators
(see~\S\ref{sec:infix-syntax}). To distinguish the two levels, everything
HOL-specific should be enclosed in \texttt{"}\dots\texttt{"}. The same holds
for identifiers that happen to be keywords, as in
\begin{ttbox}
consts "end" :: 'a list => 'a
\end{ttbox}
To lessen this burden, quotation marks around types can be dropped,
provided their syntax does not go beyond what is described in
\S\ref{sec:TypesTermsForms}. Types containing further operators, e.g.\
\label{startype} \texttt{*} for Cartesian products, need quotation marks.

When Isabelle prints a syntax error message, it refers to the HOL syntax as
the \bfindex{inner syntax}.

\section{An introductory proof}
\label{sec:intro-proof}

Having defined \texttt{ToyList}, we load it with the ML command
\begin{ttbox}
use_thy "ToyList";
\end{ttbox}
and are ready to prove a few simple theorems. This will illustrate not just
the basic proof commands but also the typical proof process.

\subsubsection*{Main goal: \texttt{rev(rev xs) = xs}}

Our goal is to show that reversing a list twice produces the original
list. Typing
\begin{ttbox}
\input{ToyList/thm}\end{ttbox}
establishes a new goal to be proved in the context of the current theory,
which is the one we just loaded. Isabelle's response is to print the current
proof state:
\begin{ttbox}
{\out Level 0}
{\out rev (rev xs) = xs}
{\out  1. rev (rev xs) = xs}
\end{ttbox}
Until we have finished a proof, the proof state always looks like this:
\begin{ttbox}
{\out Level \(i\)}
{\out \(G\)}
{\out  1. \(G@1\)}
{\out  \(\vdots\)}
{\out  \(n\). \(G@n\)}
\end{ttbox}
where \texttt{Level}~$i$ indicates that we are $i$ steps into the proof, $G$
is the overall goal that we are trying to prove, and the numbered lines
contain the subgoals $G@1$, \dots, $G@n$ that we need to prove to establish
$G$. At \texttt{Level 0} there is only one subgoal, which is identical with
the overall goal.  Normally $G$ is constant and only serves as a reminder.
Hence we rarely show it in this tutorial.

Let us now get back to \texttt{rev(rev xs) = xs}. Properties of recursively
defined functions are best established by induction. In this case there is
not much choice except to induct on \texttt{xs}:
\begin{ttbox}
\input{ToyList/inductxs}\end{ttbox}
This tells Isabelle to perform induction on variable \texttt{xs} in
subgoal~1. The new proof state contains two subgoals, namely the base case
(\texttt{Nil}) and the induction step (\texttt{Cons}):
\begin{ttbox}
{\out 1. rev (rev []) = []}
{\out 2. !!a list. rev (rev list) = list ==> rev (rev (a # list)) = a # list}
\end{ttbox}
The induction step is an example of the general format of a subgoal:
\begin{ttbox}
{\out  \(i\). !!\(x@1 \dots x@n\). {\it assumptions} ==> {\it conclusion}}
\end{ttbox}\index{==>@{\tt==>}|bold}
The prefix of bound variables \texttt{!!\(x@1 \dots x@n\)} can be ignored
most of the time, or simply treated as a list of variables local to this
subgoal. Their deeper significance is explained in \S\ref{sec:PCproofs}.  The
{\it assumptions} are the local assumptions for this subgoal and {\it
  conclusion} is the actual proposition to be proved. Typical proof steps
that add new assumptions are induction or case distinction. In our example
the only assumption is the induction hypothesis \texttt{rev (rev list) =
  list}, where \texttt{list} is a variable name chosen by Isabelle. If there
are multiple assumptions, they are enclosed in the bracket pair
\texttt{[|}\index{==>@\ttlbr|bold} and \texttt{|]}\index{==>@\ttrbr|bold}
and separated by semicolons.

Let us try to solve both goals automatically:
\begin{ttbox}
\input{ToyList/autotac}\end{ttbox}
This command tells Isabelle to apply a proof strategy called
\texttt{Auto_tac} to all subgoals. Essentially, \texttt{Auto_tac} tries to
`simplify' the subgoals.  In our case, subgoal~1 is solved completely (thanks
to the equation \texttt{rev [] = []}) and disappears; the simplified version
of subgoal~2 becomes the new subgoal~1:
\begin{ttbox}\makeatother
{\out 1. !!a list. rev(rev list) = list ==> rev(rev list @ a # []) = a # list}
\end{ttbox}
In order to simplify this subgoal further, a lemma suggests itself.
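Finishing the induction step on paper shows which lemma is missing. The
following sketch assumes that \texttt{rev} distributes over \texttt{\at} with
the arguments swapped (the statement of the first lemma below) and that the
defining equations have their usual shape, \texttt{[] \at~ys = ys},
\texttt{(x \# xs) \at~ys = x \# (xs \at~ys)} and
\texttt{rev (x \# xs) = rev xs \at~(x \# [])}:
\[
\begin{array}{lcll}
\texttt{rev (rev list \at~(a \# []))}
 & = & \texttt{rev (a \# []) \at~rev (rev list)} & \mbox{(first lemma)} \\
 & = & \texttt{(rev [] \at~(a \# [])) \at~rev (rev list)} & \mbox{(equation for \texttt{rev})} \\
 & = & \texttt{(a \# []) \at~rev (rev list)} & \mbox{(\texttt{rev [] = []}, \texttt{[] \at~ys = ys})} \\
 & = & \texttt{(a \# []) \at~list} & \mbox{(induction hypothesis)} \\
 & = & \texttt{a \# ([] \at~list)} & \mbox{(equation for \texttt{\at})} \\
 & = & \texttt{a \# list} & \mbox{(\texttt{[] \at~ys = ys})}
\end{array}
\]
Only the first step uses a fact that is not yet available to the simplifier.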

\subsubsection*{First lemma: \texttt{rev(xs \at~ys) = (rev ys) \at~(rev xs)}}

We start the proof as usual:
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\end{ttbox}
There are two variables that we could induct on: \texttt{xs} and
\texttt{ys}. Because \texttt{\at} is defined by recursion on
the first argument, \texttt{xs} is the correct one:
\begin{ttbox}
\input{ToyList/inductxs}\end{ttbox}
This time not even the base case is solved automatically:
\begin{ttbox}\makeatother
by(Auto_tac);
{\out 1. rev ys = rev ys @ []}
{\out 2. \dots}
\end{ttbox}
We need another lemma.

\subsubsection*{Second lemma: \texttt{xs \at~[] = xs}}

This time the canonical proof procedure
\begin{ttbox}\makeatother
\input{ToyList/lemma2}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
leads to the desired message \texttt{No subgoals!}:
\begin{ttbox}\makeatother
{\out Level 2}
{\out xs @ [] = xs}
{\out No subgoals!}
\end{ttbox}
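The reasoning behind this short proof can be replayed by hand. The sketch
below assumes the usual defining equations \texttt{[] \at~ys = ys} and
\texttt{(x \# xs) \at~ys = x \# (xs \at~ys)}, and names the variables
\texttt{a} and \texttt{list} as in the induction step shown earlier:
\[
\begin{array}{llcll}
\mbox{base case:} & \texttt{[] \at~[]} & = & \texttt{[]}
 & \mbox{(first equation)} \\
\mbox{induction step:} & \texttt{(a \# list) \at~[]} & = & \texttt{a \# (list \at~[])}
 & \mbox{(second equation)} \\
 & & = & \texttt{a \# list}
 & \mbox{(induction hypothesis \texttt{list \at~[] = list})}
\end{array}
\]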
Now we can give the lemma just proved a suitable name
\begin{ttbox}
\input{ToyList/qed2}\end{ttbox}
and tell Isabelle to use this lemma in all future proofs by simplification:
\begin{ttbox}
\input{ToyList/addsimps2}\end{ttbox}
Note that in the theorem \texttt{app_Nil2} the free variable \texttt{xs} has
been replaced by the unknown \texttt{?xs}, just as explained in
\S\ref{sec:variables}.

Going back to the proof of the first lemma
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
we find that this time \texttt{Auto_tac} solves the base case, but the
induction step merely simplifies to
\begin{ttbox}\makeatother
{\out 1. !!a list.}
{\out       rev (list @ ys) = rev ys @ rev list}
{\out       ==> (rev ys @ rev list) @ a # [] = rev ys @ rev list @ a # []}
\end{ttbox}
Now we need to remember that \texttt{\at} associates to the right, and that
\texttt{\#} and \texttt{\at} have the same priority (namely the \texttt{65}
in the definition of \texttt{ToyList}). Thus the conclusion really is
\begin{ttbox}\makeatother
{\out     ==> (rev ys @ rev list) @ (a # []) = rev ys @ (rev list @ (a # []))}
\end{ttbox}
and the missing lemma is associativity of \texttt{\at}.

\subsubsection*{Third lemma: \texttt{(xs \at~ys) \at~zs = xs \at~(ys \at~zs)}}

This time the canonical proof procedure
\begin{ttbox}\makeatother
\input{ToyList/lemma3}\end{ttbox}
succeeds without further ado. Again we name the lemma and add it to
the set of lemmas used during simplification:
\begin{ttbox}
\input{ToyList/qed3}\end{ttbox}
Now we can go back and prove the first lemma
\begin{ttbox}\makeatother
\input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
add it to the simplification lemmas
\begin{ttbox}
\input{ToyList/qed1}\end{ttbox}
and then solve our main theorem:
\begin{ttbox}\makeatother
\input{ToyList/thm}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}

\subsubsection*{Review}

This is the end of our toy proof. It should have familiarized you with
\begin{itemize}
\item the standard theorem proving procedure:
state a goal; proceed with the proof until a new lemma is required; prove that
lemma; come back to the original goal.
\item a specific procedure that works well for functional programs:
induction followed by all-out simplification via \texttt{Auto_tac}.
\item a basic repertoire of proof commands.
\end{itemize}


\section{Some helpful commands}
\label{sec:commands-and-hints}

This section discusses a few basic commands for manipulating the proof state
and can be skipped by casual readers.

There are two kinds of commands used during a proof: the actual proof
commands and auxiliary commands for examining the proof state and controlling
the display. Proof commands are always of the form
\texttt{by(\textit{tactic});}\indexbold{tactic} where \textbf{tactic} is a
synonym for ``theorem proving function''. Typical examples are
\texttt{induct_tac} and \texttt{Auto_tac}; the suffix \texttt{_tac} is
merely a mnemonic. Further tactics are introduced throughout the tutorial.

%Tactics can also be modified. For example,
%\begin{ttbox}
%by(ALLGOALS Asm_simp_tac);
%\end{ttbox}
%tells Isabelle to apply \texttt{Asm_simp_tac} to all subgoals. For more on
%tactics and how to combine them see~\S\ref{sec:Tactics}.

The most useful auxiliary commands are:
\begin{description}
\item[Printing the current state]
Type \texttt{pr();} to redisplay the current proof state, for example when it
has disappeared off the screen.
\item[Limiting the number of subgoals]
Typing \texttt{prlim $k$;} tells Isabelle to print only the first $k$
subgoals from now on and redisplays the current proof state. This is helpful
when there are many subgoals.
\item[Undoing] Typing \texttt{undo();} undoes the effect of the last
tactic.
\item[Context switch] Every proof happens in the context of a
  \bfindex{current theory}. By default, this is the last theory loaded. If
  you want to prove a theorem in the context of a different theory
  \texttt{T}, you need to type \texttt{context T.thy;}\index{*context|bold}
  first. Of course you need to change the context again if you want to go
  back to your original theory.
\item[Displaying types] We have already mentioned the flag
  \ttindex{show_types} above. It can also be useful for detecting typos in
  formulae early on. For example, if \texttt{show_types} is set and the goal
  \texttt{rev(rev xs) = xs} is started, Isabelle prints the additional output
\begin{ttbox}
{\out Variables:}
{\out   xs :: 'a list}
\end{ttbox}
which tells us that Isabelle has correctly inferred that
\texttt{xs} is a variable of list type. On the other hand, had we
made a typo as in \texttt{rev(re xs) = xs}, the response
\begin{ttbox}
{\out Variables:}
{\out   re :: 'a list => 'a list}
{\out   xs :: 'a list}
\end{ttbox}
would have alerted us because of the unexpected variable \texttt{re}.
\item[(Re)loading theories]\index{loading theories}\index{reloading theories}
Initially you load theory \texttt{T} by typing \ttindex{use_thy}~\texttt{"T";},
which loads all parent theories of \texttt{T} automatically, if they are not
loaded already. If you modify \texttt{T.thy} or \texttt{T.ML}, you can
reload it by typing \texttt{use_thy~"T";} again. This time, however, only
\texttt{T} is reloaded. If some of \texttt{T}'s parents have changed as well,
type \ttindexbold{update_thy}~\texttt{"T";} to reload \texttt{T} and all of
its parents that have changed (or have changed parents).
\end{description}
Further commands are found in the Reference Manual.


\section{Datatypes}
\label{sec:datatype}

Inductive datatypes are part of almost every non-trivial application of HOL.
First we take another look at a very important example, the datatype of
lists, before we turn to datatypes in general. The section closes with a
case study.


\subsection{Lists}

Lists are one of the essential datatypes in computing. Readers of this
tutorial and users of HOL need to be familiar with their basic operations.
Theory \texttt{ToyList} is only a small fragment of HOL's predefined theory
\texttt{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.
The latter contains many further operations. For example, the functions
\ttindexbold{hd} (`head') and \ttindexbold{tl} (`tail') return the first
element and the remainder of a list. (However, pattern-matching is usually
preferable to \texttt{hd} and \texttt{tl}.)  Theory \texttt{List} also
contains more syntactic sugar:
\texttt{[}$x@1$\texttt{,}\dots\texttt{,}$x@n$\texttt{]} abbreviates
$x@1$\texttt{\#}\dots\texttt{\#}$x@n$\texttt{\#[]}.  In the rest of the
tutorial we always use HOL's predefined lists.


\subsection{The general format}
\label{sec:general-datatype}

The general HOL \texttt{datatype} definition is of the form
\[
\mathtt{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
C@m~\tau@{m1}~\dots~\tau@{mk@m}
\]
where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct
constructor names and $\tau@{ij}$ are types; it is customary to capitalize
the first letter in constructor names. There are a number of
restrictions (for example, the type must not be empty) detailed
elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
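As an illustration, the list datatype of \S\ref{sec:intro-theory} can be read
as an instance of this scheme. The reading below is a sketch based on the
constructors \texttt{Nil} and \texttt{Cons} described there; it ignores the
syntax annotations:
\[
\begin{array}{l}
n = 1, \quad \alpha@1 = \mbox{\tt 'a}, \quad t = \mbox{\tt list}, \quad m = 2, \\
C@1 = \mbox{\tt Nil}, \quad k@1 = 0, \\
C@2 = \mbox{\tt Cons}, \quad k@2 = 2, \quad
\tau@{21} = \mbox{\tt 'a}, \quad \tau@{22} = \mbox{\tt 'a~list}
\end{array}
\]
Without annotations the declaration would therefore read
\texttt{datatype 'a list = Nil | Cons 'a ('a list)}.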

Laws about datatypes, such as \verb$[] ~= x#xs$ and \texttt{(x\#xs = y\#ys) =
  (x=y \& xs=ys)}, are used automatically during proofs by simplification.
The same is true for the equations in primitive recursive function
definitions.

\subsection{Primitive recursion}

Functions on datatypes are usually defined by recursion. In fact, most of the
time they are defined by what is called \bfindex{primitive recursion}.
The keyword \ttindexbold{primrec} is followed by a list of equations
\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
such that $C$ is a constructor of the datatype $t$ and all recursive calls of
$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
Isabelle immediately sees that $f$ terminates because one (fixed!) argument
becomes smaller with every recursive call. There must be exactly one equation
for each constructor.  Their order is immaterial.
A more general method for defining total recursive functions is explained in
\S\ref{sec:recdef}.
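
For instance, the second equation of \texttt{rev} from
\S\ref{sec:intro-theory}, assumed here to be
\texttt{rev (x \# xs) = rev xs \at~(x \# [])}, can be matched against the
scheme as follows (a sketch; there are no further arguments besides the
constructor pattern):
\[
\begin{array}{l}
f = \mbox{\tt rev}, \quad C = \mbox{\tt Cons}, \quad k = 2,
\quad y@1 = \mbox{\tt x}, \quad y@2 = \mbox{\tt xs}, \\
r = \texttt{rev xs \at~(x \# [])}
\end{array}
\]
The only recursive call in $r$ is \texttt{rev xs}, i.e.\ $f \, y@2$, so the
argument has lost its outermost \texttt{Cons}.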

\begin{exercise}
Given the datatype of binary trees
\begin{ttbox}
\input{Misc/tree}\end{ttbox}
define a function \texttt{mirror} that mirrors the structure of a binary tree
by swapping subtrees (recursively). Prove \texttt{mirror(mirror(t)) = t}.
\end{exercise}

\subsection{\texttt{case}-expressions}
\label{sec:case-expressions}

HOL also features \ttindexbold{case}-expressions for analyzing elements of a
datatype. For example,
\begin{ttbox}
case xs of [] => 0 | y#ys => y
\end{ttbox}
evaluates to \texttt{0} if \texttt{xs} is \texttt{[]} and to \texttt{y} if
\texttt{xs} is \texttt{y\#ys}. (Since the result in both branches must be of
the same type, it follows that \texttt{y::nat} and hence
\texttt{xs::(nat)list}.)

In general, if $e$ is a term of the datatype $t$ defined in
\S\ref{sec:general-datatype} above, the corresponding
\texttt{case}-expression analyzing $e$ is
\[
\begin{array}{rrcl}
\mbox{\tt case}~e~\mbox{\tt of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\
                           \vdots \\
                           \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m
\end{array}
\]

\begin{warn}
{\em All} constructors must be present, their order is fixed, and nested
patterns are not supported.  Violating these restrictions results in strange
error messages.
\end{warn}
\noindent
Nested patterns can be simulated by nested \texttt{case}-expressions: instead
of
\begin{ttbox}
case xs of [] => 0 | [x] => x | x#(y#zs) => y
\end{ttbox}
write
\begin{ttbox}
case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)
\end{ttbox}
Note that \texttt{case}-expressions should be enclosed in parentheses to
indicate their scope.
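
To check that the nested version behaves as intended, it can be evaluated by
hand on lists of length 0, 1 and 2. In the sketch below the numerals are
arbitrary sample values, and $c$ is an abbreviation introduced here for the
nested expression viewed as a function of \texttt{xs}:
\[
\begin{array}{lcl}
c(\texttt{[]}) & = & \texttt{0} \\
c(\texttt{5 \# []}) & = & \texttt{(case [] of [] => 5 | y\#zs => y)} \;=\; \texttt{5} \\
c(\texttt{5 \# 7 \# []}) & = & \texttt{(case 7 \# [] of [] => 5 | y\#zs => y)} \;=\; \texttt{7}
\end{array}
\]
These are the results that the three patterns \texttt{[]}, \texttt{[x]} and
\texttt{x\#(y\#zs)} of the unsupported version prescribe.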

\subsection{Structural induction}

Almost all the basic laws about a datatype are applied automatically during
simplification. Only induction is invoked by hand via \texttt{induct_tac},
which works for any datatype. In some cases, induction is overkill and a case
distinction over all constructors of the datatype suffices. This is performed
by \ttindexbold{exhaust_tac}. A trivial example:
\begin{ttbox}
\input{Misc/exhaust.ML}{\out 1. xs = [] ==> (case xs of [] => [] | y # ys => xs) = xs}
{\out 2. !!a list. xs = a # list ==> (case xs of [] => [] | y # ys => xs) = xs}
\input{Misc/autotac.ML}\end{ttbox}
Note that this particular case distinction could have been automated
completely. See~\S\ref{sec:SimpFeatures}.

\begin{warn}
  Induction is only allowed on a free variable that should not occur among
  the assumptions of the subgoal.  Exhaustion works for arbitrary terms.
\end{warn}
 | 
| 
 | 
   474  | 
  | 
| 
 | 
   475  | 
\subsection{Case study: boolean expressions}
 | 
| 
 | 
   476  | 
\label{sec:boolex}
 | 
| 
 | 
   477  | 
  | 
| 
 | 
   478  | 
The aim of this case study is twofold: it shows how to model boolean
  | 
| 
 | 
   479  | 
expressions and some algorithms for manipulating them, and it demonstrates
  | 
| 
 | 
   480  | 
the constructs introduced above.
  | 
| 
 | 
   481  | 
  | 
| 
 | 
   482  | 
\subsubsection{How can we model boolean expressions?}
 | 
| 
 | 
   483  | 
  | 
| 
 | 
   484  | 
We want to represent boolean expressions built up from variables and
  | 
| 
 | 
   485  | 
constants by negation and conjunction. The following datatype serves exactly
  | 
| 
 | 
   486  | 
that purpose:
  | 
| 
 | 
   487  | 
\begin{ttbox}
 | 
| 
 | 
   488  | 
\input{Ifexpr/boolex}\end{ttbox}
 | 
| 
 | 
   489  | 
The two constants are represented by the terms \texttt{Const~True} and
 | 
| 
 | 
   490  | 
\texttt{Const~False}. Variables are represented by terms of the form
 | 
| 
 | 
   491  | 
\texttt{Var}~$n$, where $n$ is a natural number (type \texttt{nat}).
 | 
| 
 | 
   492  | 
For example, the formula $P@0 \land \neg P@1$ is represented by the term
  | 
| 
 | 
   493  | 
\texttt{And~(Var~0)~(Neg(Var~1))}.
 | 
| 
 | 
   494  | 
  | 
| 
 | 
   495  | 
\subsubsection{What is the value of boolean expressions?}
 | 
| 
 | 
   496  | 
  | 
| 
 | 
   497  | 
The value of a boolean expression depends on the values of its variables.
  | 
| 
 | 
   498  | 
Hence the function \texttt{value} takes an additional parameter, an {\em
 | 
| 
 | 
   499  | 
  environment} of type \texttt{nat~=>~bool}, which maps variables to their
 | 
| 
 | 
   500  | 
values:
  | 
| 
 | 
   501  | 
\begin{ttbox}
 | 
| 
 | 
   502  | 
\input{Ifexpr/value}\end{ttbox}
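The defining equations of \texttt{value} are only included from a file here, so the following Python sketch spells out one plausible reading of them. The tagged-tuple encoding of \texttt{boolex} and the names \texttt{example} and \texttt{env} are assumptions made for illustration; they are not part of the HOL theory.

```python
# Boolean expressions as tagged tuples (an assumed encoding):
#   ("Const", b), ("Var", n), ("Neg", e), ("And", e1, e2)

def value(expr, env):
    """Evaluate a boolean expression; env maps variable numbers to booleans."""
    tag = expr[0]
    if tag == "Const":
        return expr[1]
    if tag == "Var":
        return env(expr[1])
    if tag == "Neg":
        return not value(expr[1], env)
    if tag == "And":
        return value(expr[1], env) and value(expr[2], env)
    raise ValueError(f"unknown constructor: {tag}")

# The term And (Var 0) (Neg (Var 1)) from the text.
example = ("And", ("Var", 0), ("Neg", ("Var", 1)))

def env(n):
    # Hypothetical environment: only variable 0 is true.
    return n == 0

print(value(example, env))
```

Passing the environment as a function mirrors the type \texttt{nat~=>~bool}; a dictionary would serve equally well in Python.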
 | 
| 
 | 
   503  | 
  | 
| 
 | 
   504  | 
\subsubsection{If-expressions}
 | 
| 
 | 
   505  | 
  | 
| 
 | 
   506  | 
An alternative and often more efficient (because in a certain sense
  | 
| 
 | 
   507  | 
canonical) representation is that of so-called \textit{If-expressions\/}, built up
 | 
| 
 | 
   508  | 
from constants (\texttt{CIF}), variables (\texttt{VIF}) and conditionals
 | 
| 
 | 
   509  | 
(\texttt{IF}):
 | 
| 
 | 
   510  | 
\begin{ttbox}
 | 
| 
 | 
   511  | 
\input{Ifexpr/ifex}\end{ttbox}
 | 
| 
 | 
   512  | 
The evaluation of If-expressions proceeds as for \texttt{boolex}:
 | 
| 
 | 
   513  | 
\begin{ttbox}
 | 
| 
 | 
   514  | 
\input{Ifexpr/valif}\end{ttbox}
 | 
| 
 | 
   515  | 
  | 
| 
 | 
   516  | 
\subsubsection{Transformation into and of If-expressions}
 | 
| 
 | 
   517  | 
  | 
| 
 | 
   518  | 
The type \texttt{boolex} is close to the customary representation of logical
 | 
| 
 | 
   519  | 
formulae, whereas \texttt{ifex} is designed for efficiency. Thus we need to
 | 
| 
 | 
   520  | 
translate from \texttt{boolex} into \texttt{ifex}:
 | 
| 
 | 
   521  | 
\begin{ttbox}
 | 
| 
 | 
   522  | 
\input{Ifexpr/bool2if}\end{ttbox}
 | 
| 
 | 
   523  | 
At last, we have something we can verify: that \texttt{bool2if} preserves the
 | 
| 
 | 
   524  | 
value of its argument.
  | 
| 
 | 
   525  | 
\begin{ttbox}
 | 
| 
 | 
   526  | 
\input{Ifexpr/bool2if.ML}\end{ttbox}
 | 
| 
 | 
   527  | 
The proof is canonical:
  | 
| 
 | 
   528  | 
\begin{ttbox}
 | 
| 
 | 
   529  | 
\input{Ifexpr/proof.ML}\end{ttbox}
 | 
| 
 | 
   530  | 
In fact, all proofs in this case study look exactly like this. Hence we do
  | 
| 
 | 
   531  | 
not show them below.
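
As an informal cross-check of the statement just proved, the translation can be tested exhaustively on small inputs. The Python sketch below assumes the usual equations for \texttt{value}, \texttt{valif} and \texttt{bool2if} (the included files are not reproduced here) and a tagged-tuple encoding of both datatypes; \texttt{preserves_value} is a hypothetical helper. Testing of this kind is no substitute for the inductive proof.

```python
from itertools import product

def value(b, env):
    """Evaluate a boolex term: ("Const", b), ("Var", n), ("Neg", e), ("And", e1, e2)."""
    tag = b[0]
    if tag == "Const":
        return b[1]
    if tag == "Var":
        return env(b[1])
    if tag == "Neg":
        return not value(b[1], env)
    return value(b[1], env) and value(b[2], env)   # "And"

def valif(e, env):
    """Evaluate an ifex term: ("CIF", b), ("VIF", n), ("IF", cond, then, else)."""
    tag = e[0]
    if tag == "CIF":
        return e[1]
    if tag == "VIF":
        return env(e[1])
    return valif(e[2], env) if valif(e[1], env) else valif(e[3], env)

def bool2if(b):
    """Translate boolex into ifex (assumed equations)."""
    tag = b[0]
    if tag == "Const":
        return ("CIF", b[1])
    if tag == "Var":
        return ("VIF", b[1])
    if tag == "Neg":
        return ("IF", bool2if(b[1]), ("CIF", False), ("CIF", True))
    return ("IF", bool2if(b[1]), bool2if(b[2]), ("CIF", False))   # "And"

def preserves_value(b, num_vars):
    """Compare both evaluations under every environment for variables 0..num_vars-1."""
    for bits in product([False, True], repeat=num_vars):
        env = lambda n, bits=bits: bits[n]
        if valif(bool2if(b), env) != value(b, env):
            return False
    return True

print(preserves_value(("And", ("Var", 0), ("Neg", ("Var", 1))), 2))
```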
  | 
| 
 | 
   532  | 
  | 
| 
 | 
   533  | 
More interesting is the transformation of If-expressions into a normal form
  | 
| 
 | 
   534  | 
where the first argument of \texttt{IF} cannot be another \texttt{IF} but
 | 
| 
 | 
   535  | 
must be a constant or variable. Such a normal form can be computed by
  | 
| 
 | 
   536  | 
repeatedly replacing a subterm of the form \texttt{IF~(IF~b~x~y)~z~u} by
 | 
| 
 | 
   537  | 
\texttt{IF b (IF x z u) (IF y z u)}, which has the same value. The following
 | 
| 
 | 
   538  | 
primitive recursive functions perform this task:
  | 
| 
 | 
   539  | 
\begin{ttbox}
 | 
| 
 | 
   540  | 
\input{Ifexpr/normif}
 | 
| 
 | 
   541  | 
\input{Ifexpr/norm}\end{ttbox}
 | 
| 
 | 
   542  | 
Their interplay is a bit tricky, and we leave it to the reader to develop an
  | 
| 
 | 
   543  | 
intuitive understanding. Fortunately, Isabelle can help us to verify that the
  | 
| 
 | 
   544  | 
transformation preserves the value of the expression:
  | 
| 
 | 
   545  | 
\begin{ttbox}
 | 
| 
 | 
   546  | 
\input{Ifexpr/norm.ML}\end{ttbox}
 | 
| 
 | 
   547  | 
The proof is canonical, provided we first show the following lemma (which
  | 
| 
 | 
   548  | 
also helps to understand what \texttt{normif} does) and make it available
 | 
| 
 | 
   549  | 
for simplification via \texttt{Addsimps}:
 | 
| 
 | 
   550  | 
\begin{ttbox}
 | 
| 
 | 
   551  | 
\input{Ifexpr/normif.ML}\end{ttbox}
 | 
| 
 | 
   552  | 
  | 
| 
 | 
   553  | 
But how can we be sure that \texttt{norm} really produces a normal form in
 | 
| 
 | 
   554  | 
the above sense? We have to prove
  | 
| 
 | 
   555  | 
\begin{ttbox}
 | 
| 
 | 
   556  | 
\input{Ifexpr/normal_norm.ML}\end{ttbox}
 | 
| 
 | 
   557  | 
where \texttt{normal} expresses that an If-expression is in normal form:
 | 
| 
 | 
   558  | 
\begin{ttbox}
 | 
| 
 | 
   559  | 
\input{Ifexpr/normal}\end{ttbox}
 | 
| 
 | 
   560  | 
Of course, this requires a lemma about normality of \texttt{normif}
 | 
| 
 | 
   561  | 
\begin{ttbox}
 | 
| 
 | 
   562  | 
\input{Ifexpr/normal_normif.ML}\end{ttbox}
 | 
| 
 | 
   563  | 
that has to be made available for simplification via \texttt{Addsimps}.
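
To build some intuition for the interplay of \texttt{norm} and \texttt{normif}, the following Python sketch implements one reading of the three functions and checks both properties on a small example. The equations are assumptions reconstructed from the description above (replace \texttt{IF~(IF~b~x~y)~z~u} by \texttt{IF b (IF x z u) (IF y z u)}); the tuple encoding and the example term \texttt{nested} are hypothetical.

```python
from itertools import product

def valif(e, env):
    """Evaluate an ifex term: ("CIF", b), ("VIF", n), ("IF", cond, then, else)."""
    if e[0] == "CIF":
        return e[1]
    if e[0] == "VIF":
        return env(e[1])
    return valif(e[2], env) if valif(e[1], env) else valif(e[3], env)

def normif(b, t, e):
    """Build a term equivalent to IF b t e whose outer conditions are not IFs."""
    if b[0] == "IF":
        return normif(b[1], normif(b[2], t, e), normif(b[3], t, e))
    return ("IF", b, t, e)

def norm(e):
    """Normalize an ifex term bottom-up."""
    if e[0] == "IF":
        return normif(e[1], norm(e[2]), norm(e[3]))
    return e

def normal(e):
    """No IF occurs as the first argument of another IF."""
    if e[0] == "IF":
        return e[1][0] != "IF" and normal(e[2]) and normal(e[3])
    return True

def same_value(e1, e2, num_vars):
    """Compare two terms under all environments for variables 0..num_vars-1."""
    return all(
        valif(e1, lambda n, bits=bits: bits[n]) == valif(e2, lambda n, bits=bits: bits[n])
        for bits in product([False, True], repeat=num_vars)
    )

nested = ("IF", ("IF", ("VIF", 0), ("CIF", True), ("VIF", 1)), ("VIF", 2), ("CIF", False))
print(normal(nested), normal(norm(nested)), same_value(nested, norm(nested), 3))
```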
 | 
| 
 | 
   564  | 
  | 
| 
 | 
   565  | 
How does one come up with the required lemmas? Try to prove the main theorems
  | 
| 
 | 
   566  | 
without them and study carefully what \texttt{Auto_tac} leaves unproved. This
 | 
| 
 | 
   567  | 
should provide the clue.
  | 
| 
 | 
   568  | 
The necessity of universal quantification (\texttt{!t e}) in the two lemmas
 | 
| 
 | 
   569  | 
is explained in \S\ref{sec:InductionHeuristics}.
 | 
| 
 | 
   570  | 
  | 
| 
 | 
   571  | 
\begin{exercise}
 | 
| 
 | 
   572  | 
  We strengthen the definition of a {\em normal\/} If-expression as follows:
 | 
| 
 | 
   573  | 
  the first argument of all \texttt{IF}s must be a variable. Adapt the above
 | 
| 
 | 
   574  | 
  development to this changed requirement. (Hint: you may need to formulate
  | 
| 
 | 
   575  | 
  some of the goals as implications (\texttt{-->}) rather than equalities
 | 
| 
 | 
   576  | 
  (\texttt{=}).)
 | 
| 
 | 
   577  | 
\end{exercise}
 | 
| 
 | 
   578  | 
  | 
| 
 | 
   579  | 
\section{Some basic types}
\subsection{Natural numbers}
\index{arithmetic|(}
The type \ttindexbold{nat} of natural numbers is predefined and behaves like
 | 
| 
 | 
   586  | 
\begin{ttbox}
 | 
| 
 | 
   587  | 
datatype nat = 0 | Suc nat
  | 
| 
 | 
   588  | 
\end{ttbox}
 | 
| 
 | 
   589  | 
In particular, there are \texttt{case}-expressions, for example
 | 
| 
 | 
   590  | 
\begin{ttbox}
 | 
| 
 | 
   591  | 
case n of 0 => 0 | Suc m => m
  | 
| 
 | 
   592  | 
\end{ttbox}
 | 
| 
 | 
   593  | 
primitive recursion, for example
  | 
| 
 | 
   594  | 
\begin{ttbox}
 | 
| 
 | 
   595  | 
\input{Misc/natsum}\end{ttbox}
 | 
| 
 | 
   596  | 
and induction, for example
  | 
| 
 | 
   597  | 
\begin{ttbox}
 | 
| 
 | 
   598  | 
\input{Misc/NatSum.ML}\ttbreak
 | 
| 
 | 
   599  | 
{\out sum n + sum n = n * Suc n}
 | 
| 
 | 
   600  | 
{\out No subgoals!}
 | 
| 
 | 
   601  | 
\end{ttbox}
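The identity proved above can also be spot-checked numerically. The Python sketch below assumes that \texttt{sum} is defined by the equations \texttt{sum 0 = 0} and \texttt{sum (Suc n) = Suc n + sum n}; the included file is not reproduced here, and \texttt{sum_upto} and \texttt{identity_holds_below} are hypothetical names. A finite check like this only illustrates the statement, whereas the induction covers all natural numbers.

```python
def sum_upto(n):
    """Assumed primitive recursion: sum 0 = 0, sum (Suc n) = Suc n + sum n."""
    return 0 if n == 0 else n + sum_upto(n - 1)

def identity_holds_below(limit):
    """Check sum n + sum n = n * Suc n for all n < limit."""
    return all(sum_upto(n) + sum_upto(n) == n * (n + 1) for n in range(limit))

print(identity_holds_below(200))
```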
 | 
| 
 | 
   602  | 
  | 
| 
 | 
   603  | 
The usual arithmetic operations \ttindexbold{+}, \ttindexbold{-},
\ttindexbold{*}, \ttindexbold{div}, \ttindexbold{mod}, \ttindexbold{min} and
 | 
| 
 | 
   605  | 
\ttindexbold{max} are predefined, as are the relations \ttindexbold{<=} and
 | 
| 
 | 
   606  | 
\ttindexbold{<}. There is even a least number operation \ttindexbold{LEAST}.
 | 
| 
 | 
   607  | 
For example, \texttt{(LEAST n.$\,$1 < n) = 2} (HOL does not prove this
 | 
| 
 | 
   608  | 
completely automatically).
\begin{warn}
  The operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{*},
 | 
| 
 | 
   612  | 
  \ttindexbold{min}, \ttindexbold{max}, \ttindexbold{<=} and \ttindexbold{<}
 | 
| 
 | 
   613  | 
  are overloaded, i.e.\ they are available not just for natural numbers but
  | 
| 
 | 
   614  | 
  at other types as well (see \S\ref{sec:TypeClasses}). For example, given
  the goal \texttt{x+y = y+x}, there is nothing to indicate that you are
  talking about natural numbers.  Hence Isabelle can only infer that
  \texttt{x} and \texttt{y} are of some arbitrary type where \texttt{+} is
 | 
| 
 | 
   618  | 
  declared. As a consequence, you will be unable to prove the goal (although
  | 
| 
 | 
   619  | 
  it may take you some time to realize what has happened if
  | 
| 
 | 
   620  | 
  \texttt{show_types} is not set).  In this particular example, you need to
 | 
| 
 | 
   621  | 
  include an explicit type constraint, for example \texttt{x+y = y+(x::nat)}.
 | 
| 
 | 
   622  | 
  If there is enough contextual information this may not be necessary:
  | 
| 
 | 
   623  | 
  \texttt{x+0 = x} automatically implies \texttt{x::nat}.
 | 
| 
 | 
   624  | 
\end{warn}
Simple arithmetic goals are proved automatically by both \texttt{Auto_tac}
 | 
| 
 | 
   627  | 
and the simplification tactics introduced in \S\ref{sec:Simplification}.  For
 | 
| 
 | 
   628  | 
example, the goal
  | 
| 
 | 
   629  | 
\begin{ttbox}
 | 
| 
 | 
   630  | 
\input{Misc/arith1.ML}\end{ttbox}
 | 
| 
 | 
   631  | 
is proved automatically. The main restriction is that only addition is taken
  | 
| 
 | 
   632  | 
into account; other arithmetic operations and quantified formulae are ignored.
  | 
| 
 | 
   633  | 
  | 
| 
 | 
   634  | 
For more complex goals, there is the special tactic \ttindexbold{arith_tac}. It
 | 
| 
 | 
   635  | 
proves arithmetic goals involving the usual logical connectives (\verb$~$,
  | 
| 
 | 
   636  | 
\verb$&$, \verb$|$, \verb$-->$), the relations \texttt{<=} and \texttt{<},
 | 
| 
 | 
   637  | 
and the operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{min} and
 | 
| 
 | 
   638  | 
\ttindexbold{max}. For example, it can prove
 | 
| 
 | 
   639  | 
\begin{ttbox}
 | 
| 
 | 
   640  | 
\input{Misc/arith2.ML}\end{ttbox}
 | 
| 
 | 
   641  | 
because \texttt{k*k} can be treated as atomic.
 | 
| 
 | 
   642  | 
In contrast, $n*n = n \Longrightarrow n=0 \lor n=1$ cannot be proved
even by \texttt{arith_tac} because the proof relies essentially on
 | 
| 
 | 
   644  | 
properties of multiplication.
  | 
| 
 | 
   645  | 
  | 
| 
 | 
   646  | 
\begin{warn}
 | 
| 
 | 
   647  | 
  The running time of \texttt{arith_tac} is exponential in the number of
 | 
| 
 | 
   648  | 
  occurrences of \ttindexbold{-}, \ttindexbold{min} and \ttindexbold{max}
 | 
| 
 | 
   649  | 
  because they are first eliminated by case distinctions.
  | 
| 
 | 
   650  | 
  | 
| 
 | 
   651  | 
  \texttt{arith_tac} is incomplete even for the restricted class of formulae
 | 
| 
 | 
   652  | 
  described above (known as ``linear arithmetic''). If divisibility plays a
  | 
| 
 | 
   653  | 
  role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$.
  | 
| 
 | 
   654  | 
  Fortunately, such examples are rare in practice.
  | 
| 
 | 
   655  | 
\end{warn}
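Both formulae mentioned above as lying outside the reach of \texttt{arith_tac} are nevertheless valid, which a brute-force search over a finite range makes plausible. The Python sketch below is only such a plausibility check, not a proof; the function names and the bounds are arbitrary choices.

```python
def square_fixed_points(limit):
    """All n < limit with n*n = n; the text claims these are exactly 0 and 1."""
    return [n for n in range(limit) if n * n == n]

def parity_counterexamples(limit):
    """All pairs (m, n) below limit with m+m = n+n+1; the text claims there are none."""
    return [(m, n) for m in range(limit) for n in range(limit) if m + m == n + n + 1]

print(square_fixed_points(50))
print(parity_counterexamples(30))
```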
 | 
| 
 | 
   656  | 
  | 
| 
 | 
   657  | 
\index{arithmetic|)}
\subsection{Products}
 | 
| 
 | 
   661  | 
  | 
| 
 | 
   662  | 
HOL also has pairs: \texttt{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ *
 | 
| 
 | 
   663  | 
$\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair
  | 
| 
 | 
   664  | 
are extracted by \texttt{fst} and \texttt{snd}:
 | 
| 
 | 
   665  | 
\texttt{fst($x$,$y$) = $x$} and \texttt{snd($x$,$y$) = $y$}. Tuples
 | 
| 
 | 
   666  | 
are simulated by pairs nested to the right: 
  | 
| 
 | 
   667  | 
\texttt{($a@1$,$a@2$,$a@3$)} and \texttt{$\tau@1$ * $\tau@2$ * $\tau@3$}
 | 
| 
 | 
   668  | 
stand for \texttt{($a@1$,($a@2$,$a@3$))} and \texttt{$\tau@1$ * ($\tau@2$ *
 | 
| 
 | 
   669  | 
$\tau@3$)}. Therefore \texttt{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.
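
The nesting convention can be mimicked in Python, with the caveat that Python's own three-element tuples are flat, so the right-nested pair has to be built explicitly. The helper \texttt{triple} is a hypothetical name introduced for this sketch.

```python
def fst(p):
    """First component of a pair."""
    return p[0]

def snd(p):
    """Second component of a pair."""
    return p[1]

def triple(a1, a2, a3):
    """HOL-style triple: a pair whose second component is again a pair."""
    return (a1, (a2, a3))

t = triple("a1", "a2", "a3")
print(fst(snd(t)))   # selects the middle component
```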
 | 
| 
 | 
   670  | 
  | 
| 
 | 
   671  | 
It is possible to use (nested) tuples as patterns in abstractions, for
  | 
| 
 | 
   672  | 
example \texttt{\%(x,y,z).x+y+z} and \texttt{\%((x,y),z).x+y+z}.
 | 
| 
 | 
   673  | 
  | 
| 
 | 
   674  | 
In addition to explicit $\lambda$-abstractions, tuple patterns can be used in
  | 
| 
 | 
   675  | 
most variable binding constructs. Typical examples are
  | 
| 
 | 
   676  | 
\begin{ttbox}
 | 
| 
 | 
   677  | 
let (x,y) = f z in (y,x)
  | 
| 
 | 
   678  | 
  | 
| 
 | 
   679  | 
case xs of [] => 0 | (x,y)\#zs => x+y
  | 
| 
 | 
   680  | 
\end{ttbox}
 | 
| 
 | 
   681  | 
Further important examples are quantifiers and sets.
  | 
| 
 | 
   682  | 
  | 
| 
 | 
   683  | 
\begin{warn}
 | 
| 
 | 
   684  | 
Abstraction over pairs and tuples is merely a convenient shorthand for a more
  | 
| 
 | 
   685  | 
complex internal representation.  Thus the internal and external form of a
  | 
| 
 | 
   686  | 
term may differ, which can affect proofs. If you want to avoid this
  | 
| 
 | 
   687  | 
complication, use \texttt{fst} and \texttt{snd}, i.e.\ write
 | 
| 
 | 
   688  | 
\texttt{\%p.~fst p + snd p} instead of \texttt{\%(x,y).~x + y}.
 | 
| 
 | 
   689  | 
See~\S\ref{} for theorem proving with tuple patterns.
 | 
| 
 | 
   690  | 
\end{warn}
 | 
| 
 | 
   691  | 
  | 
| 
 | 
   692  | 
  | 
| 
 | 
   693  | 
\section{Definitions}
 | 
| 
 | 
   694  | 
\label{sec:Definitions}
 | 
| 
 | 
   695  | 
  | 
| 
 | 
   696  | 
A definition is simply an abbreviation, i.e.\ a new name for an existing
  | 
| 
 | 
   697  | 
construction. In particular, definitions cannot be recursive. Isabelle offers
  | 
| 
 | 
   698  | 
definitions on the level of types and terms. Those on the type level are
  | 
| 
 | 
   699  | 
called type synonyms, those on the term level are called (constant)
  | 
| 
 | 
   700  | 
definitions.
  | 
| 
 | 
   701  | 
  | 
| 
 | 
   702  | 
  | 
| 
 | 
   703  | 
\subsection{Type synonyms}
 | 
| 
 | 
   704  | 
\indexbold{type synonyms}
 | 
| 
 | 
   705  | 
  | 
| 
 | 
   706  | 
Type synonyms are similar to those found in ML. Their syntax is fairly
self-explanatory:
  | 
| 
 | 
   708  | 
\begin{ttbox}
 | 
| 
 | 
   709  | 
\input{Misc/types}\end{ttbox}\indexbold{*types}
 | 
| 
 | 
   710  | 
The synonym \texttt{alist} shows that in general the type on the right-hand
 | 
| 
 | 
   711  | 
side needs to be enclosed in double quotation marks
  | 
| 
 | 
   712  | 
(see the end of~\S\ref{sec:intro-theory}).
 | 
| 
 | 
   713  | 
  | 
| 
 | 
   714  | 
Internally all synonyms are fully expanded.  As a consequence Isabelle's
  | 
| 
 | 
   715  | 
output never contains synonyms.  Their main purpose is to improve the
  | 
| 
 | 
   716  | 
readability of theory definitions.  Synonyms can be used just like any other
  | 
| 
 | 
   717  | 
type:
  | 
| 
 | 
   718  | 
\begin{ttbox}
 | 
| 
 | 
   719  | 
\input{Misc/consts}\end{ttbox}
 | 
| 
 | 
   720  | 
  | 
| 
 | 
   721  | 
\subsection{Constant definitions}
 | 
| 
 | 
   722  | 
\label{sec:ConstDefinitions}
 | 
| 
 | 
   723  | 
  | 
| 
 | 
   724  | 
The above constants \texttt{nand} and \texttt{exor} are non-recursive and can
 | 
| 
 | 
   725  | 
therefore be defined directly by
  | 
| 
 | 
   726  | 
\begin{ttbox}
 | 
| 
 | 
   727  | 
\input{Misc/defs}\end{ttbox}\indexbold{*defs}
 | 
| 
 | 
   728  | 
where \texttt{defs} is a keyword and \texttt{nand_def} and \texttt{exor_def}
 | 
| 
 | 
   729  | 
are arbitrary user-supplied names.
  | 
| 
 | 
   730  | 
The symbol \texttt{==}\index{==>@{\tt==}|bold} is a special form of equality
 | 
| 
 | 
   731  | 
that should only be used in constant definitions.
  | 
| 
 | 
   732  | 
Declarations and definitions can also be merged
  | 
| 
 | 
   733  | 
\begin{ttbox}
 | 
| 
 | 
   734  | 
\input{Misc/constdefs}\end{ttbox}\indexbold{*constdefs}
 | 
| 
 | 
   735  | 
in which case the default name of each definition is $f$\texttt{_def}, where
 | 
| 
 | 
   736  | 
$f$ is the name of the defined constant.
  | 
| 
 | 
   737  | 
  | 
| 
 | 
   738  | 
Note that pattern-matching is not allowed, i.e.\ each definition must be of
  | 
| 
 | 
   739  | 
the form $f\,x@1\,\dots\,x@n$\texttt{~==~}$t$.
 | 
| 
 | 
   740  | 
  | 
| 
 | 
   741  | 
Section~\ref{sec:Simplification} explains how definitions are used
 | 
| 
 | 
   742  | 
in proofs.
  | 
| 
 | 
   743  | 
  | 
| 
 | 
   744  | 
\begin{warn}
 | 
| 
 | 
   745  | 
A common mistake when writing definitions is to introduce extra free variables
  | 
| 
 | 
   746  | 
on the right-hand side as in the following fictitious definition:
  | 
| 
 | 
   747  | 
\begin{ttbox}
 | 
| 
 | 
   748  | 
defs  prime_def "prime(p) == (m divides p) --> (m=1 | m=p)"
  | 
| 
 | 
   749  | 
\end{ttbox}
 | 
| 
 | 
   750  | 
Isabelle rejects this `definition' because of the extra {\tt m} on the
 | 
| 
 | 
   751  | 
right-hand side, which would introduce an inconsistency.  What you should have
  | 
| 
 | 
   752  | 
written is
  | 
| 
 | 
   753  | 
\begin{ttbox}
 | 
| 
 | 
   754  | 
defs  prime_def "prime(p) == !m. (m divides p) --> (m=1 | m=p)"
  | 
| 
 | 
   755  | 
\end{ttbox}
 | 
| 
 | 
   756  | 
\end{warn}
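The role of the quantifier can be illustrated in Python, where an unbound \texttt{m} would simply be a name error and the universal quantifier becomes an explicit \texttt{all}. This sketch is restricted to $p \ge 1$, where every divisor lies between $1$ and $p$, so a bounded search is exhaustive; \texttt{divides} and \texttt{prime} are illustrative functions, not the HOL constants.

```python
def divides(m, p):
    """m divides p, for m >= 1."""
    return p % m == 0

def prime(p):
    """Literal reading of: !m. (m divides p) --> (m = 1 | m = p), for p >= 1."""
    if p < 1:
        raise ValueError("this sketch only covers p >= 1")
    return all(m == 1 or m == p for m in range(1, p + 1) if divides(m, p))

# Note that 1 satisfies this literal definition, since its only divisor is 1.
print([p for p in range(1, 20) if prime(p)])
```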
 | 
| 
 | 
   757  | 
  | 
| 
 | 
   758  | 
  | 
| 
 | 
   759  | 
  | 
| 
 | 
   760  | 
  | 
| 
 | 
   761  | 
\chapter{More Functional Programming}
 | 
| 
 | 
   762  | 
  | 
| 
 | 
   763  | 
The purpose of this chapter is to deepen the reader's understanding of the
  | 
| 
 | 
   764  | 
concepts encountered so far and to introduce an advanced method for defining
  | 
| 
 | 
   765  | 
recursive functions. The first two sections give a structured presentation of
  | 
| 
 | 
   766  | 
theorem proving by simplification (\S\ref{sec:Simplification}) and
 | 
| 
 | 
   767  | 
discuss important heuristics for induction (\S\ref{sec:InductionHeuristics}). They
 | 
| 
 | 
   768  | 
can be skipped by readers less interested in proofs. They are followed by a
  | 
| 
 | 
   769  | 
case study, a compiler for expressions (\S\ref{sec:ExprCompiler}).
 | 
| 
 | 
   770  | 
Finally we present a very general method for defining recursive functions
  | 
| 
 | 
   771  | 
that goes well beyond what \texttt{primrec} allows (\S\ref{sec:recdef}).
 | 
| 
 | 
   772  | 
  | 
| 
 | 
   773  | 
  | 
| 
 | 
   774  | 
\section{Simplification}
 | 
| 
 | 
   775  | 
\label{sec:Simplification}
 | 
| 
 | 
   776  | 
  | 
| 
 | 
   777  | 
So far we have proved our theorems by \texttt{Auto_tac}, which
 | 
| 
 | 
   778  | 
`simplifies' all subgoals. In fact, \texttt{Auto_tac} can do much more than
 | 
| 
 | 
   779  | 
that, except that it did not need to so far. However, when you go beyond toy
  | 
| 
 | 
   780  | 
examples, you need to understand the ingredients of \texttt{Auto_tac}.
 | 
| 
 | 
   781  | 
This section covers the tactic that \texttt{Auto_tac} always applies first,
 | 
| 
 | 
   782  | 
namely simplification.
  | 
| 
 | 
   783  | 
  | 
| 
 | 
   784  | 
Simplification is one of the central theorem proving tools in Isabelle and
  | 
| 
 | 
   785  | 
many other systems. The tool itself is called the \bfindex{simplifier}. The
 | 
| 
 | 
   786  | 
purpose of this section is twofold: to introduce the many features of the
  | 
| 
 | 
   787  | 
simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the
 | 
| 
 | 
   788  | 
simplifier works (\S\ref{sec:SimpHow}).  Anybody intending to use HOL should
 | 
| 
 | 
   789  | 
read \S\ref{sec:SimpFeatures}, and the serious student should read
 | 
| 
 | 
   790  | 
\S\ref{sec:SimpHow} as well in order to understand what has happened when
things do not simplify as expected.
  | 
| 
 | 
   792  | 
  | 
| 
 | 
   793  | 
  | 
| 
 | 
   794  | 
\subsection{Using the simplifier}
 | 
| 
 | 
   795  | 
\label{sec:SimpFeatures}
 | 
| 
 | 
   796  | 
  | 
| 
 | 
   797  | 
In its most basic form, simplification means repeated application of
  | 
| 
 | 
   798  | 
equations from left to right. For example, taking the rules for \texttt{\at}
 | 
| 
 | 
   799  | 
and applying them to the term \texttt{[0,1] \at\ []} results in a sequence of
 | 
| 
 | 
   800  | 
simplification steps:
  | 
| 
 | 
   801  | 
\begin{ttbox}\makeatother
 | 
| 
 | 
   802  | 
(0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
  | 
| 
 | 
   803  | 
\end{ttbox}
 | 
| 
 | 
   804  | 
This is also known as {\em term rewriting} and the equations are referred
 | 
| 
 | 
   805  | 
to as {\em rewrite rules}. `Rewriting' is more honest than `simplification' because
 | 
| 
 | 
   806  | 
the terms do not necessarily become simpler in the process.
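
The rewriting sequence shown above can be reproduced with a few lines of Python. The sketch assumes the two usual equations for append, \texttt{[] @ ys = ys} and \texttt{(x\#xs) @ ys = x \# (xs @ ys)}, and a tuple encoding of list terms; it is a toy rewriter and says nothing about how Isabelle's simplifier selects redexes.

```python
NIL = ("nil",)

def cons(x, xs):
    return ("cons", x, xs)

def app(xs, ys):
    return ("app", xs, ys)

def step(t):
    """Apply one rewrite rule at the first redex found, or return None."""
    if t[0] == "app":
        left, right = t[1], t[2]
        if left[0] == "nil":                 # [] @ ys = ys
            return right
        if left[0] == "cons":                # (x#xs) @ ys = x # (xs @ ys)
            return cons(left[1], app(left[2], right))
        inner = step(left)
        return None if inner is None else app(inner, right)
    if t[0] == "cons":
        inner = step(t[2])
        return None if inner is None else cons(t[1], inner)
    return None

def trace(t):
    """All terms visited while rewriting t until no rule applies."""
    terms = [t]
    while True:
        nxt = step(terms[-1])
        if nxt is None:
            return terms
        terms.append(nxt)

start = app(cons(0, cons(1, NIL)), NIL)      # (0#1#[]) @ []
for term in trace(start):
    print(term)
```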
  | 
| 
 | 
   807  | 
  | 
| 
 | 
   808  | 
\subsubsection{Simpsets}
 | 
| 
 | 
   809  | 
  | 
| 
 | 
   810  | 
To facilitate simplification, each theory has an associated set of
  | 
| 
 | 
   811  | 
simplification rules, known as a \bfindex{simpset}. Within a theory,
 | 
| 
 | 
   812  | 
proofs by simplification refer to the associated simpset by default.
  | 
| 
 | 
   813  | 
The simpset of a theory is built up as follows: starting with the union of
  | 
| 
 | 
   814  | 
the simpsets of the parent theories, each occurrence of a \texttt{datatype}
 | 
| 
 | 
   815  | 
or \texttt{primrec} construct augments the simpset. Explicit definitions are
 | 
| 
 | 
   816  | 
not added automatically. Users can add new theorems via \texttt{Addsimps} and
 | 
| 
 | 
   817  | 
delete them again later by \texttt{Delsimps}.
 | 
| 
 | 
   818  | 
  | 
| 
 | 
   819  | 
You may augment a simpset not just by equations but by pretty much any
  | 
| 
 | 
   820  | 
theorem. The simplifier will try to make sense of it.  For example, a theorem
  | 
| 
 | 
   821  | 
\verb$~$$P$ is automatically turned into \texttt{$P$ = False}. The details are
 | 
| 
 | 
   822  | 
explained in \S\ref{sec:SimpHow}.
 | 
| 
 | 
   823  | 
  | 
| 
 | 
   824  | 
As a rule of thumb, rewrite rules that really simplify a term (like
  | 
| 
 | 
   825  | 
\texttt{xs \at\ [] = xs} and \texttt{rev(rev xs) = xs}) should be added to the
 | 
| 
 | 
   826  | 
current simpset right after they have been proved.  Those of a more specific
  | 
| 
 | 
   827  | 
nature (e.g.\ distributivity laws, which alter the structure of terms
  | 
| 
 | 
   828  | 
considerably) should only be added for specific proofs and deleted again
  | 
| 
 | 
   829  | 
afterwards.  Conversely, it may also happen that a generally useful rule
  | 
| 
 | 
   830  | 
needs to be removed for a certain proof and is added again afterwards.  The
  | 
| 
 | 
   831  | 
need for frequent temporary additions or deletions may indicate a badly
  | 
| 
 | 
   832  | 
designed simpset.
  | 
| 
 | 
   833  | 
\begin{warn}
 | 
| 
 | 
   834  | 
  Simplification may not terminate, for example if both $f(x) = g(x)$ and
  | 
| 
 | 
   835  | 
  $g(x) = f(x)$ are in the simpset. It is the user's responsibility not to
  | 
| 
 | 
   836  | 
  include rules that can lead to nontermination, either on their own or in
  | 
| 
 | 
   837  | 
  combination with other rules.
  | 
| 
 | 
   838  | 
\end{warn}
 | 
| 
 | 
   839  | 
  | 
| 
 | 
   840  | 
\subsubsection{Simplification tactics}
 | 
| 
 | 
   841  | 
  | 
| 
 | 
   842  | 
There are four main simplification tactics:
  | 
| 
 | 
   843  | 
\begin{ttdescription}
 | 
| 
 | 
   844  | 
\item[\ttindexbold{Simp_tac} $i$] simplifies the conclusion of subgoal~$i$
 | 
| 
 | 
   845  | 
  using the theory's simpset.  It may solve the subgoal completely if it has
  | 
| 
 | 
   846  | 
  become trivial. For example:
  | 
| 
 | 
   847  | 
\begin{ttbox}\makeatother
 | 
| 
 | 
   848  | 
{\out 1. [] @ [] = []}
 | 
| 
 | 
   849  | 
by(Simp_tac 1);
  | 
| 
 | 
   850  | 
{\out No subgoals!}
 | 
| 
 | 
   851  | 
\end{ttbox}
 | 
| 
 | 
   852  | 
  | 
| 
 | 
   853  | 
\item[\ttindexbold{Asm_simp_tac}]
 | 
| 
 | 
   854  | 
  is like \verb$Simp_tac$, but extracts additional rewrite rules from
  | 
| 
 | 
   855  | 
  the assumptions of the subgoal. For example, it solves
  | 
| 
 | 
   856  | 
\begin{ttbox}\makeatother
 | 
| 
 | 
   857  | 
{\out 1. xs = [] ==> xs @ xs = xs}
 | 
| 
 | 
   858  | 
\end{ttbox}
 | 
| 
 | 
   859  | 
which \texttt{Simp_tac} does not do.
 | 
| 
 | 
   860  | 
  
  | 
| 
 | 
   861  | 
\item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also
 | 
| 
 | 
   862  | 
  simplifies the assumptions (without using the assumptions to
  | 
| 
 | 
   863  | 
  simplify each other or the actual goal).
  | 
| 
 | 
   864  | 
  | 
| 
 | 
   865  | 
\item[\ttindexbold{Asm_full_simp_tac}] is like \verb$Asm_simp_tac$,
 | 
| 
 | 
   866  | 
  but also simplifies the assumptions. In particular, assumptions can
  | 
| 
 | 
   867  | 
  simplify each other. For example:
  | 
| 
 | 
   868  | 
\begin{ttbox}\makeatother
 | 
| 
 | 
   869  | 
{\out 1. [| xs @ zs = ys @ xs; [] @ xs = [] @ [] |] ==> ys = zs}
 | 
| 
 | 
   870  | 
by(Asm_full_simp_tac 1);
  | 
| 
 | 
   871  | 
{\out No subgoals!}
 | 
| 
 | 
   872  | 
\end{ttbox}
 | 
| 
 | 
   873  | 
The second assumption simplifies to \texttt{xs = []}, which in turn
 | 
| 
 | 
   874  | 
simplifies the first assumption to \texttt{zs = ys}, thus reducing the
 | 
| 
 | 
   875  | 
conclusion to \texttt{ys = ys} and hence to \texttt{True}.
 | 
| 
 | 
   876  | 
(See also the paragraph on tracing below.)
  | 
| 
 | 
   877  | 
\end{ttdescription}
 | 
| 
 | 
   878  | 
\texttt{Asm_full_simp_tac} is the most powerful of this quartet of
 | 
| 
 | 
   879  | 
tactics. In fact, \texttt{Auto_tac} starts by applying
 | 
| 
 | 
   880  | 
\texttt{Asm_full_simp_tac} to all subgoals. The only reason for the existence
 | 
| 
 | 
   881  | 
of the other three tactics is that sometimes one wants to limit the amount of
  | 
| 
 | 
   882  | 
simplification, for example to avoid nontermination:
  | 
| 
 | 
   883  | 
\begin{ttbox}\makeatother
 | 
| 
 | 
   884  | 
{\out  1. ! x. f x = g (f (g x)) ==> f [] = f [] @ []}
 | 
| 
 | 
   885  | 
\end{ttbox}
 | 
| 
 | 
   886  | 
is solved by \texttt{Simp_tac}, but \texttt{Asm_simp_tac} and
 | 
| 
 | 
   887  | 
\texttt{Asm_full_simp_tac} loop because the rewrite rule \texttt{f x = g(f(g
 | 
| 
 | 
   888  | 
x))} extracted from the assumption does not terminate.  Isabelle notices
  | 
| 
 | 
   889  | 
certain simple forms of nontermination, but not this one.
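
Why the extracted rule loops can be seen by applying it by hand: each application of \texttt{f x = g(f(g x))} creates a new instance of its own left-hand side. The Python sketch below replays this with an explicit step bound; the tuple encoding and the helper names are assumptions, and the real simplifier has no such bound.

```python
def rewrite_once(t):
    """Rewrite the first f-application with  f x = g(f(g x)),  or return None."""
    if t == "[]":
        return None
    head, arg = t
    if head == "f":
        return ("g", ("f", ("g", arg)))
    inner = rewrite_once(arg)
    return None if inner is None else (head, inner)

def size(t):
    """Number of symbols in a term."""
    return 1 if t == "[]" else 1 + size(t[1])

def sizes_while_rewriting(t, max_steps):
    """Term sizes observed during at most max_steps rewrite steps."""
    sizes = [size(t)]
    for _ in range(max_steps):
        nxt = rewrite_once(t)
        if nxt is None:
            break
        t = nxt
        sizes.append(size(t))
    return sizes

# Starting from f [], the term grows by two symbols per step and never stabilizes.
print(sizes_while_rewriting(("f", "[]"), 5))
```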
  | 
| 
 | 
   890  | 
 
  | 
| 
 | 
   891  | 
\subsubsection{Modifying simpsets locally}
 | 
| 
 | 
   892  | 
  | 
| 
 | 
   893  | 
If a certain theorem is needed in only one proof by simplification, the
  | 
| 
 | 
   894  | 
pattern
  | 
| 
 | 
   895  | 
\begin{ttbox}
 | 
| 
 | 
   896  | 
Addsimps [\(rare_theorem\)];
  | 
| 
 | 
   897  | 
by(Simp_tac 1);
  | 
| 
 | 
   898  | 
Delsimps [\(rare_theorem\)];
  | 
| 
 | 
   899  | 
\end{ttbox}
 | 
| 
 | 
   900  | 
is awkward. Therefore there are lower-case versions of the simplification
  | 
| 
 | 
   901  | 
tactics (\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac},
 | 
| 
 | 
   902  | 
\ttindexbold{full_simp_tac}, \ttindexbold{asm_full_simp_tac}) and of the
 | 
| 
 | 
   903  | 
simpset modifiers (\ttindexbold{addsimps}, \ttindexbold{delsimps})
 | 
| 
 | 
   904  | 
that do not access or modify the implicit simpset but explicitly take a
  | 
| 
 | 
   905  | 
simpset as an argument. For example, the above three lines become
  | 
| 
 | 
   906  | 
\begin{ttbox}
 | 
| 
 | 
   907  | 
by(simp_tac (simpset() addsimps [\(rare_theorem\)]) 1);
  | 
| 
 | 
   908  | 
\end{ttbox}
 | 
| 
 | 
   909  | 
where the result of the function call \texttt{simpset()} is the simpset of
 | 
| 
 | 
   910  | 
the current theory and \texttt{addsimps} is an infix function. The implicit
 | 
| 
 | 
   911  | 
simpset is read once but not modified.
  | 
| 
 | 
   912  | 
This is far preferable to pairs of \texttt{Addsimps} and \texttt{Delsimps}.
 | 
| 
 | 
   913  | 
Local modifications can be stacked as in
  | 
| 
 | 
   914  | 
\begin{ttbox}
 | 
| 
 | 
   915  | 
by(simp_tac (simpset() addsimps [\(rare_theorem\)] delsimps [\(some_thm\)]) 1);
  | 
| 
 | 
   916  | 
\end{ttbox}
 | 
| 
 | 
   917  | 
  | 
| 
 | 
   918  | 
\subsubsection{Rewriting with definitions}
 | 
| 
 | 
   919  | 
  | 
| 
 | 
   920  | 
Constant definitions (\S\ref{sec:ConstDefinitions}) are not automatically
 | 
| 
 | 
   921  | 
included in the simpset of a theory. Hence such definitions are not expanded
  | 
| 
 | 
   922  | 
automatically either, just as it should be: definitions are introduced for
  | 
| 
 | 
   923  | 
the purpose of abbreviating complex concepts. Of course we need to expand the
  | 
| 
 | 
   924  | 
definitions initially to derive enough lemmas that characterize the concept
  | 
| 
 | 
   925  | 
sufficiently for us to forget the original definition completely. For
  | 
| 
 | 
   926  | 
example, given the theory
  | 
| 
 | 
   927  | 
\begin{ttbox}
 | 
| 
 | 
   928  | 
\input{Misc/Exor.thy}\end{ttbox}
 | 
| 
 | 
   929  | 
we may want to prove \verb$exor A (~A)$. Instead of \texttt{Goal} we use
 | 
| 
 | 
   930  | 
\begin{ttbox}
 | 
| 
 | 
   931  | 
\input{Misc/exorgoal.ML}\end{ttbox}
 | 
| 
 | 
   932  | 
which tells Isabelle to expand the definition of \texttt{exor}---the first
 | 
| 
 | 
   933  | 
argument of \texttt{Goalw} can be a list of definitions---in the initial goal:
 | 
| 
 | 
   934  | 
\begin{ttbox}
 | 
| 
 | 
   935  | 
{\out exor A (~ A)}
 | 
| 
 | 
   936  | 
{\out  1. A & ~ ~ A | ~ A & ~ A}
 | 
| 
 | 
   937  | 
\end{ttbox}
 | 
| 
 | 
   938  | 
In this simple example, the goal is proved by \texttt{Simp_tac}.
 | 
| 
 | 
   939  | 
Of course the resulting theorem is insufficient to characterize \texttt{exor}
 | 
| 
 | 
   940  | 
completely.
  | 
| 
 | 
   941  | 
  | 
| 
 | 
   942  | 
In case we want to expand a definition in the middle of a proof, we can
  | 
| 
 | 
   943  | 
simply add the definition locally to the simpset:
  | 
| 
 | 
   944  | 
\begin{ttbox}
 | 
| 
 | 
   945  | 
\input{Misc/exorproof.ML}\end{ttbox}
 | 
| 
 | 
   946  | 
You should normally not add the definition permanently to the simpset
  | 
| 
 | 
   947  | 
using \texttt{Addsimps} because this defeats the whole purpose of an
 | 
| 
 | 
   948  | 
abbreviation.
  | 
| 
 | 
   949  | 
  | 
| 
 | 
   950  | 
\begin{warn}
 | 
| 
 | 
   951  | 
If you have defined $f\,x\,y$\texttt{~==~}$t$ then you can only expand
 | 
| 
 | 
   952  | 
occurrences of $f$ with at least two arguments. Thus it is safer to define
  | 
| 
 | 
   953  | 
$f$\texttt{~==~\%$x\,y$.}$\;t$.
 | 
| 
 | 
   954  | 
\end{warn}
 | 
| 
 | 
   955  | 
  | 
| 
 | 
   956  | 
\subsubsection{Simplifying \texttt{let}-expressions}
 | 
| 
 | 
   957  | 
  | 
| 
 | 
   958  | 
Proving a goal containing \ttindex{let}-expressions invariably requires the
 | 
| 
 | 
   959  | 
\texttt{let}-constructs to be expanded at some point. Since
 | 
| 
 | 
   960  | 
\texttt{let}-\texttt{in} is just syntactic sugar for a defined constant
 | 
| 
 | 
   961  | 
(called \texttt{Let}), expanding \texttt{let}-constructs means rewriting with
 | 
| 
 | 
   962  | 
\texttt{Let_def}:
 | 
| 
 | 
   963  | 
%context List.thy;
  | 
| 
 | 
   964  | 
%Goal "(let xs = [] in xs@xs) = ys";
  | 
| 
 | 
   965  | 
\begin{ttbox}\makeatother
 | 
| 
 | 
   966  | 
{\out  1. (let xs = [] in xs @ xs) = ys}
 | 
| 
 | 
   967  | 
by(simp_tac (simpset() addsimps [Let_def]) 1);
  | 
| 
 | 
   968  | 
{\out  1. [] = ys}
 | 
| 
 | 
   969  | 
\end{ttbox}
 | 
| 
 | 
   970  | 
If, in a particular context, there is no danger of a combinatorial explosion
  | 
| 
 | 
   971  | 
of nested \texttt{let}s, one could even add \texttt{Let_def} permanently via
 | 
| 
 | 
   972  | 
\texttt{Addsimps}.
 | 
| 
 | 
   973  | 
  | 
| 
 | 
   974  | 
\subsubsection{Conditional equations}

So far all examples of rewrite rules were equations. The simplifier also
accepts {\em conditional\/} equations, for example
\begin{ttbox}
xs ~= []  ==>  hd xs # tl xs = xs \hfill \((*)\)
\end{ttbox}
(which is proved by \texttt{exhaust_tac} on \texttt{xs} followed by
\texttt{Asm_full_simp_tac} twice). Assuming that this theorem together with
%\begin{ttbox}\makeatother
\texttt{(rev xs = []) = (xs = [])}
%\end{ttbox}
are part of the simpset, the subgoal
\begin{ttbox}\makeatother
{\out 1. xs ~= [] ==> hd(rev xs) # tl(rev xs) = rev xs}
\end{ttbox}
is proved by simplification:
the conditional equation $(*)$ above
can simplify \texttt{hd(rev~xs)~\#~tl(rev~xs)} to \texttt{rev xs}
because the corresponding precondition \verb$rev xs ~= []$ simplifies to
\verb$xs ~= []$, which is exactly the local assumption of the subgoal.


\subsubsection{Automatic case splits}

Goals containing \ttindex{if}-expressions are usually proved by case
distinction on the condition of the \texttt{if}. For example the goal
\begin{ttbox}
{\out 1. ! xs. if xs = [] then rev xs = [] else rev xs ~= []}
\end{ttbox}
can be split into
\begin{ttbox}
{\out 1. ! xs. (xs = [] --> rev xs = []) \& (xs ~= [] --> rev xs ~= [])}
\end{ttbox}
by typing
\begin{ttbox}
\input{Misc/splitif.ML}\end{ttbox}
Because this is almost always the right proof strategy, the simplifier
performs case-splitting on \texttt{if}s automatically. Try \texttt{Simp_tac}
on the initial goal above.

This splitting idea generalizes from \texttt{if} to \ttindex{case}:
\begin{ttbox}\makeatother
{\out 1. (case xs of [] => zs | y#ys => y#(ys@zs)) = xs@zs}
\end{ttbox}
becomes
\begin{ttbox}\makeatother
{\out 1. (xs = [] --> zs = xs @ zs) &}
{\out    (! a list. xs = a # list --> a # list @ zs = xs @ zs)}
\end{ttbox}
by typing
\begin{ttbox}
\input{Misc/splitlist.ML}\end{ttbox}
In contrast to \texttt{if}-expressions, the simplifier does not split
\texttt{case}-expressions by default because this can lead to nontermination
in case of recursive datatypes.
Nevertheless the simplifier can be instructed to perform \texttt{case}-splits
by adding the appropriate rule to the simpset:
\begin{ttbox}
by(simp_tac (simpset() addsplits [split_list_case]) 1);
\end{ttbox}\indexbold{*addsplits}
solves the initial goal outright, which \texttt{Simp_tac} alone will not do.

In general, every datatype $t$ comes with a rule
\texttt{$t$.split} that can be added to the simpset either
locally via \texttt{addsplits} (see above), or permanently via
\begin{ttbox}
Addsplits [\(t\).split];
\end{ttbox}\indexbold{*Addsplits}
Split-rules can be removed globally via \ttindexbold{Delsplits} and locally
via \ttindexbold{delsplits} as, for example, in
\begin{ttbox}
by(simp_tac (simpset() addsimps [\(\dots\)] delsplits [split_if]) 1);
\end{ttbox}


\subsubsection{Arithmetic}
\index{arithmetic}

The simplifier routinely solves a small class of linear arithmetic formulae
(over types \texttt{nat} and \texttt{int}): it only takes into account
assumptions and conclusions that are (possibly negated) (in)equalities
(\texttt{=}, \texttt{<=}, \texttt{<}) and it only knows about addition. Thus
\begin{ttbox}
\input{Misc/arith1.ML}\end{ttbox}
is proved by simplification, whereas the only slightly more complex
\begin{ttbox}
\input{Misc/arith3.ML}\end{ttbox}
is not proved by simplification and requires \texttt{arith_tac}.

\subsubsection{Permutative rewrite rules}

A rewrite rule is {\bf permutative} if the left-hand side and right-hand side
are the same up to renaming of variables.  The most common permutative rule
is commutativity: $x+y = y+x$.  Another example is $(x-y)-z = (x-z)-y$.  Such
rules are problematic because once they apply, they can be used forever.
The simplifier is aware of this danger and treats permutative rules
separately. For details see~\cite{isabelle-ref}.
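
One standard way to tame permutative rules is \emph{ordered rewriting}: an
instance of the rule is applied only if it makes the term strictly smaller
in some fixed total order on terms. The Python sketch below illustrates this
idea for commutativity. The term representation, the particular order, and
the names \texttt{key}, \texttt{commute_ordered} and \texttt{normalize} are
assumptions made for this illustration; it is not the simplifier's actual
implementation.

```python
# Terms: a variable is a string, a sum is the tuple ("+", left, right).

def key(t):
    """Total order on terms: variables sort before sums, then structurally."""
    if isinstance(t, str):
        return (0, t)
    return (1, key(t[1]), key(t[2]))

def commute_ordered(t):
    """Apply x+y = y+x at the root only if the result is strictly smaller."""
    if isinstance(t, tuple):
        swapped = ("+", t[2], t[1])
        if key(swapped) < key(t):
            return swapped
    return t

def normalize(t):
    """Rewrite bottom-up with the ordered commutativity rule."""
    if isinstance(t, str):
        return t
    t = ("+", normalize(t[1]), normalize(t[2]))
    return commute_ordered(t)
```

Because every application strictly decreases the term in the chosen order,
the rule cannot be used forever: \texttt{normalize(("+", "b", "a"))} swaps
once, and applying \texttt{normalize} to the result changes nothing.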

\subsubsection{Tracing}
\indexbold{tracing the simplifier}

Using the simplifier effectively may take a bit of experimentation.  Set the
\verb$trace_simp$ flag to get a better idea of what is going on:
\begin{ttbox}\makeatother
{\out  1. rev [x] = []}
\ttbreak
set trace_simp;
by(Simp_tac 1);
\ttbreak\makeatother
{\out Applying instance of rewrite rule:}
{\out rev (?x # ?xs) == rev ?xs @ [?x]}
{\out Rewriting:}
{\out rev [x] == rev [] @ [x]}
\ttbreak
{\out Applying instance of rewrite rule:}
{\out rev [] == []}
{\out Rewriting:}
{\out rev [] == []}
\ttbreak\makeatother
{\out Applying instance of rewrite rule:}
{\out [] @ ?y == ?y}
{\out Rewriting:}
{\out [] @ [x] == [x]}
\ttbreak
{\out Applying instance of rewrite rule:}
{\out ?x # ?t = ?t == False}
{\out Rewriting:}
{\out [x] = [] == False}
\ttbreak
{\out Level 1}
{\out rev [x] = []}
{\out  1. False}
\end{ttbox}
In more complicated cases, the trace can be enormous, especially since
invocations of the simplifier are often nested (e.g.\ when solving conditions
of rewrite rules).

\subsection{How it works}
\label{sec:SimpHow}

\subsubsection{Higher-order patterns}

\subsubsection{Local assumptions}

\subsubsection{The preprocessor}

\section{Induction heuristics}
\label{sec:InductionHeuristics}

The purpose of this section is to illustrate some simple heuristics for
inductive proofs. The first one we have already mentioned in our initial
example:
\begin{quote}
{\em 1. Theorems about recursive functions are proved by induction.}
\end{quote}
In case the function has more than one argument
\begin{quote}
{\em 2. Do induction on argument number $i$ if the function is defined by
recursion in argument number $i$.}
\end{quote}
When we look at the proof of
\begin{ttbox}\makeatother
(xs @ ys) @ zs = xs @ (ys @ zs)
\end{ttbox}
in \S\ref{sec:intro-proof} we find (a) \texttt{\at} is recursive in the first
argument, (b) \texttt{xs} occurs only as the first argument of \texttt{\at},
and (c) both \texttt{ys} and \texttt{zs} occur at least once as the second
argument of \texttt{\at}. Hence it is natural to perform induction on
\texttt{xs}.

The key heuristic, and the main point of this section, is to
generalize the goal before induction. The reason is simple: if the goal is
too specific, the induction hypothesis is too weak to allow the induction
step to go through. Let us now illustrate the idea with an example.

We define a tail-recursive version of list-reversal,
i.e.\ one that can be compiled into a loop:
\begin{ttbox}
\input{Misc/Itrev.thy}\end{ttbox}
The behaviour of \texttt{itrev} is simple: it reverses its first argument by
stacking its elements onto the second argument, and returning that second
argument when the first one becomes empty.
We need to show that \texttt{itrev} does indeed reverse its first argument
provided the second one is empty:
\begin{ttbox}
\input{Misc/itrev1.ML}\end{ttbox}
There is no choice as to the induction variable, and we immediately simplify:
\begin{ttbox}
\input{Misc/induct_auto.ML}\ttbreak\makeatother
{\out1. !!a list. itrev list [] = rev list\(\;\)==> itrev list [a] = rev list @ [a]}
\end{ttbox}
Just as predicted above, the overall goal, and hence the induction
hypothesis, is too weak to solve the induction step because of the fixed
\texttt{[]}. The corresponding heuristic:
\begin{quote}
{\em 3. Generalize goals for induction by replacing constants by variables.}
\end{quote}
Of course one cannot do this na\"{\i}vely: \texttt{itrev xs ys = rev xs} is
just not true --- the correct generalization is
\begin{ttbox}\makeatother
\input{Misc/itrev2.ML}\end{ttbox}
If \texttt{ys} is replaced by \texttt{[]}, the right-hand side simplifies to
\texttt{rev xs}, just as required.
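
The following Python sketch, which is not part of the theory files, mirrors
the two reversal functions so that the generalized statement can be tried on
concrete lists. The equations used for \texttt{itrev} are reconstructed from
the prose description of its behaviour; treat them as an assumption about the
contents of \texttt{Itrev.thy}.

```python
def rev(xs):
    """Naive reversal: rev [] = [], rev (x#xs) = rev xs @ [x]."""
    if not xs:
        return []
    return rev(xs[1:]) + [xs[0]]

def itrev(xs, ys):
    """Tail-recursive reversal, written as the loop it compiles to.

    Assumed equations: itrev [] ys = ys, itrev (x#xs) ys = itrev xs (x#ys).
    """
    while xs:
        xs, ys = xs[1:], [xs[0]] + ys
    return ys
```

On the sample lists \texttt{xs = [1, 2]} and \texttt{ys = [9]}, the call
\texttt{itrev(xs, ys)} agrees with \texttt{rev(xs) + ys} but not with
\texttt{rev(xs)}, which is why the na\"{\i}ve generalization fails. Such a
check is only a sanity test, of course, not a proof.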

In this particular instance it is easy to guess the right generalization,
but in more complex situations a good deal of creativity is needed. This is
the main source of complications in inductive proofs.

Although we now have two variables, only \texttt{xs} is suitable for
induction, and we repeat our above proof attempt. Unfortunately, we are still
not there:
\begin{ttbox}\makeatother
{\out 1. !!a list.}
{\out       itrev list ys = rev list @ ys}
{\out       ==> itrev list (a # ys) = rev list @ a # ys}
\end{ttbox}
The induction hypothesis is still too weak, but this time it takes no
intuition to generalize: the problem is that \texttt{ys} is fixed throughout
the subgoal, but the induction hypothesis needs to be applied with
\texttt{a \# ys} instead of \texttt{ys}. Hence we prove the theorem
for all \texttt{ys} instead of a fixed one:
\begin{ttbox}\makeatother
\input{Misc/itrev3.ML}\end{ttbox}
This time induction on \texttt{xs} followed by simplification succeeds. This
leads to another heuristic for generalization:
\begin{quote}
{\em 4. Generalize goals for induction by universally quantifying all free
variables {\em(except the induction variable itself!)}.}
\end{quote}
This prevents trivial failures like the above and does not change the
provability of the goal. Because it is not always required, and may even
complicate matters in some cases, this heuristic should not be
applied blindly.

A final point worth mentioning is the orientation of the equation we just
proved: the more complex notion (\texttt{itrev}) is on the left-hand
side, the simpler \texttt{rev} on the right-hand side. This constitutes
another, albeit weak, heuristic that is not restricted to induction:
\begin{quote}
  {\em 5. The right-hand side of an equation should (in some sense) be
    simpler than the left-hand side.}
\end{quote}
The heuristic is tricky to apply because it is not obvious that
\texttt{rev xs \at\ ys} is simpler than \texttt{itrev xs ys}. But see what
happens if you try to prove the symmetric equation!


\section{Case study: compiling expressions}
\label{sec:ExprCompiler}

The task is to develop a compiler from a generic type of expressions (built
up from variables, constants and binary operations) to a stack machine.  This
generic type of expressions is a generalization of the boolean expressions in
\S\ref{sec:boolex}.  This time we do not commit ourselves to a particular
type of variables or values but make them type parameters.  Neither is there
a fixed set of binary operations: instead the expression contains the
appropriate function itself.
\begin{ttbox}
\input{CodeGen/expr}\end{ttbox}
The three constructors represent constants, variables and the combination of
two subexpressions with a binary operation.

The value of an expression w.r.t.\ an environment that maps variables to
values is easily defined:
\begin{ttbox}
\input{CodeGen/value}\end{ttbox}

The stack machine has three instructions: load a constant value onto the
stack, load the contents of a certain address onto the stack, and apply a
binary operation to the two topmost elements of the stack, replacing them by
the result. As for \texttt{expr}, addresses and values are type parameters:
\begin{ttbox}
\input{CodeGen/instr}\end{ttbox}

The execution of the stack machine is modelled by a function \texttt{exec}
that takes a store (modelled as a function from addresses to values, just
like the environment for evaluating expressions), a stack (modelled as a
list) of values and a list of instructions and returns the stack at the end
of the execution --- the store remains unchanged:
\begin{ttbox}
\input{CodeGen/exec}\end{ttbox}
Recall that \texttt{hd} and \texttt{tl}
return the first element and the remainder of a list.

Because all functions are total, \texttt{hd} is defined even for the empty
list, although we do not know what the result is. Thus our model of the
machine always terminates properly, although the above definition does not
tell us much about the result in situations where \texttt{Apply} was executed
with fewer than two elements on the stack.

The compiler is a function from expressions to a list of instructions. Its
definition is pretty much obvious:
\begin{ttbox}\makeatother
\input{CodeGen/comp}\end{ttbox}

Now we have to prove the correctness of the compiler, i.e.\ that the
execution of a compiled expression results in the value of the expression:
\begin{ttbox}
exec s [] (comp e) = [value s e]
\end{ttbox}
This is generalized to
\begin{ttbox}
\input{CodeGen/goal2.ML}\end{ttbox}
and proved by induction on \texttt{e} followed by simplification, once we
have the following lemma about executing the concatenation of two instruction
sequences:
\begin{ttbox}\makeatother
\input{CodeGen/goal1.ML}\end{ttbox}
This requires induction on \texttt{xs} and ordinary simplification for the
base cases. In the induction step, simplification leaves us with a formula
that contains two \texttt{case}-expressions over instructions. Thus we add
automatic case splitting as well, which finishes the proof:
\begin{ttbox}
\input{CodeGen/simpsplit.ML}\end{ttbox}

We could now go back and prove \texttt{exec s [] (comp e) = [value s e]}
merely by simplification with the generalized version we just proved.
However, this is unnecessary because the generalized version fully subsumes
its instance.
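
To experiment with the case study outside Isabelle, the following Python
sketch models expressions, the stack machine and the compiler. The
constructor names (\texttt{Cex}, \texttt{Vex}, \texttt{Bex}, \texttt{Const},
\texttt{Load}, \texttt{Apply}), the tuple encoding, and the order in which
\texttt{comp} emits the code for the two subexpressions are assumptions made
for this illustration, since the actual definitions live in the included
theory files. The function is called \texttt{execute} to avoid clashing with
Python's built-in \texttt{exec}.

```python
# Expressions: ("Cex", v), ("Vex", a), ("Bex", f, e1, e2), f a binary function.
def value(env, e):
    """Value of expression e in environment env (a function on variables)."""
    if e[0] == "Cex":
        return e[1]
    if e[0] == "Vex":
        return env(e[1])
    _, f, e1, e2 = e
    return f(value(env, e1), value(env, e2))

# Instructions: ("Const", v), ("Load", a), ("Apply", f).
def execute(s, vs, instrs):
    """Run instrs on stack vs; the head of the list is the top of the stack."""
    for i in instrs:
        if i[0] == "Const":
            vs = [i[1]] + vs
        elif i[0] == "Load":
            vs = [s(i[1])] + vs
        else:  # Apply: combine the two topmost elements
            vs = [i[1](vs[0], vs[1])] + vs[2:]
    return vs

def comp(e):
    """Compile e; code for e2 runs first so that e1's value ends up on top."""
    if e[0] == "Cex":
        return [("Const", e[1])]
    if e[0] == "Vex":
        return [("Load", e[1])]
    _, f, e1, e2 = e
    return comp(e2) + comp(e1) + [("Apply", f)]
```

One difference from the HOL model is worth noting: Python is not total, so
executing \texttt{Apply} with fewer than two elements on the stack raises an
\texttt{IndexError} instead of returning an unspecified value. On well-formed
compiled code the sketch can be used to spot-check both the generalized
correctness statement and the lemma about concatenated instruction sequences.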


\section{Advanced datatypes}
\index{*datatype|(}
\index{*primrec|(}

This section presents advanced forms of \texttt{datatype}s and (in the near
future!) records.

\subsection{Mutual recursion}

Sometimes it is necessary to define two datatypes that depend on each
other. This is called \textbf{mutual recursion}. As an example consider a
language of arithmetic and boolean expressions where
\begin{itemize}
\item arithmetic expressions contain boolean expressions because there are
  conditional arithmetic expressions like ``if $m<n$ then $n-m$ else $m-n$'',
  and
\item boolean expressions contain arithmetic expressions because of
  comparisons like ``$m<n$''.
\end{itemize}
In Isabelle this becomes
\begin{ttbox}
\input{Datatype/abdata}\end{ttbox}\indexbold{*and}
Type \texttt{aexp} is similar to \texttt{expr} in \S\ref{sec:ExprCompiler},
except that we have fixed the values to be of type \texttt{nat} and that we
have fixed the two binary operations \texttt{Sum} and \texttt{Diff}. Boolean
expressions can be arithmetic comparisons, conjunctions and negations.
The semantics is fixed via two evaluation functions
\begin{ttbox}
\input{Datatype/abconstseval}\end{ttbox}
that take an environment (a mapping from variables \texttt{'a} to values
\texttt{nat}) and an expression and return its arithmetic/boolean
value. Since the datatypes are mutually recursive, so are functions that
operate on them. Hence they need to be defined in a single \texttt{primrec}
section:
\begin{ttbox}
\input{Datatype/abevala}
\input{Datatype/abevalb}\end{ttbox}

%In general, given $n$ mutually recursive datatypes, whenever you define a
%\texttt{primrec} function on one of them, Isabelle expects you to define $n$
%(possibly mutually recursive) functions simultaneously.

In the same fashion we also define two functions that perform substitution:
\begin{ttbox}
\input{Datatype/abconstssubst}\end{ttbox}
The first argument is a function mapping variables to expressions, the
substitution. It is applied to all variables in the second argument. As a
result, the type of variables in the expression may change from \texttt{'a}
to \texttt{'b}. Note that there are only arithmetic and no boolean variables.
\begin{ttbox}
\input{Datatype/absubsta}
\input{Datatype/absubstb}\end{ttbox}

Now we can prove a fundamental theorem about the interaction between
evaluation and substitution: applying a substitution $s$ to an expression $a$
and evaluating the result in an environment $env$ yields the same result as
evaluating $a$ in the environment that maps every variable $x$ to the value
of $s(x)$ under $env$. If you try to prove this separately for arithmetic or
boolean expressions (by induction), you find that you always need the other
theorem in the induction step. Therefore you need to state and prove both
theorems simultaneously:
\begin{quote}\small
\verbatiminput{Datatype/abgoalind.ML}
\end{quote}\indexbold{*mutual_induct_tac}
The resulting 8 goals (one for each constructor) are proved in one fell swoop
\texttt{by Auto_tac;}.
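
The interplay between evaluation and substitution can also be tried out on
concrete data. The Python sketch below models the two mutually recursive
datatypes as tagged tuples. The constructor names \texttt{Var} and
\texttt{Num}, the argument order, and the tuple encoding are assumptions made
for this illustration; the names \texttt{IF}, \texttt{Sum}, \texttt{Diff},
\texttt{Less}, \texttt{And} and \texttt{Neg} are those used in the text.

```python
# aexp: ("IF", b, a1, a2), ("Sum", a1, a2), ("Diff", a1, a2), ("Var", x), ("Num", n)
# bexp: ("Less", a1, a2), ("And", b1, b2), ("Neg", b)

def evala(env, a):
    """Evaluate an arithmetic expression; calls evalb for IF conditions."""
    tag = a[0]
    if tag == "IF":
        return evala(env, a[2]) if evalb(env, a[1]) else evala(env, a[3])
    if tag == "Sum":
        return evala(env, a[1]) + evala(env, a[2])
    if tag == "Diff":
        # Subtraction on natural numbers is truncated at 0.
        return max(0, evala(env, a[1]) - evala(env, a[2]))
    if tag == "Var":
        return env(a[1])
    return a[1]  # Num

def evalb(env, b):
    """Evaluate a boolean expression; calls evala for comparisons."""
    tag = b[0]
    if tag == "Less":
        return evala(env, b[1]) < evala(env, b[2])
    if tag == "And":
        return evalb(env, b[1]) and evalb(env, b[2])
    return not evalb(env, b[1])  # Neg

def substa(s, a):
    """Apply substitution s (variables to aexp) to an arithmetic expression."""
    tag = a[0]
    if tag == "IF":
        return ("IF", substb(s, a[1]), substa(s, a[2]), substa(s, a[3]))
    if tag in ("Sum", "Diff"):
        return (tag, substa(s, a[1]), substa(s, a[2]))
    if tag == "Var":
        return s(a[1])
    return a  # Num

def substb(s, b):
    """Apply substitution s to a boolean expression."""
    tag = b[0]
    if tag == "Less":
        return ("Less", substa(s, b[1]), substa(s, b[2]))
    if tag == "And":
        return ("And", substb(s, b[1]), substb(s, b[2]))
    return ("Neg", substb(s, b[1]))
```

For a substitution \texttt{s} and environment \texttt{env}, the theorem
states that \texttt{evala(env, substa(s, a))} equals
\texttt{evala(lambda x: evala(env, s(x)), a)}, and similarly for
\texttt{evalb}. Note how each function has to call its counterpart, just as
each half of the theorem needs the other half in the induction step.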

In general, given $n$ mutually recursive datatypes $\tau@1$, \dots, $\tau@n$,
Isabelle expects an inductive proof to start with a goal of the form
\[ P@1(x@1)\ \texttt{\&}\ \dots\ \texttt{\&}\ P@n(x@n) \]
where each variable $x@i$ is of type $\tau@i$. Induction is started by
\begin{ttbox}
by(mutual_induct_tac ["\(x@1\)",\(\dots\),"\(x@n\)"] \(k\));
\end{ttbox}

\begin{exercise}
  Define a function \texttt{norma} of type \texttt{'a aexp => 'a aexp} that
  replaces \texttt{IF}s with complex boolean conditions by nested
  \texttt{IF}s where each condition is a \texttt{Less} --- \texttt{And} and
  \texttt{Neg} should be eliminated completely. Prove that \texttt{norma}
  preserves the value of an expression and that the result of \texttt{norma}
  is really normal, i.e.\ no more \texttt{And}s and \texttt{Neg}s occur in
  it.  ({\em Hint:} proceed as in \S\ref{sec:boolex}).
\end{exercise}

\subsection{Nested recursion}

So far, all datatypes had the property that on the right-hand side of their
definition they occurred only at the top-level, i.e.\ directly below a
constructor. This is not the case any longer for the following model of terms
where function symbols can be applied to a list of arguments:
\begin{ttbox}
\input{Datatype/tdata}\end{ttbox}
Parameter \texttt{'a} is the type of variables and \texttt{'b} the type of
function symbols.
A mathematical term like $f(x,g(y))$ becomes \texttt{App f [Var x, App g
  [Var y]]}, where \texttt{f}, \texttt{g}, \texttt{x}, \texttt{y} are
suitable values, e.g.\ numbers or strings.

What complicates the definition of \texttt{term} is the nested occurrence of
\texttt{term} inside \texttt{list} on the right-hand side. In principle,
nested recursion can be eliminated in favour of mutual recursion by unfolding
the offending datatypes, here \texttt{list}. The result for \texttt{term}
would be something like
\begin{ttbox}
\input{Datatype/tunfoldeddata}\end{ttbox}
Although we do not recommend this unfolding to the user, this is how Isabelle
deals with nested recursion internally. Strictly speaking, this information
is irrelevant at the user level (and might change in the future), but it
motivates why \texttt{primrec} and induction work for nested types the way
they do. We now return to the initial definition of \texttt{term} using
nested recursion.

Let us define a substitution function on terms. Because terms involve term
lists, we need to define two substitution functions simultaneously:
\begin{ttbox}
\input{Datatype/tconstssubst}
\input{Datatype/tsubst}
\input{Datatype/tsubsts}\end{ttbox}
Similarly, when proving a statement about terms inductively, we need
to prove a related statement about term lists simultaneously. For example,
the fact that the identity substitution does not change a term needs to be
strengthened and proved as follows:
\begin{quote}\small
\verbatiminput{Datatype/tidproof.ML}
\end{quote}
Note that \texttt{Var} is the identity substitution because by definition it
leaves variables unchanged: \texttt{subst Var (Var $x$) = Var $x$}. Note also
that the type annotations are necessary because otherwise there is nothing in
the goal to enforce that both halves of the goal talk about the same type
parameters \texttt{('a,'b)}. As a result, induction would fail
because the two halves of the goal would be unrelated.

\begin{exercise}
The fact that substitution distributes over composition can be expressed
roughly as follows:
\begin{ttbox}
subst (f o g) t = subst f (subst g t)
\end{ttbox}
Correct this statement (you will find that it does not type-check),
strengthen it and prove it. (Note: \texttt{o} is function composition; its
definition is found in the theorem \texttt{o_def}).
\end{exercise}

If you feel that the \texttt{App}-equation in the definition of substitution
is overly complicated, you are right: the simpler
\begin{ttbox}
\input{Datatype/appmap}\end{ttbox}
would have done the job. Unfortunately, Isabelle insists on the more verbose
equation given above. Nevertheless, we can easily {\em prove} that the
\texttt{map}-equation holds: simply by induction on \texttt{ts} followed by
\texttt{Auto_tac}.
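
As an informal illustration of the two simultaneous substitution functions,
here is a Python sketch with terms encoded as tagged tuples. The encoding and
the helper name \texttt{var} are assumptions made for this example; the shape
of the two functions follows the description above, with \texttt{substs}
recursing over the argument list rather than using a \texttt{map}.

```python
# Terms: ("Var", x) or ("App", f, [t1, ..., tn]).

def subst(s, t):
    """Apply substitution s (a function from variables to terms) to term t."""
    if t[0] == "Var":
        return s(t[1])
    return ("App", t[1], substs(s, t[2]))

def substs(s, ts):
    """Substitution on term lists, defined simultaneously with subst."""
    if not ts:
        return []
    return [subst(s, ts[0])] + substs(s, ts[1:])

def var(x):
    """The identity substitution: maps every variable to itself."""
    return ("Var", x)
```

With \texttt{t} standing for the encoding of $f(x,g(y))$, the call
\texttt{subst(var, t)} returns \texttt{t} unchanged, and for any substitution
\texttt{s} the result of \texttt{substs(s, ts)} coincides with mapping
\texttt{subst(s, .)} over \texttt{ts}, which is the content of the
\texttt{map}-equation.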
 | 
| 
 | 
  1448  | 
  | 
| 
 | 
  1449  | 
%FIXME: forward pointer to section where better induction principles are
  | 
| 
 | 
  1450  | 
%derived?
  | 
| 
 | 
  1451  | 
  | 
| 
 | 
  1452  | 
\begin{exercise}
 | 
| 
 | 
  1453  | 
  Define a function \texttt{rev_term} of type \texttt{term -> term} that
 | 
| 
 | 
  1454  | 
  recursively reverses the order of arguments of all function symbols in a
  | 
| 
 | 
  1455  | 
  term. Prove that \texttt{rev_term(rev_term t) = t}.
 | 
| 
 | 
  1456  | 
\end{exercise}
 | 
| 
 | 
  1457  | 
  | 
| 
 | 
  1458  | 
Of course, you may also combine mutual and nested recursion as in the
  | 
| 
 | 
  1459  | 
following example
  | 
| 
 | 
  1460  | 
\begin{ttbox}
 | 
| 
 | 
  1461  | 
\input{Datatype/mutnested}\end{ttbox}
 | 
| 
 | 
  1462  | 
taken from an operational semantics of applied lambda terms. Note that double
  | 
| 
 | 
  1463  | 
quotes are required around the type involving \texttt{*}, as explained on
 | 
| 
 | 
  1464  | 
page~\pageref{startype}.
 | 
| 
 | 
  1465  | 
  | 
| 
 | 
  1466  | 
\subsection{The limits of nested recursion}
 | 
| 
 | 
  1467  | 
  | 
How far can we push nested recursion? By the unfolding argument above, we can
reduce nested to mutual recursion provided the nested recursion only involves
previously defined datatypes. Isabelle is a bit more generous and also permits
products as in the \texttt{data} example above.
However, functions are most emphatically not allowed:
\begin{ttbox}
datatype t = C (t => bool)
\end{ttbox}
is a real can of worms: in HOL it must be ruled out because it requires a
type \texttt{t} such that \texttt{t} and its power set \texttt{t => bool}
have the same cardinality, which is impossible.
In theory, we could allow limited forms of recursion involving function
spaces. For example
\begin{ttbox}
datatype t = C (nat => t) | D
\end{ttbox}
is unproblematic. However, Isabelle does not support recursion involving
\texttt{=>} at all at the moment.
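The cardinality argument that rules out \texttt{C (t => bool)} is Cantor's
theorem. As a brief sketch in ordinary mathematical notation (not Isabelle
syntax), suppose there were a surjection $g : t \to (t \to \mathit{bool})$ and
define the diagonal predicate
\[ d(x) \;=\; \neg\, g(x)(x). \]
Surjectivity yields some $a$ with $g(a) = d$, and hence
\[ d(a) \;=\; \neg\, g(a)(a) \;=\; \neg\, d(a), \]
a contradiction. An injective constructor \texttt{C} from \texttt{t => bool}
into \texttt{t} would have a left inverse, which is exactly such a surjection.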
  | 
For a theoretical analysis of what kinds of datatypes are feasible in HOL
see, for example,~\cite{Gunter-HOL92}. There are alternatives to pure HOL:
LCF~\cite{paulson87} is a logic where types like
\begin{ttbox}
datatype t = C (t -> t)
\end{ttbox}
do indeed make sense (note the intentionally different arrow \texttt{->}!).
There is even a version of LCF on top of HOL, called
HOLCF~\cite{MuellerNvOS99}.
  | 
\index{*primrec|)}
\index{*datatype|)}
  | 
\subsection{Case study: Tries}
  | 
Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
``cat''.  When searching for a string in a trie, the letters of the string are
examined sequentially. Each letter determines which subtrie to search next.
In this case study we model tries as a datatype, define a lookup and an
update function, and prove that they behave as expected.
  | 
\begin{figure}[htbp]
\begin{center}
\unitlength1mm
\begin{picture}(60,30)
\put( 5, 0){\makebox(0,0)[b]{l}}
\put(25, 0){\makebox(0,0)[b]{e}}
\put(35, 0){\makebox(0,0)[b]{n}}
\put(45, 0){\makebox(0,0)[b]{r}}
\put(55, 0){\makebox(0,0)[b]{t}}
%
\put( 5, 9){\line(0,-1){5}}
\put(25, 9){\line(0,-1){5}}
\put(44, 9){\line(-3,-2){9}}
\put(45, 9){\line(0,-1){5}}
\put(46, 9){\line(3,-2){9}}
%
\put( 5,10){\makebox(0,0)[b]{l}}
\put(15,10){\makebox(0,0)[b]{n}}
\put(25,10){\makebox(0,0)[b]{p}}
\put(45,10){\makebox(0,0)[b]{a}}
%
\put(14,19){\line(-3,-2){9}}
\put(15,19){\line(0,-1){5}}
\put(16,19){\line(3,-2){9}}
\put(45,19){\line(0,-1){5}}
%
\put(15,20){\makebox(0,0)[b]{a}}
\put(45,20){\makebox(0,0)[b]{c}}
%
\put(30,30){\line(-3,-2){13}}
\put(30,30){\line(3,-2){13}}
\end{picture}
\end{center}
\caption{A sample trie}
\label{fig:trie}
\end{figure}
  | 
Proper tries associate some value with each string. Since the
information is stored only in the final node associated with the string, many
nodes do not carry any value. This distinction is captured by the
following predefined datatype:
\begin{ttbox}
datatype 'a option = None | Some 'a
\end{ttbox}\indexbold{*option}\indexbold{*None}\indexbold{*Some}
  | 
To minimize running time, each node of a trie should contain an array that maps
letters to subtries. We have chosen a more space-efficient representation
where the subtries are held in an association list, i.e.\ a list of
(letter,trie) pairs.  Abstracting over the alphabet \texttt{'a} and the
values \texttt{'v} we define a trie as follows:
\begin{ttbox}
\input{Datatype/trie}\end{ttbox}
The first component is the optional value, the second component the
association list of subtries. We define two corresponding selector functions:
\begin{ttbox}
\input{Datatype/triesels}\end{ttbox}
Association lists come with a generic lookup function:
\begin{ttbox}
\input{Datatype/assoc}\end{ttbox}
  | 
Now we can define the lookup function for tries. It descends into the trie
examining the letters of the search string one by one. As
recursion on lists is simpler than on tries, let us express this as primitive
recursion on the search string argument:
\begin{ttbox}
\input{Datatype/lookup}\end{ttbox}
As a first simple property we prove that looking up a string in the empty
trie \texttt{Trie~None~[]} always returns \texttt{None}. The proof merely
distinguishes the two cases of whether the search string is empty or not:
\begin{ttbox}
\input{Datatype/lookupempty.ML}\end{ttbox}
This lemma is added to the simpset as usual.
  | 
Things begin to get interesting with the definition of an update function
that adds a new (string,value) pair to a trie, overwriting the old value
associated with that string:
\begin{ttbox}
\input{Datatype/update}\end{ttbox}
The base case is obvious. In the recursive case the subtrie
\texttt{tt} associated with the first letter \texttt{a} is extracted,
recursively updated, and then placed in front of the association list.
The old subtrie associated with \texttt{a} is still in the association list
but no longer accessible via \texttt{assoc}. Clearly, there is room here for
optimizations!
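To make the shadowing concrete, here is the shape of one update step, written
as an equation rather than as Isabelle input. It is a sketch derived from the
description above and assumes that \texttt{update} leaves the value stored at
the current node unchanged; \texttt{ov}, \texttt{t1} and \texttt{al} are
placeholder names, not identifiers from the theory.
\begin{ttbox}
update (Trie ov ((a,t1)#al)) (a#as) v
  = Trie ov ((a, update t1 as v) # (a,t1) # al)
\end{ttbox}
A later \texttt{assoc} for \texttt{a} stops at the first pair, so the stale
entry \texttt{(a,t1)} is never consulted again but still occupies space.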
  | 
Our main goal is to prove the correct interaction of \texttt{update} and
\texttt{lookup}:
\begin{quote}\small
\verbatiminput{Datatype/triemain.ML}
\end{quote}
Our plan will be to induct on \texttt{as}; hence the remaining variables are
quantified. From the definitions it is clear that induction on either
\texttt{as} or \texttt{bs} is required. The choice of \texttt{as} is merely
guided by the intuition that simplification of \texttt{lookup} might be easier
if \texttt{update} has already been simplified, which can only happen if
\texttt{as} is instantiated.
The start of the proof is completely conventional:
\begin{ttbox}
\input{Datatype/trieinduct.ML}\end{ttbox}
Unfortunately, this time we are left with three intimidating-looking subgoals:
\begin{ttbox}
{\out 1. ... ==> ... lookup (...) bs = lookup t bs}
{\out 2. ... ==> ... lookup (...) bs = lookup t bs}
{\out 3. ... ==> ... lookup (...) bs = lookup t bs}
\end{ttbox}
Clearly, if we want to make headway we have to instantiate \texttt{bs} as
well now. It turns out that instead of induction, case distinction
suffices:
\begin{ttbox}
\input{Datatype/trieexhaust.ML}\end{ttbox}
The {\em tactical} \texttt{ALLGOALS} merely applies the tactic following it
to all subgoals, which results in a total of six subgoals; \texttt{Auto_tac}
solves them all.
  | 
This proof may look surprisingly straightforward. The reason is that we
have told the simplifier (without telling the reader) to expand all
\texttt{let}s and to split all \texttt{case}-constructs over options before
we started on the main goal:
\begin{ttbox}
\input{Datatype/trieAdds.ML}\end{ttbox}
  | 
\begin{exercise}
  Write an improved version of \texttt{update} that does not suffer from the
  space leak in the version above. Prove the main theorem for your improved
  \texttt{update}.
\end{exercise}
  | 
\begin{exercise}
  Modify \texttt{update} so that it can also {\em delete} entries from a
  trie. It is up to you whether to shrink tries where possible. Prove (a
  modified version of) the main theorem above.
\end{exercise}
  | 
\section{Total recursive functions}
\label{sec:recdef}
\index{*recdef|(}
  | 
  | 
Although many total functions have a natural primitive recursive definition,
this is not always the case. Arbitrary total recursive functions can be
defined by means of \texttt{recdef}: you can use full pattern-matching,
recursion need not involve datatypes, and termination is proved by showing
that each recursive call makes the argument smaller in a suitable
(user-supplied) sense.
  | 
\subsection{Defining recursive functions}
  | 
Here is a simple example, the Fibonacci function:
\begin{ttbox}
\input{Recdef/fib}\end{ttbox}
The definition of \texttt{fib} is accompanied by a \bfindex{measure function}
\texttt{\%n.$\;$n} that maps the argument of \texttt{fib} to a natural
number. The requirement is that in each equation the measure of the argument
on the left-hand side is strictly greater than the measure of the argument of
each recursive call. In the case of \texttt{fib} this is obviously true
because the measure function is the identity and \texttt{Suc(Suc~x)} is
strictly greater than both \texttt{x} and \texttt{Suc~x}.
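In symbols, a measure function $f$ induces the relation against which
termination is checked. The following is a sketch in ordinary mathematical
notation; the internal definition in the library may be phrased differently.
\[ (x, y) \in \mathit{measure}\;f \iff f\,x < f\,y \]
For \texttt{fib}, where $f$ is the identity, the equation with left-hand side
\texttt{fib(Suc(Suc~x))} gives rise to the two conditions
\[ x < \mathit{Suc}(\mathit{Suc}\;x) \qquad\mbox{and}\qquad
   \mathit{Suc}\;x < \mathit{Suc}(\mathit{Suc}\;x), \]
one for each recursive call.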
  | 
Slightly more interesting is the insertion of a fixed element
between any two elements of a list:
\begin{ttbox}
\input{Recdef/sep1}\end{ttbox}
This time the measure is the length of the list, which decreases with the
recursive call; the first component of the argument tuple is irrelevant.
  | 
Pattern matching need not be exhaustive:
\begin{ttbox}
\input{Recdef/last}\end{ttbox}
  | 
Overlapping patterns are disambiguated by taking the order of equations into
account, just as in functional programming:
\begin{ttbox}
\input{Recdef/sep1}\end{ttbox}
This defines exactly the same function \texttt{sep} as further above.
  | 
\begin{warn}
  Currently \texttt{recdef} only takes the first argument of a (curried)
  recursive function into account. This means both the termination measure
  and pattern matching can only use that first argument. In general, you will
  therefore have to combine several arguments into a tuple. In case only one
  argument is relevant for termination, you can also rearrange the order of
  arguments as in
\begin{ttbox}
\input{Recdef/sep2}\end{ttbox}
\end{warn}
  | 
When loading a theory containing a \texttt{recdef} of a function $f$,
Isabelle proves the recursion equations and stores the result as a list of
theorems $f$.\texttt{rules}. It can be viewed by typing
\begin{ttbox}
prths \(f\).rules;
\end{ttbox}
All of the above examples are simple enough that Isabelle can determine
automatically that the measure of the argument goes down in each recursive
call. In that case $f$.\texttt{rules} contains precisely the defining
equations.
  | 
In general, Isabelle may not be able to prove all termination conditions
automatically. For example, termination of
\begin{ttbox}
\input{Recdef/constsgcd}recdef gcd "measure ((\%(m,n).n))"
  "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
\end{ttbox}
relies on the lemma \texttt{mod_less_divisor}
\begin{ttbox}
0 < n ==> m mod n < n
\end{ttbox}
that is not part of the default simpset. As a result, Isabelle prints a
warning and \texttt{gcd.rules} contains a precondition:
\begin{ttbox}
(! m n. 0 < n --> m mod n < n) ==> gcd (m, n) = (if n=0 \dots)
\end{ttbox}
We need to instruct \texttt{recdef} to use an extended simpset to prove the
termination condition:
\begin{ttbox}
\input{Recdef/gcd}\end{ttbox}
This time everything works fine and \texttt{gcd.rules} contains precisely the
stated recursion equation for \texttt{gcd}.
  | 
When defining some nontrivial total recursive function, the first attempt
will usually generate a number of termination conditions, some of which may
require new lemmas to be proved in some of the parent theories. Those lemmas
can then be added to the simpset used by \texttt{recdef} for its
proofs, as shown for \texttt{gcd}.
  | 
Although all the above examples employ measure functions, \texttt{recdef}
allows arbitrary wellfounded relations. For example, termination of
Ackermann's function requires the lexicographic product \texttt{**}:
\begin{ttbox}
\input{Recdef/ack}\end{ttbox}
For details see the manual~\cite{isabelle-HOL} and the examples in the
library.
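As a sketch of why a lexicographic ordering is needed, consider the usual
textbook equations for Ackermann's function. They are assumed here for
illustration; the definition in \texttt{Recdef/ack} may be written
differently.
\[ \begin{array}{lcl}
  \mathit{ack}(0,n) &=& n+1 \\
  \mathit{ack}(m+1,0) &=& \mathit{ack}(m,1) \\
  \mathit{ack}(m+1,n+1) &=& \mathit{ack}(m,\mathit{ack}(m+1,n))
\end{array} \]
Pairs are compared lexicographically:
\[ (m',n') < (m,n) \iff m' < m \;\lor\; (m' = m \land n' < n). \]
The two calls with first argument $m$ decrease the first component, whereas
the inner call $\mathit{ack}(m+1,n)$ keeps the first component and decreases
the second. A measure on either component alone does not cover all three
calls.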
  | 
  | 
\subsection{Deriving simplification rules}
  | 
Once we have succeeded in proving all termination conditions, we can start to
derive some theorems. In contrast to \texttt{primrec} definitions, which are
automatically added to the simpset, \texttt{recdef} rules must be included
explicitly, for example as in
\begin{ttbox}
Addsimps fib.rules;
\end{ttbox}
However, some care is necessary now, in contrast to \texttt{primrec}.
Although \texttt{gcd} is a total function, its defining equation leads to
nontermination of the simplifier, because the subterm \texttt{gcd(n, m mod
  n)} on the right-hand side can again be simplified by the same equation,
and so on. The reason: the simplifier rewrites the \texttt{then} and
\texttt{else} branches of a conditional if the condition simplifies to
neither \texttt{True} nor \texttt{False}.  Therefore it is recommended to
derive an alternative formulation that replaces case distinctions on the
right-hand side by conditional equations. For \texttt{gcd} it means we have
to prove
\begin{ttbox}
           gcd (m, 0) = m
n ~= 0 ==> gcd (m, n) = gcd(n, m mod n)
\end{ttbox}
To avoid nontermination during those proofs, we have to resort to some
low-level tactics:
\begin{ttbox}
Goal "gcd(m,0) = m";
by(resolve_tac [trans] 1);
by(resolve_tac gcd.rules 1);
by(Simp_tac 1);
\end{ttbox}
At this point it is not necessary to understand what exactly
\texttt{resolve_tac} is doing. The main point is that the above proof works
not just for this one example but in general (except that we have to use
\texttt{Asm_simp_tac} and $f$\texttt{.rules} in general). Try the second
\texttt{gcd}-equation above!
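To compare with your own attempt, the same three steps might look as follows
for the second equation. This is a sketch that simply follows the recipe just
given; it has not been checked against a particular Isabelle version.
\begin{ttbox}
Goal "n ~= 0 ==> gcd(m,n) = gcd(n, m mod n)";
by(resolve_tac [trans] 1);
by(resolve_tac gcd.rules 1);
by(Asm_simp_tac 1);
\end{ttbox}
Here \texttt{Asm_simp_tac} replaces \texttt{Simp_tac} because the premise of
the goal is needed to reduce the conditional to its \texttt{else} branch.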
  | 
\subsection{Induction}
  | 
Assuming we have added the recursion equations (or some suitable derived
equations) to the simpset, we might like to prove something about our
function. Since the function is recursive, the natural proof principle is
again induction. But this time the structural form of induction that comes
with datatypes is unlikely to work well; otherwise we could have defined the
function by \texttt{primrec}. Therefore \texttt{recdef} automatically proves
a suitable induction rule $f$\texttt{.induct} that follows the recursion
pattern of the particular function $f$. Roughly speaking, it requires you to
prove for each \texttt{recdef} equation that the property you are trying to
establish holds for the left-hand side provided it holds for all recursive
calls on the right-hand side. Applying $f$\texttt{.induct} requires its
explicit instantiation. See \S\ref{sec:explicit-inst} for details.
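For instance, for \texttt{fib} the rule has roughly the following content,
shown here in mathematical notation rather than as Isabelle output. The two
base cases assume that \texttt{fib} is defined by separate equations for $0$
and $\mathit{Suc}\;0$; the step case corresponds to the equation for
\texttt{Suc(Suc~x)} discussed earlier.
\[ \frac{P\,0 \qquad P\,(\mathit{Suc}\;0) \qquad
         \forall x.\; P\,x \land P\,(\mathit{Suc}\;x) \longrightarrow
                      P\,(\mathit{Suc}(\mathit{Suc}\;x))}
        {\forall n.\; P\,n} \]
The hypotheses $P\,x$ and $P\,(\mathit{Suc}\;x)$ correspond to the two
recursive calls.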
  | 
\index{*recdef|)}