| 5375 |      1 | \chapter{Functional Programming in HOL}
 | 
|  |      2 | 
 | 
|  |      3 | Although on the surface this chapter is mainly concerned with how to write
 | 
|  |      4 | functional programs in HOL and how to verify them, most of the
 | 
|  |      5 | constructs and proof procedures introduced are general purpose and recur in
 | 
|  |      6 | any specification or verification task.
 | 
|  |      7 | 
 | 
|  |      8 | The dedicated functional programmer should be warned: HOL offers only what
 | 
|  |      9 | could be called {\em total functional programming} --- all functions in HOL
 | 
|  |     10 | must be total; lazy data structures are not directly available. On the
 | 
|  |     11 | positive side, functions in HOL need not be computable: HOL is a
 | 
|  |     12 | specification language that goes well beyond what can be expressed as a
 | 
|  |     13 | program. However, for the time being we concentrate on the computable.
 | 
|  |     14 | 
 | 
|  |     15 | \section{An introductory theory}
 | 
|  |     16 | \label{sec:intro-theory}
 | 
|  |     17 | 
 | 
|  |     18 | Functional programming needs datatypes and functions. Both of them can be
 | 
|  |     19 | defined in a theory with a syntax reminiscent of languages like ML or
 | 
|  |     20 | Haskell. As an example consider the theory in Fig.~\ref{fig:ToyList}.
 | 
|  |     21 | 
 | 
|  |     22 | \begin{figure}[htbp]
 | 
|  |     23 | \begin{ttbox}\makeatother
 | 
|  |     24 | \input{ToyList/ToyList.thy}\end{ttbox}
 | 
|  |     25 | \caption{A theory of lists}
 | 
|  |     26 | \label{fig:ToyList}
 | 
|  |     27 | \end{figure}
 | 
|  |     28 | 
 | 
|  |     29 | HOL already has a predefined theory of lists called \texttt{List} ---
 | 
|  |     30 | \texttt{ToyList} is merely a small fragment of it chosen as an example. In
 | 
|  |     31 | contrast to what is recommended in \S\ref{sec:Basic:Theories},
 | 
|  |     32 | \texttt{ToyList} is not based on \texttt{Main} but on \texttt{Datatype}, a
 | 
|  |     33 | theory that contains everything required for datatype definitions but does
 | 
|  |     34 | not have \texttt{List} as a parent, thus avoiding ambiguities caused by
 | 
|  |     35 | defining lists twice.
 | 
|  |     36 | 
 | 
|  |     37 | The \ttindexbold{datatype} \texttt{list} introduces two constructors
 | 
|  |     38 | \texttt{Nil} and \texttt{Cons}, the empty list and the operator that adds an
 | 
|  |     39 | element to the front of a list. For example, the term \texttt{Cons True (Cons
 | 
|  |     40 |   False Nil)} is a value of type \texttt{bool~list}, namely the list with the
 | 
|  |     41 | elements \texttt{True} and \texttt{False}. Because this notation becomes
 | 
|  |     42 | unwieldy very quickly, the datatype declaration is annotated with an
 | 
|  |     43 | alternative syntax: instead of \texttt{Nil} and \texttt{Cons}~$x$~$xs$ we can
 | 
|  |     44 | write \index{#@{\tt[]}|bold}\texttt{[]} and
 | 
|  |     45 | \texttt{$x$~\#~$xs$}\index{#@{\tt\#}|bold}. In fact, this alternative syntax
 | 
|  |     46 | is the standard syntax. Thus the list \texttt{Cons True (Cons False Nil)}
 | 
|  |     47 | becomes \texttt{True \# False \# []}. The annotation \ttindexbold{infixr}
 | 
|  |     48 | means that \texttt{\#} associates to the right, i.e.\ the term \texttt{$x$ \#
 | 
|  |     49 |   $y$ \# $z$} is read as \texttt{$x$ \# ($y$ \# $z$)} and not as \texttt{($x$
 | 
|  |     50 |   \# $y$) \# $z$}.
 | 
|  |     51 | 
 | 
|  |     52 | \begin{warn}
 | 
|  |     53 |   Syntax annotations are a powerful but completely optional feature. You
 | 
|  |     54 |   could drop them from theory \texttt{ToyList} and go back to the identifiers
 | 
|  |     55 |   \texttt{Nil} and \texttt{Cons}. However, lists are such a central datatype
 | 
|  |     56 |   that their syntax is highly customized. We recommend that novices should
 | 
|  |     57 |   not use syntax annotations in their own theories.
 | 
|  |     58 | \end{warn}
 | 
|  |     59 | 
 | 
|  |     60 | Next, the functions \texttt{app} and \texttt{rev} are declared. In contrast
 | 
|  |     61 | to ML, Isabelle insists on explicit declarations of all functions (keyword
 | 
|  |     62 | \ttindexbold{consts}).  (Apart from the declaration-before-use restriction,
 | 
|  |     63 | the order of items in a theory file is unconstrained.) Function \texttt{app}
 | 
|  |     64 | is annotated with concrete syntax too. Instead of the prefix syntax
 | 
|  |     65 | \texttt{app}~$xs$~$ys$ the infix $xs$~\texttt{\at}~$ys$ becomes the preferred
 | 
|  |     66 | form.
 | 
|  |     67 | 
 | 
|  |     68 | Both functions are defined recursively. The equations for \texttt{app} and
 | 
|  |     69 | \texttt{rev} hardly need comments: \texttt{app} appends two lists and
 | 
| 5850 |     70 | \texttt{rev} reverses a list.  The keyword \ttindex{primrec} indicates that
 | 
| 5375 |     71 | the recursion is of a particularly primitive kind where each recursive call
 | 
|  |     72 | peels off a datatype constructor from one of the arguments (see
 | 
|  |     73 | \S\ref{sec:datatype}).  Thus the recursion always terminates, i.e.\ the
 | 
|  |     74 | function is \bfindex{total}.
 | 
|  |     75 | 
 | 
|  |     76 | The termination requirement is absolutely essential in HOL, a logic of total
 | 
|  |     77 | functions. If we were to drop it, inconsistencies could quickly arise: the
 | 
|  |     78 | ``definition'' $f(n) = f(n)+1$ immediately leads to $0 = 1$ by subtracting
 | 
|  |     79 | $f(n)$ on both sides.
 | 
|  |     80 | % However, this is a subtle issue that we cannot discuss here further.
 | 
|  |     81 | 
 | 
|  |     82 | \begin{warn}
 | 
|  |     83 |   As we have indicated, the desire for total functions is not a gratuitously
 | 
|  |     84 |   imposed restriction but an essential characteristic of HOL. It is only
 | 
|  |     85 |   because of totality that reasoning in HOL is comparatively easy.  More
 | 
|  |     86 |   generally, the philosophy in HOL is not to allow arbitrary axioms (such as
 | 
|  |     87 |   function definitions whose totality has not been proved) because they
 | 
|  |     88 |   quickly lead to inconsistencies. Instead, fixed constructs for introducing
 | 
|  |     89 |   types and functions are offered (such as \texttt{datatype} and
 | 
|  |     90 |   \texttt{primrec}) which are guaranteed to preserve consistency.
 | 
|  |     91 | \end{warn}
 | 
|  |     92 | 
 | 
|  |     93 | A remark about syntax.  The textual definition of a theory follows a fixed
 | 
|  |     94 | syntax with keywords like \texttt{datatype} and \texttt{end} (see
 | 
|  |     95 | Fig.~\ref{fig:keywords} in Appendix~\ref{sec:Appendix} for a full list).
 | 
|  |     96 | Embedded in this syntax are the types and formulae of HOL, whose syntax is
 | 
|  |     97 | extensible, e.g.\ by new user-defined infix operators
 | 
|  |     98 | (see~\ref{sec:infix-syntax}). To distinguish the two levels, everything
 | 
|  |     99 | HOL-specific should be enclosed in \texttt{"}\dots\texttt{"}. The same holds
 | 
|  |    100 | for identifiers that happen to be keywords, as in
 | 
|  |    101 | \begin{ttbox}
 | 
|  |    102 | consts "end" :: 'a list => 'a
 | 
|  |    103 | \end{ttbox}
 | 
|  |    104 | To lessen this burden, quotation marks around types can be dropped,
 | 
|  |    105 | provided their syntax does not go beyond what is described in
 | 
|  |    106 | \S\ref{sec:TypesTermsForms}. Types containing further operators, e.g.\
 | 
| 5850 |    107 | \label{startype} \texttt{*} for Cartesian products, need quotation marks.
 | 
| 5375 |    108 | 
 | 
|  |    109 | When Isabelle prints a syntax error message, it refers to the HOL syntax as
 | 
|  |    110 | the \bfindex{inner syntax}.
 | 
|  |    111 | 
 | 
|  |    112 | \section{An introductory proof}
 | 
|  |    113 | \label{sec:intro-proof}
 | 
|  |    114 | 
 | 
|  |    115 | Having defined \texttt{ToyList}, we load it with the ML command
 | 
|  |    116 | \begin{ttbox}
 | 
|  |    117 | use_thy "ToyList";
 | 
|  |    118 | \end{ttbox}
 | 
|  |    119 | and are ready to prove a few simple theorems. This will illustrate not just
 | 
|  |    120 | the basic proof commands but also the typical proof process.
 | 
|  |    121 | 
 | 
|  |    122 | \subsubsection*{Main goal: \texttt{rev(rev xs) = xs}}
 | 
|  |    123 | 
 | 
|  |    124 | Our goal is to show that reversing a list twice produces the original
 | 
|  |    125 | list. Typing
 | 
|  |    126 | \begin{ttbox}
 | 
|  |    127 | \input{ToyList/thm}\end{ttbox}
 | 
|  |    128 | establishes a new goal to be proved in the context of the current theory,
 | 
|  |    129 | which is the one we just loaded. Isabelle's response is to print the current proof state:
 | 
|  |    130 | \begin{ttbox}
 | 
|  |    131 | {\out Level 0}
 | 
|  |    132 | {\out rev (rev xs) = xs}
 | 
|  |    133 | {\out  1. rev (rev xs) = xs}
 | 
|  |    134 | \end{ttbox}
 | 
|  |    135 | Until we have finished a proof, the proof state always looks like this:
 | 
|  |    136 | \begin{ttbox}
 | 
|  |    137 | {\out Level \(i\)}
 | 
|  |    138 | {\out \(G\)}
 | 
|  |    139 | {\out  1. \(G@1\)}
 | 
|  |    140 | {\out  \(\vdots\)}
 | 
|  |    141 | {\out  \(n\). \(G@n\)}
 | 
|  |    142 | \end{ttbox}
 | 
|  |    143 | where \texttt{Level}~$i$ indicates that we are $i$ steps into the proof, $G$
 | 
|  |    144 | is the overall goal that we are trying to prove, and the numbered lines
 | 
|  |    145 | contain the subgoals $G@1$, \dots, $G@n$ that we need to prove to establish
 | 
|  |    146 | $G$. At \texttt{Level 0} there is only one subgoal, which is identical with
 | 
|  |    147 | the overall goal.  Normally $G$ is constant and only serves as a reminder.
 | 
|  |    148 | Hence we rarely show it in this tutorial.
 | 
|  |    149 | 
 | 
|  |    150 | Let us now get back to \texttt{rev(rev xs) = xs}. Properties of recursively
 | 
|  |    151 | defined functions are best established by induction. In this case there is
 | 
|  |    152 | not much choice except to induct on \texttt{xs}:
 | 
|  |    153 | \begin{ttbox}
 | 
|  |    154 | \input{ToyList/inductxs}\end{ttbox}
 | 
|  |    155 | This tells Isabelle to perform induction on variable \texttt{xs} in subgoal
 | 
|  |    156 | 1. The new proof state contains two subgoals, namely the base case
 | 
|  |    157 | (\texttt{Nil}) and the induction step (\texttt{Cons}):
 | 
|  |    158 | \begin{ttbox}
 | 
|  |    159 | {\out 1. rev (rev []) = []}
 | 
|  |    160 | {\out 2. !!a list. rev (rev list) = list ==> rev (rev (a # list)) = a # list}
 | 
|  |    161 | \end{ttbox}
 | 
|  |    162 | The induction step is an example of the general format of a subgoal:
 | 
|  |    163 | \begin{ttbox}
 | 
|  |    164 | {\out  \(i\). !!\(x@1 \dots x@n\). {\it assumptions} ==> {\it conclusion}}
 | 
|  |    165 | \end{ttbox}\index{==>@{\tt==>}|bold}
 | 
|  |    166 | The prefix of bound variables \texttt{!!\(x@1 \dots x@n\)} can be ignored
 | 
|  |    167 | most of the time, or simply treated as a list of variables local to this
 | 
|  |    168 | subgoal. Their deeper significance is explained in \S\ref{sec:PCproofs}.  The
 | 
|  |    169 | {\it assumptions} are the local assumptions for this subgoal and {\it
 | 
|  |    170 |   conclusion} is the actual proposition to be proved. Typical proof steps
 | 
|  |    171 | that add new assumptions are induction or case distinction. In our example
 | 
|  |    172 | the only assumption is the induction hypothesis \texttt{rev (rev list) =
 | 
|  |    173 |   list}, where \texttt{list} is a variable name chosen by Isabelle. If there
 | 
|  |    174 | are multiple assumptions, they are enclosed in the bracket pair
 | 
|  |    175 | \texttt{[|}\index{==>@\ttlbr|bold} and \texttt{|]}\index{==>@\ttrbr|bold}
 | 
|  |    176 | and separated by semicolons.
 | 
|  |    177 | 
 | 
|  |    178 | Let us try to solve both goals automatically:
 | 
|  |    179 | \begin{ttbox}
 | 
|  |    180 | \input{ToyList/autotac}\end{ttbox}
 | 
|  |    181 | This command tells Isabelle to apply a proof strategy called
 | 
|  |    182 | \texttt{Auto_tac} to all subgoals. Essentially, \texttt{Auto_tac} tries to
 | 
|  |    183 | `simplify' the subgoals.  In our case, subgoal~1 is solved completely (thanks
 | 
|  |    184 | to the equation \texttt{rev [] = []}) and disappears; the simplified version
 | 
|  |    185 | of subgoal~2 becomes the new subgoal~1:
 | 
|  |    186 | \begin{ttbox}\makeatother
 | 
|  |    187 | {\out 1. !!a list. rev(rev list) = list ==> rev(rev list @ a # []) = a # list}
 | 
|  |    188 | \end{ttbox}
 | 
|  |    189 | In order to simplify this subgoal further, a lemma suggests itself.
 | 
|  |    190 | 
 | 
|  |    191 | \subsubsection*{First lemma: \texttt{rev(xs \at~ys) = (rev ys) \at~(rev xs)}}
 | 
|  |    192 | 
 | 
|  |    193 | We start the proof as usual:
 | 
|  |    194 | \begin{ttbox}\makeatother
 | 
|  |    195 | \input{ToyList/lemma1}\end{ttbox}
 | 
|  |    196 | There are two variables that we could induct on: \texttt{xs} and
 | 
|  |    197 | \texttt{ys}. Because \texttt{\at} is defined by recursion on
 | 
|  |    198 | the first argument, \texttt{xs} is the correct one:
 | 
|  |    199 | \begin{ttbox}
 | 
|  |    200 | \input{ToyList/inductxs}\end{ttbox}
 | 
|  |    201 | This time not even the base case is solved automatically:
 | 
|  |    202 | \begin{ttbox}\makeatother
 | 
|  |    203 | by(Auto_tac);
 | 
|  |    204 | {\out 1. rev ys = rev ys @ []}
 | 
|  |    205 | {\out 2. \dots}
 | 
|  |    206 | \end{ttbox}
 | 
|  |    207 | We need another lemma.
 | 
|  |    208 | 
 | 
|  |    209 | \subsubsection*{Second lemma: \texttt{xs \at~[] = xs}}
 | 
|  |    210 | 
 | 
|  |    211 | This time the canonical proof procedure
 | 
|  |    212 | \begin{ttbox}\makeatother
 | 
|  |    213 | \input{ToyList/lemma2}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
 | 
|  |    214 | leads to the desired message \texttt{No subgoals!}:
 | 
|  |    215 | \begin{ttbox}\makeatother
 | 
|  |    216 | {\out Level 2}
 | 
|  |    217 | {\out xs @ [] = xs}
 | 
|  |    218 | {\out No subgoals!}
 | 
|  |    219 | \end{ttbox}
 | 
|  |    220 | Now we can give the lemma just proved a suitable name
 | 
|  |    221 | \begin{ttbox}
 | 
|  |    222 | \input{ToyList/qed2}\end{ttbox}
 | 
|  |    223 | and tell Isabelle to use this lemma in all future proofs by simplification:
 | 
|  |    224 | \begin{ttbox}
 | 
|  |    225 | \input{ToyList/addsimps2}\end{ttbox}
 | 
|  |    226 | Note that in the theorem \texttt{app_Nil2} the free variable \texttt{xs} has
 | 
|  |    227 | been replaced by the unknown \texttt{?xs}, just as explained in
 | 
|  |    228 | \S\ref{sec:variables}.
 | 
|  |    229 | 
 | 
|  |    230 | Going back to the proof of the first lemma
 | 
|  |    231 | \begin{ttbox}\makeatother
 | 
|  |    232 | \input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
 | 
|  |    233 | we find that this time \texttt{Auto_tac} solves the base case, but the
 | 
|  |    234 | induction step merely simplifies to
 | 
|  |    235 | \begin{ttbox}\makeatother
 | 
|  |    236 | {\out 1. !!a list.}
 | 
|  |    237 | {\out       rev (list @ ys) = rev ys @ rev list}
 | 
|  |    238 | {\out       ==> (rev ys @ rev list) @ a # [] = rev ys @ rev list @ a # []}
 | 
|  |    239 | \end{ttbox}
 | 
|  |    240 | Now we need to remember that \texttt{\at} associates to the right, and that
 | 
|  |    241 | \texttt{\#} and \texttt{\at} have the same priority (namely the \texttt{65}
 | 
|  |    242 | in the definition of \texttt{ToyList}). Thus the conclusion really is
 | 
|  |    243 | \begin{ttbox}\makeatother
 | 
|  |    244 | {\out     ==> (rev ys @ rev list) @ (a # []) = rev ys @ (rev list @ (a # []))}
 | 
|  |    245 | \end{ttbox}
 | 
|  |    246 | and the missing lemma is associativity of \texttt{\at}.
 | 
|  |    247 | 
 | 
|  |    248 | \subsubsection*{Third lemma: \texttt{(xs \at~ys) \at~zs = xs \at~(ys \at~zs)}}
 | 
|  |    249 | 
 | 
|  |    250 | This time the canonical proof procedure
 | 
|  |    251 | \begin{ttbox}\makeatother
 | 
|  |    252 | \input{ToyList/lemma3}\end{ttbox}
 | 
|  |    253 | succeeds without further ado. Again we name the lemma and add it to
 | 
|  |    254 | the set of lemmas used during simplification:
 | 
|  |    255 | \begin{ttbox}
 | 
|  |    256 | \input{ToyList/qed3}\end{ttbox}
 | 
|  |    257 | Now we can go back and prove the first lemma
 | 
|  |    258 | \begin{ttbox}\makeatother
 | 
|  |    259 | \input{ToyList/lemma1}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
 | 
|  |    260 | add it to the simplification lemmas
 | 
|  |    261 | \begin{ttbox}
 | 
|  |    262 | \input{ToyList/qed1}\end{ttbox}
 | 
|  |    263 | and then solve our main theorem:
 | 
|  |    264 | \begin{ttbox}\makeatother
 | 
|  |    265 | \input{ToyList/thm}\input{ToyList/inductxs}\input{ToyList/autotac}\end{ttbox}
 | 
|  |    266 | 
 | 
|  |    267 | \subsubsection*{Review}
 | 
|  |    268 | 
 | 
|  |    269 | This is the end of our toy proof. It should have familiarized you with
 | 
|  |    270 | \begin{itemize}
 | 
|  |    271 | \item the standard theorem proving procedure:
 | 
|  |    272 | state a goal; proceed with proof until a new lemma is required; prove that
 | 
|  |    273 | lemma; come back to the original goal.
 | 
|  |    274 | \item a specific procedure that works well for functional programs:
 | 
|  |    275 | induction followed by all-out simplification via \texttt{Auto_tac}.
 | 
|  |    276 | \item a basic repertoire of proof commands.
 | 
|  |    277 | \end{itemize}
 | 
|  |    278 | 
 | 
|  |    279 | 
 | 
|  |    280 | \section{Some helpful commands}
 | 
|  |    281 | \label{sec:commands-and-hints}
 | 
|  |    282 | 
 | 
|  |    283 | This section discusses a few basic commands for manipulating the proof state
 | 
|  |    284 | and can be skipped by casual readers.
 | 
|  |    285 | 
 | 
|  |    286 | There are two kinds of commands used during a proof: the actual proof
 | 
|  |    287 | commands and auxiliary commands for examining the proof state and controlling
 | 
|  |    288 | the display. Proof commands are always of the form
 | 
|  |    289 | \texttt{by(\textit{tactic});}\indexbold{tactic} where \textbf{tactic} is a
 | 
|  |    290 | synonym for ``theorem proving function''. Typical examples are
 | 
|  |    291 | \texttt{induct_tac} and \texttt{Auto_tac} --- the suffix \texttt{_tac} is
 | 
|  |    292 | merely a mnemonic. Further tactics are introduced throughout the tutorial.
 | 
|  |    293 | 
 | 
|  |    294 | %Tactics can also be modified. For example,
 | 
|  |    295 | %\begin{ttbox}
 | 
|  |    296 | %by(ALLGOALS Asm_simp_tac);
 | 
|  |    297 | %\end{ttbox}
 | 
|  |    298 | %tells Isabelle to apply \texttt{Asm_simp_tac} to all subgoals. For more on
 | 
|  |    299 | %tactics and how to combine them see~\S\ref{sec:Tactics}.
 | 
|  |    300 | 
 | 
|  |    301 | The most useful auxiliary commands are:
 | 
|  |    302 | \begin{description}
 | 
|  |    303 | \item[Printing the current state]
 | 
|  |    304 | Type \texttt{pr();} to redisplay the current proof state, for example when it
 | 
|  |    305 | has disappeared off the screen.
 | 
|  |    306 | \item[Limiting the number of subgoals]
 | 
|  |    307 | Typing \texttt{prlim $k$;} tells Isabelle to print only the first $k$
 | 
|  |    308 | subgoals from now on and redisplays the current proof state. This is helpful
 | 
|  |    309 | when there are many subgoals.
 | 
|  |    310 | \item[Undoing] Typing \texttt{undo();} undoes the effect of the last
 | 
|  |    311 | tactic.
 | 
|  |    312 | \item[Context switch] Every proof happens in the context of a
 | 
|  |    313 |   \bfindex{current theory}. By default, this is the last theory loaded. If
 | 
|  |    314 |   you want to prove a theorem in the context of a different theory
 | 
|  |    315 |   \texttt{T}, you need to type \texttt{context T.thy;}\index{*context|bold}
 | 
|  |    316 |   first. Of course you need to change the context again if you want to go
 | 
|  |    317 |   back to your original theory.
 | 
|  |    318 | \item[Displaying types] We have already mentioned the flag
 | 
|  |    319 |   \ttindex{show_types} above. It can also be useful for detecting typos in
 | 
|  |    320 |   formulae early on. For example, if \texttt{show_types} is set and the goal
 | 
|  |    321 |   \texttt{rev(rev xs) = xs} is started, Isabelle prints the additional output
 | 
|  |    322 | \begin{ttbox}
 | 
|  |    323 | {\out Variables:}
 | 
|  |    324 | {\out   xs :: 'a list}
 | 
|  |    325 | \end{ttbox}
 | 
|  |    326 | which tells us that Isabelle has correctly inferred that
 | 
|  |    327 | \texttt{xs} is a variable of list type. On the other hand, had we
 | 
|  |    328 | made a typo as in \texttt{rev(re xs) = xs}, the response
 | 
|  |    329 | \begin{ttbox}
 | 
|  |    330 | Variables:
 | 
|  |    331 |   re :: 'a list => 'a list
 | 
|  |    332 |   xs :: 'a list
 | 
|  |    333 | \end{ttbox}
 | 
|  |    334 | would have alerted us because of the unexpected variable \texttt{re}.
 | 
|  |    335 | \item[(Re)loading theories]\index{loading theories}\index{reloading theories}
 | 
|  |    336 | Initially you load theory \texttt{T} by typing \ttindex{use_thy}~\texttt{"T";},
 | 
|  |    337 | which loads all parent theories of \texttt{T} automatically, if they are not
 | 
|  |    338 | loaded already. If you modify \texttt{T.thy} or \texttt{T.ML}, you can
 | 
|  |    339 | reload it by typing \texttt{use_thy~"T";} again. This time, however, only
 | 
|  |    340 | \texttt{T} is reloaded. If some of \texttt{T}'s parents have changed as well,
 | 
| 6577 |    341 | type \ttindexbold{update_thy}~\texttt{"T";} to reload \texttt{T} and all of
 | 
|  |    342 | its parents that have changed (or have changed parents).
 | 
| 5375 |    343 | \end{description}
 | 
|  |    344 | Further commands are found in the Reference Manual.
 | 
|  |    345 | 
 | 
|  |    346 | 
 | 
|  |    347 | \section{Datatypes}
 | 
|  |    348 | \label{sec:datatype}
 | 
|  |    349 | 
 | 
|  |    350 | Inductive datatypes are part of almost every non-trivial application of HOL.
 | 
|  |    351 | First we take another look at a very important example, the datatype of
 | 
|  |    352 | lists, before we turn to datatypes in general. The section closes with a
 | 
|  |    353 | case study.
 | 
|  |    354 | 
 | 
|  |    355 | 
 | 
|  |    356 | \subsection{Lists}
 | 
|  |    357 | 
 | 
| 6148 |    358 | Lists are one of the essential datatypes in computing. Readers of this
 | 
|  |    359 | tutorial and users of HOL need to be familiar with their basic operations.
 | 
|  |    360 | Theory \texttt{ToyList} is only a small fragment of HOL's predefined theory
 | 
|  |    361 | \texttt{List}\footnote{\texttt{http://isabelle.in.tum.de/library/HOL/List.html}}.
 | 
| 5375 |    362 | The latter contains many further operations. For example, the functions
 | 
|  |    363 | \ttindexbold{hd} (`head') and \ttindexbold{tl} (`tail') return the first
 | 
|  |    364 | element and the remainder of a list. (However, pattern-matching is usually
 | 
| 6148 |    365 | preferable to \texttt{hd} and \texttt{tl}.)  Theory \texttt{List} also
 | 
|  |    366 | contains more syntactic sugar:
 | 
| 5375 |    367 | \texttt{[}$x@1$\texttt{,}\dots\texttt{,}$x@n$\texttt{]} abbreviates
 | 
| 6148 |    368 | $x@1$\texttt{\#}\dots\texttt{\#}$x@n$\texttt{\#[]}.  In the rest of the
 | 
|  |    369 | tutorial we always use HOL's predefined lists.
 | 
| 5375 |    370 | 
 | 
|  |    371 | 
 | 
|  |    372 | \subsection{The general format}
 | 
|  |    373 | \label{sec:general-datatype}
 | 
|  |    374 | 
 | 
|  |    375 | The general HOL \texttt{datatype} definition is of the form
 | 
|  |    376 | \[
 | 
|  |    377 | \mathtt{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
 | 
|  |    378 | C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
 | 
|  |    379 | C@m~\tau@{m1}~\dots~\tau@{mk@m}
 | 
|  |    380 | \]
 | 
|  |    381 | where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct
 | 
|  |    382 | constructor names and $\tau@{ij}$ are types; it is customary to capitalize
 | 
|  |    383 | the first letter in constructor names. There are a number of
 | 
|  |    384 | restrictions (such as the type should not be empty) detailed
 | 
| 6606 |    385 | elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
 | 
| 5375 |    386 | 
 | 
|  |    387 | Laws about datatypes, such as \verb$[] ~= x#xs$ and \texttt{(x\#xs = y\#ys) =
 | 
|  |    388 |   (x=y \& xs=ys)}, are used automatically during proofs by simplification.
 | 
|  |    389 | The same is true for the equations in primitive recursive function
 | 
|  |    390 | definitions.
 | 
|  |    391 | 
 | 
|  |    392 | \subsection{Primitive recursion}
 | 
|  |    393 | 
 | 
|  |    394 | Functions on datatypes are usually defined by recursion. In fact, most of the
 | 
|  |    395 | time they are defined by what is called \bfindex{primitive recursion}.
 | 
| 5850 |    396 | The keyword \ttindexbold{primrec} is followed by a list of equations
 | 
| 5375 |    397 | \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
 | 
|  |    398 | such that $C$ is a constructor of the datatype $t$ and all recursive calls of
 | 
|  |    399 | $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
 | 
|  |    400 | Isabelle immediately sees that $f$ terminates because one (fixed!) argument
 | 
|  |    401 | becomes smaller with every recursive call. There must be exactly one equation
 | 
|  |    402 | for each constructor.  Their order is immaterial.
 | 
|  |    403 | A more general method for defining total recursive functions is explained in
 | 
|  |    404 | \S\ref{sec:recdef}.
 | 
|  |    405 | 
 | 
|  |    406 | \begin{exercise}
 | 
|  |    407 | Given the datatype of binary trees
 | 
|  |    408 | \begin{ttbox}
 | 
|  |    409 | \input{Misc/tree}\end{ttbox}
 | 
|  |    410 | define a function \texttt{mirror} that mirrors the structure of a binary tree
 | 
|  |    411 | by swapping subtrees (recursively). Prove \texttt{mirror(mirror(t)) = t}.
 | 
|  |    412 | \end{exercise}
 | 
|  |    413 | 
 | 
|  |    414 | \subsection{\texttt{case}-expressions}
 | 
|  |    415 | \label{sec:case-expressions}
 | 
|  |    416 | 
 | 
|  |    417 | HOL also features \ttindexbold{case}-expressions for analyzing elements of a
 | 
|  |    418 | datatype. For example,
 | 
|  |    419 | \begin{ttbox}
 | 
|  |    420 | case xs of [] => 0 | y#ys => y
 | 
|  |    421 | \end{ttbox}
 | 
|  |    422 | evaluates to \texttt{0} if \texttt{xs} is \texttt{[]} and to \texttt{y} if 
 | 
|  |    423 | \texttt{xs} is \texttt{y\#ys}. (Since the result in both branches must be of
 | 
|  |    424 | the same type, it follows that \texttt{y::nat} and hence
 | 
|  |    425 | \texttt{xs::(nat)list}.)
 | 
|  |    426 | 
 | 
|  |    427 | In general, if $e$ is a term of the datatype $t$ defined in
 | 
|  |    428 | \S\ref{sec:general-datatype} above, the corresponding
 | 
|  |    429 | \texttt{case}-expression analyzing $e$ is
 | 
|  |    430 | \[
 | 
|  |    431 | \begin{array}{rrcl}
 | 
|  |    432 | \mbox{\tt case}~e~\mbox{\tt of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\
 | 
|  |    433 |                            \vdots \\
 | 
|  |    434 |                            \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m
 | 
|  |    435 | \end{array}
 | 
|  |    436 | \]
 | 
|  |    437 | 
 | 
|  |    438 | \begin{warn}
 | 
|  |    439 | {\em All} constructors must be present, their order is fixed, and nested
 | 
|  |    440 | patterns are not supported.  Violating these restrictions results in strange
 | 
|  |    441 | error messages.
 | 
|  |    442 | \end{warn}
 | 
|  |    443 | \noindent
 | 
|  |    444 | Nested patterns can be simulated by nested \texttt{case}-expressions: instead
 | 
|  |    445 | of
 | 
|  |    446 | \begin{ttbox}
 | 
|  |    447 | case xs of [] => 0 | [x] => x | x#(y#zs) => y
 | 
|  |    448 | \end{ttbox}
 | 
|  |    449 | write
 | 
|  |    450 | \begin{ttbox}
 | 
|  |    451 | case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)
 | 
|  |    452 | \end{ttbox}
 | 
|  |    453 | Note that \texttt{case}-expressions should be enclosed in parentheses to
 | 
|  |    454 | indicate their scope.
 | 
|  |    455 | 
 | 
|  |    456 | \subsection{Structural induction}
 | 
|  |    457 | 
 | 
|  |    458 | Almost all the basic laws about a datatype are applied automatically during
 | 
|  |    459 | simplification. Only induction is invoked by hand via \texttt{induct_tac},
 | 
|  |    460 | which works for any datatype. In some cases, induction is overkill and a case
 | 
|  |    461 | distinction over all constructors of the datatype suffices. This is performed
 | 
|  |    462 | by \ttindexbold{exhaust_tac}. A trivial example:
 | 
|  |    463 | \begin{ttbox}
 | 
|  |    464 | \input{Misc/exhaust.ML}{\out1. xs = [] ==> (case xs of [] => [] | y # ys => xs) = xs}
 | 
|  |    465 | {\out2. !!a list. xs = a # list ==> (case xs of [] => [] | y # ys => xs) = xs}
 | 
|  |    466 | \input{Misc/autotac.ML}\end{ttbox}
 | 
|  |    467 | Note that this particular case distinction could have been automated
 | 
|  |    468 | completely. See~\S\ref{sec:SimpFeatures}.
 | 
|  |    469 | 
 | 
|  |    470 | \begin{warn}
 | 
|  |    471 |   Induction is only allowed on a free variable that should not occur among
 | 
|  |    472 |   the assumptions of the subgoal.  Exhaustion works for arbitrary terms.
 | 
|  |    473 | \end{warn}
 | 
|  |    474 | 
 | 
|  |    475 | \subsection{Case study: boolean expressions}
 | 
|  |    476 | \label{sec:boolex}
 | 
|  |    477 | 
 | 
|  |    478 | The aim of this case study is twofold: it shows how to model boolean
 | 
|  |    479 | expressions and some algorithms for manipulating them, and it demonstrates
 | 
|  |    480 | the constructs introduced above.
 | 
|  |    481 | 
 | 
|  |    482 | \subsubsection{How can we model boolean expressions?}
 | 
|  |    483 | 
 | 
|  |    484 | We want to represent boolean expressions built up from variables and
 | 
|  |    485 | constants by negation and conjunction. The following datatype serves exactly
 | 
|  |    486 | that purpose:
 | 
|  |    487 | \begin{ttbox}
 | 
|  |    488 | \input{Ifexpr/boolex}\end{ttbox}
 | 
|  |    489 | The two constants are represented by the terms \texttt{Const~True} and
 | 
|  |    490 | \texttt{Const~False}. Variables are represented by terms of the form
 | 
|  |    491 | \texttt{Var}~$n$, where $n$ is a natural number (type \texttt{nat}).
 | 
|  |    492 | For example, the formula $P@0 \land \neg P@1$ is represented by the term
 | 
|  |    493 | \texttt{And~(Var~0)~(Neg(Var~1))}.
 | 
|  |    494 | 
 | 
|  |    495 | \subsubsection{What is the value of boolean expressions?}
 | 
|  |    496 | 
 | 
|  |    497 | The value of a boolean expressions depends on the value of its variables.
 | 
|  |    498 | Hence the function \texttt{value} takes an additional parameter, an {\em
 | 
|  |    499 |   environment} of type \texttt{nat~=>~bool}, which maps variables to their
 | 
|  |    500 | values:
 | 
|  |    501 | \begin{ttbox}
 | 
|  |    502 | \input{Ifexpr/value}\end{ttbox}
 | 
|  |    503 | 
 | 
|  |    504 | \subsubsection{If-expressions}
 | 
|  |    505 | 
 | 
|  |    506 | An alternative and often more efficient (because in a certain sense
 | 
|  |    507 | canonical) representation are so-called \textit{If-expressions\/} built up
 | 
|  |    508 | from constants (\texttt{CIF}), variables (\texttt{VIF}) and conditionals
 | 
|  |    509 | (\texttt{IF}):
 | 
|  |    510 | \begin{ttbox}
 | 
|  |    511 | \input{Ifexpr/ifex}\end{ttbox}
 | 
|  |    512 | The evaluation if If-expressions proceeds as for \texttt{boolex}:
 | 
|  |    513 | \begin{ttbox}
 | 
|  |    514 | \input{Ifexpr/valif}\end{ttbox}
 | 
|  |    515 | 
 | 
|  |    516 | \subsubsection{Transformation into and of If-expressions}
 | 
|  |    517 | 
 | 
|  |    518 | The type \texttt{boolex} is close to the customary representation of logical
 | 
|  |    519 | formulae, whereas \texttt{ifex} is designed for efficiency. Thus we need to
 | 
|  |    520 | translate from \texttt{boolex} into \texttt{ifex}:
 | 
|  |    521 | \begin{ttbox}
 | 
|  |    522 | \input{Ifexpr/bool2if}\end{ttbox}
 | 
|  |    523 | At last, we have something we can verify: that \texttt{bool2if} preserves the
 | 
|  |    524 | value of its argument.
 | 
|  |    525 | \begin{ttbox}
 | 
|  |    526 | \input{Ifexpr/bool2if.ML}\end{ttbox}
 | 
|  |    527 | The proof is canonical:
 | 
|  |    528 | \begin{ttbox}
 | 
|  |    529 | \input{Ifexpr/proof.ML}\end{ttbox}
 | 
|  |    530 | In fact, all proofs in this case study look exactly like this. Hence we do
 | 
|  |    531 | not show them below.
 | 
|  |    532 | 
 | 
|  |    533 | More interesting is the transformation of If-expressions into a normal form
 | 
|  |    534 | where the first argument of \texttt{IF} cannot be another \texttt{IF} but
 | 
|  |    535 | must be a constant or variable. Such a normal form can be computed by
 | 
|  |    536 | repeatedly replacing a subterm of the form \texttt{IF~(IF~b~x~y)~z~u} by
 | 
|  |    537 | \texttt{IF b (IF x z u) (IF y z u)}, which has the same value. The following
 | 
|  |    538 | primitive recursive functions perform this task:
 | 
|  |    539 | \begin{ttbox}
 | 
|  |    540 | \input{Ifexpr/normif}
 | 
|  |    541 | \input{Ifexpr/norm}\end{ttbox}
 | 
|  |    542 | Their interplay is a bit tricky, and we leave it to the reader to develop an
 | 
|  |    543 | intuitive understanding. Fortunately, Isabelle can help us to verify that the
 | 
|  |    544 | transformation preserves the value of the expression:
 | 
|  |    545 | \begin{ttbox}
 | 
|  |    546 | \input{Ifexpr/norm.ML}\end{ttbox}
 | 
|  |    547 | The proof is canonical, provided we first show the following lemma (which
 | 
|  |    548 | also helps to understand what \texttt{normif} does) and make it available
 | 
|  |    549 | for simplification via \texttt{Addsimps}:
 | 
|  |    550 | \begin{ttbox}
 | 
|  |    551 | \input{Ifexpr/normif.ML}\end{ttbox}
 | 
|  |    552 | 
 | 
|  |    553 | But how can we be sure that \texttt{norm} really produces a normal form in
 | 
|  |    554 | the above sense? We have to prove
 | 
|  |    555 | \begin{ttbox}
 | 
|  |    556 | \input{Ifexpr/normal_norm.ML}\end{ttbox}
 | 
|  |    557 | where \texttt{normal} expresses that an If-expression is in normal form:
 | 
|  |    558 | \begin{ttbox}
 | 
|  |    559 | \input{Ifexpr/normal}\end{ttbox}
 | 
|  |    560 | Of course, this requires a lemma about normality of \texttt{normif}
 | 
|  |    561 | \begin{ttbox}
 | 
|  |    562 | \input{Ifexpr/normal_normif.ML}\end{ttbox}
 | 
|  |    563 | that has to be made available for simplification via \texttt{Addsimps}.
 | 
|  |    564 | 
 | 
|  |    565 | How does one come up with the required lemmas? Try to prove the main theorems
 | 
|  |    566 | without them and study carefully what \texttt{Auto_tac} leaves unproved. This
 | 
|  |    567 | has to provide the clue.
 | 
|  |    568 | The necessity of universal quantification (\texttt{!t e}) in the two lemmas
 | 
|  |    569 | is explained in \S\ref{sec:InductionHeuristics}
 | 
|  |    570 | 
 | 
|  |    571 | \begin{exercise}
 | 
|  |    572 |   We strengthen the definition of a {\em normal\/} If-expression as follows:
 | 
|  |    573 |   the first argument of all \texttt{IF}s must be a variable. Adapt the above
 | 
|  |    574 |   development to this changed requirement. (Hint: you may need to formulate
 | 
|  |    575 |   some of the goals as implications (\texttt{-->}) rather than equalities
 | 
|  |    576 |   (\texttt{=}).)
 | 
|  |    577 | \end{exercise}
 | 
|  |    578 | 
 | 
|  |    579 | \section{Some basic types}
 | 
|  |    580 | 
 | 
| 6577 |    581 | 
 | 
| 5375 |    582 | \subsection{Natural numbers}
 | 
| 6577 |    583 | \index{arithmetic|(}
 | 
| 5375 |    584 | 
 | 
|  |    585 | The type \ttindexbold{nat} of natural numbers is predefined and behaves like
 | 
|  |    586 | \begin{ttbox}
 | 
|  |    587 | datatype nat = 0 | Suc nat
 | 
|  |    588 | \end{ttbox}
 | 
|  |    589 | In particular, there are \texttt{case}-expressions, for example
 | 
|  |    590 | \begin{ttbox}
 | 
|  |    591 | case n of 0 => 0 | Suc m => m
 | 
|  |    592 | \end{ttbox}
 | 
|  |    593 | primitive recursion, for example
 | 
|  |    594 | \begin{ttbox}
 | 
|  |    595 | \input{Misc/natsum}\end{ttbox}
 | 
|  |    596 | and induction, for example
 | 
|  |    597 | \begin{ttbox}
 | 
|  |    598 | \input{Misc/NatSum.ML}\ttbreak
 | 
|  |    599 | {\out sum n + sum n = n * Suc n}
 | 
|  |    600 | {\out No subgoals!}
 | 
|  |    601 | \end{ttbox}
 | 
|  |    602 | 
 | 
|  |    603 | The usual arithmetic operations \ttindexbold{+}, \ttindexbold{-},
 | 
| 6577 |    604 | \ttindexbold{*}, \ttindexbold{div}, \ttindexbold{mod}, \ttindexbold{min} and
 | 
|  |    605 | \ttindexbold{max} are predefined, as are the relations \ttindexbold{<=} and
 | 
|  |    606 | \ttindexbold{<}. There is even a least number operation \ttindexbold{LEAST}.
 | 
|  |    607 | For example, \texttt{(LEAST n.$\,$1 < n) = 2} (HOL does not prove this
 | 
|  |    608 | completely automatically).
 | 
| 5375 |    609 | 
 | 
|  |    610 | \begin{warn}
 | 
| 6577 |    611 |   The operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{*},
 | 
|  |    612 |   \ttindexbold{min}, \ttindexbold{max}, \ttindexbold{<=} and \ttindexbold{<}
 | 
|  |    613 |   are overloaded, i.e.\ they are available not just for natural numbers but
 | 
|  |    614 |   at other types as well (see \S\ref{sec:TypeClasses}). For example, given
 | 
| 5375 |    615 |   the goal \texttt{x+y = y+x}, there is nothing to indicate that you are
 | 
| 6577 |    616 |   talking about natural numbers.  Hence Isabelle can only infer that
 | 
| 5375 |    617 |   \texttt{x} and \texttt{y} are of some arbitrary type where \texttt{+} is
 | 
|  |    618 |   declared. As a consequence, you will be unable to prove the goal (although
 | 
|  |    619 |   it may take you some time to realize what has happened if
 | 
|  |    620 |   \texttt{show_types} is not set).  In this particular example, you need to
 | 
|  |    621 |   include an explicit type constraint, for example \texttt{x+y = y+(x::nat)}.
 | 
|  |    622 |   If there is enough contextual information this may not be necessary:
 | 
|  |    623 |   \texttt{x+0 = x} automatically implies \texttt{x::nat}.
 | 
|  |    624 | \end{warn}
 | 
|  |    625 | 
 | 
| 6577 |    626 | Simple arithmetic goals are proved automatically by both \texttt{Auto_tac}
 | 
|  |    627 | and the simplification tactics introduced in \S\ref{sec:Simplification}.  For
 | 
|  |    628 | example, the goal
 | 
|  |    629 | \begin{ttbox}
 | 
|  |    630 | \input{Misc/arith1.ML}\end{ttbox}
 | 
|  |    631 | is proved automatically. The main restriction is that only addition is taken
 | 
|  |    632 | into account; other arithmetic operations and quantified formulae are ignored.
 | 
|  |    633 | 
 | 
|  |    634 | For more complex goals, there is the special tactic \ttindexbold{arith_tac}. It
 | 
|  |    635 | proves arithmetic goals involving the usual logical connectives (\verb$~$,
 | 
|  |    636 | \verb$&$, \verb$|$, \verb$-->$), the relations \texttt{<=} and \texttt{<},
 | 
|  |    637 | and the operations \ttindexbold{+}, \ttindexbold{-}, \ttindexbold{min} and
 | 
|  |    638 | \ttindexbold{max}. For example, it can prove
 | 
|  |    639 | \begin{ttbox}
 | 
|  |    640 | \input{Misc/arith2.ML}\end{ttbox}
 | 
|  |    641 | because \texttt{k*k} can be treated as atomic.
 | 
|  |    642 | In contrast, $n*n = n \Longrightarrow n=0 \lor n=1$ is not
 | 
|  |    643 | even proved by \texttt{arith_tac} because the proof relies essentially on
 | 
|  |    644 | properties of multiplication.
 | 
|  |    645 | 
 | 
|  |    646 | \begin{warn}
 | 
|  |    647 |   The running time of \texttt{arith_tac} is exponential in the number of
 | 
|  |    648 |   occurrences of \ttindexbold{-}, \ttindexbold{min} and \ttindexbold{max}
 | 
|  |    649 |   because they are first eliminated by case distinctions.
 | 
|  |    650 | 
 | 
|  |    651 |   \texttt{arith_tac} is incomplete even for the restricted class of formulae
 | 
|  |    652 |   described above (known as ``linear arithmetic''). If divisibility plays a
 | 
|  |    653 |   role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$.
 | 
|  |    654 |   Fortunately, such examples are rare in practice.
 | 
|  |    655 | \end{warn}
 | 
|  |    656 | 
 | 
|  |    657 | \index{arithmetic|)}
 | 
|  |    658 | 
 | 
| 5375 |    659 | 
 | 
|  |    660 | \subsection{Products}
 | 
|  |    661 | 
 | 
|  |    662 | HOL also has pairs: \texttt{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ *
 | 
|  |    663 | $\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair
 | 
|  |    664 | are extracted by \texttt{fst} and \texttt{snd}:
 | 
|  |    665 | \texttt{fst($x$,$y$) = $x$} and \texttt{snd($x$,$y$) = $y$}. Tuples
 | 
|  |    666 | are simulated by pairs nested to the right: 
 | 
|  |    667 | \texttt{($a@1$,$a@2$,$a@3$)} and \texttt{$\tau@1$ * $\tau@2$ * $\tau@3$}
 | 
|  |    668 | stand for \texttt{($a@1$,($a@2$,$a@3$))} and \texttt{$\tau@1$ * ($\tau@2$ *
 | 
|  |    669 | $\tau@3$)}. Therefore \texttt{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.
 | 
|  |    670 | 
 | 
|  |    671 | It is possible to use (nested) tuples as patterns in abstractions, for
 | 
|  |    672 | example \texttt{\%(x,y,z).x+y+z} and \texttt{\%((x,y),z).x+y+z}.
 | 
|  |    673 | 
 | 
|  |    674 | In addition to explicit $\lambda$-abstractions, tuple patterns can be used in
 | 
|  |    675 | most variable binding constructs. Typical examples are
 | 
|  |    676 | \begin{ttbox}
 | 
|  |    677 | let (x,y) = f z in (y,x)
 | 
|  |    678 | 
 | 
|  |    679 | case xs of [] => 0 | (x,y)\#zs => x+y
 | 
|  |    680 | \end{ttbox}
 | 
|  |    681 | Further important examples are quantifiers and sets.
 | 
|  |    682 | 
 | 
|  |    683 | \begin{warn}
 | 
|  |    684 | Abstraction over pairs and tuples is merely a convenient shorthand for a more
 | 
|  |    685 | complex internal representation.  Thus the internal and external form of a
 | 
|  |    686 | term may differ, which can affect proofs. If you want to avoid this
 | 
|  |    687 | complication, use \texttt{fst} and \texttt{snd}, i.e.\ write
 | 
|  |    688 | \texttt{\%p.~fst p + snd p} instead of \texttt{\%(x,y).~x + y}.
 | 
|  |    689 | See~\S\ref{} for theorem proving with tuple patterns.
 | 
|  |    690 | \end{warn}
 | 
|  |    691 | 
 | 
|  |    692 | 
 | 
|  |    693 | \section{Definitions}
 | 
|  |    694 | \label{sec:Definitions}
 | 
|  |    695 | 
 | 
|  |    696 | A definition is simply an abbreviation, i.e.\ a new name for an existing
 | 
|  |    697 | construction. In particular, definitions cannot be recursive. Isabelle offers
 | 
|  |    698 | definitions on the level of types and terms. Those on the type level are
 | 
|  |    699 | called type synonyms, those on the term level are called (constant)
 | 
|  |    700 | definitions.
 | 
|  |    701 | 
 | 
|  |    702 | 
 | 
|  |    703 | \subsection{Type synonyms}
 | 
|  |    704 | \indexbold{type synonyms}
 | 
|  |    705 | 
 | 
|  |    706 | Type synonyms are similar to those found in ML. Their syntax is fairly self
 | 
|  |    707 | explanatory:
 | 
|  |    708 | \begin{ttbox}
 | 
|  |    709 | \input{Misc/types}\end{ttbox}\indexbold{*types}
 | 
|  |    710 | The synonym \texttt{alist} shows that in general the type on the right-hand
 | 
|  |    711 | side needs to be enclosed in double quotation marks
 | 
|  |    712 | (see the end of~\S\ref{sec:intro-theory}).
 | 
|  |    713 | 
 | 
|  |    714 | Internally all synonyms are fully expanded.  As a consequence Isabelle's
 | 
|  |    715 | output never contains synonyms.  Their main purpose is to improve the
 | 
|  |    716 | readability of theory definitions.  Synonyms can be used just like any other
 | 
|  |    717 | type:
 | 
|  |    718 | \begin{ttbox}
 | 
|  |    719 | \input{Misc/consts}\end{ttbox}
 | 
|  |    720 | 
 | 
|  |    721 | \subsection{Constant definitions}
 | 
|  |    722 | \label{sec:ConstDefinitions}
 | 
|  |    723 | 
 | 
|  |    724 | The above constants \texttt{nand} and \texttt{exor} are non-recursive and can
 | 
|  |    725 | therefore be defined directly by
 | 
|  |    726 | \begin{ttbox}
 | 
|  |    727 | \input{Misc/defs}\end{ttbox}\indexbold{*defs}
 | 
|  |    728 | where \texttt{defs} is a keyword and \texttt{nand_def} and \texttt{exor_def}
 | 
|  |    729 | are arbitrary user-supplied names.
 | 
|  |    730 | The symbol \texttt{==}\index{==>@{\tt==}|bold} is a special form of equality
 | 
|  |    731 | that should only be used in constant definitions.
 | 
|  |    732 | Declarations and definitions can also be merged
 | 
|  |    733 | \begin{ttbox}
 | 
|  |    734 | \input{Misc/constdefs}\end{ttbox}\indexbold{*constdefs}
 | 
|  |    735 | in which case the default name of each definition is $f$\texttt{_def}, where
 | 
|  |    736 | $f$ is the name of the defined constant.
 | 
|  |    737 | 
 | 
|  |    738 | Note that pattern-matching is not allowed, i.e.\ each definition must be of
 | 
|  |    739 | the form $f\,x@1\,\dots\,x@n$\texttt{~==~}$t$.
 | 
|  |    740 | 
 | 
|  |    741 | Section~\S\ref{sec:Simplification} explains how definitions are used
 | 
|  |    742 | in proofs.
 | 
|  |    743 | 
 | 
|  |    744 | \begin{warn}
 | 
|  |    745 | A common mistake when writing definitions is to introduce extra free variables
 | 
|  |    746 | on the right-hand side as in the following fictitious definition:
 | 
|  |    747 | \begin{ttbox}
 | 
|  |    748 | defs  prime_def "prime(p) == (m divides p) --> (m=1 | m=p)"
 | 
|  |    749 | \end{ttbox}
 | 
|  |    750 | Isabelle rejects this `definition' because of the extra {\tt m} on the
 | 
|  |    751 | right-hand side, which would introduce an inconsistency.  What you should have
 | 
|  |    752 | written is
 | 
|  |    753 | \begin{ttbox}
 | 
|  |    754 | defs  prime_def "prime(p) == !m. (m divides p) --> (m=1 | m=p)"
 | 
|  |    755 | \end{ttbox}
 | 
|  |    756 | \end{warn}
 | 
|  |    757 | 
 | 
|  |    758 | 
 | 
|  |    759 | 
 | 
|  |    760 | 
 | 
|  |    761 | \chapter{More Functional Programming}
 | 
|  |    762 | 
 | 
|  |    763 | The purpose of this chapter is to deepen the reader's understanding of the
 | 
|  |    764 | concepts encountered so far and to introduce an advanced method for defining
 | 
|  |    765 | recursive functions. The first two sections give a structured presentation of
 | 
|  |    766 | theorem proving by simplification (\S\ref{sec:Simplification}) and
 | 
|  |    767 | discuss important heuristics for induction (\S\ref{sec:InductionHeuristics}). They
 | 
|  |    768 | can be skipped by readers less interested in proofs. They are followed by a
 | 
|  |    769 | case study, a compiler for expressions (\S\ref{sec:ExprCompiler}).
 | 
|  |    770 | Finally we present a very general method for defining recursive functions
 | 
|  |    771 | that goes well beyond what \texttt{primrec} allows (\S\ref{sec:recdef}).
 | 
|  |    772 | 
 | 
|  |    773 | 
 | 
|  |    774 | \section{Simplification}
 | 
|  |    775 | \label{sec:Simplification}
 | 
|  |    776 | 
 | 
|  |    777 | So far we have proved our theorems by \texttt{Auto_tac}, which
 | 
|  |    778 | `simplifies' all subgoals. In fact, \texttt{Auto_tac} can do much more than
 | 
|  |    779 | that, except that it did not need to so far. However, when you go beyond toy
 | 
|  |    780 | examples, you need to understand the ingredients of \texttt{Auto_tac}.
 | 
|  |    781 | This section covers the tactic that \texttt{Auto_tac} always applies first,
 | 
|  |    782 | namely simplification.
 | 
|  |    783 | 
 | 
|  |    784 | Simplification is one of the central theorem proving tools in Isabelle and
 | 
|  |    785 | many other systems. The tool itself is called the \bfindex{simplifier}. The
 | 
|  |    786 | purpose of this section is twofold: to introduce the many features of the
 | 
|  |    787 | simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the
 | 
|  |    788 | simplifier works (\S\ref{sec:SimpHow}).  Anybody intending to use HOL should
 | 
|  |    789 | read \S\ref{sec:SimpFeatures}, and the serious student should read
 | 
|  |    790 | \S\ref{sec:SimpHow} as well in order to understand what happened in case
 | 
|  |    791 | things do not simplify as expected.
 | 
|  |    792 | 
 | 
|  |    793 | 
 | 
|  |    794 | \subsection{Using the simplifier}
 | 
|  |    795 | \label{sec:SimpFeatures}
 | 
|  |    796 | 
 | 
|  |    797 | In its most basic form, simplification means repeated application of
 | 
|  |    798 | equations from left to right. For example, taking the rules for \texttt{\at}
 | 
|  |    799 | and applying them to the term \texttt{[0,1] \at\ []} results in a sequence of
 | 
|  |    800 | simplification steps:
 | 
|  |    801 | \begin{ttbox}\makeatother
 | 
|  |    802 | (0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
 | 
|  |    803 | \end{ttbox}
 | 
|  |    804 | This is also known as {\em term rewriting} and the equations are referred
 | 
|  |    805 | to as {\em rewrite rules}. This is more honest than `simplification' because
 | 
|  |    806 | the terms do not necessarily become simpler in the process.
 | 
|  |    807 | 
 | 
|  |    808 | \subsubsection{Simpsets}
 | 
|  |    809 | 
 | 
|  |    810 | To facilitate simplification, each theory has an associated set of
 | 
|  |    811 | simplification rules, known as a \bfindex{simpset}. Within a theory,
 | 
|  |    812 | proofs by simplification refer to the associated simpset by default.
 | 
|  |    813 | The simpset of a theory is built up as follows: starting with the union of
 | 
|  |    814 | the simpsets of the parent theories, each occurrence of a \texttt{datatype}
 | 
|  |    815 | or \texttt{primrec} construct augments the simpset. Explicit definitions are
 | 
|  |    816 | not added automatically. Users can add new theorems via \texttt{Addsimps} and
 | 
|  |    817 | delete them again later by \texttt{Delsimps}.
 | 
|  |    818 | 
 | 
|  |    819 | You may augment a simpset not just by equations but by pretty much any
 | 
|  |    820 | theorem. The simplifier will try to make sense of it.  For example, a theorem
 | 
|  |    821 | \verb$~$$P$ is automatically turned into \texttt{$P$ = False}. The details are
 | 
|  |    822 | explained in \S\ref{sec:SimpHow}.
 | 
|  |    823 | 
 | 
|  |    824 | As a rule of thumb, rewrite rules that really simplify a term (like
 | 
|  |    825 | \texttt{xs \at\ [] = xs} and \texttt{rev(rev xs) = xs}) should be added to the
 | 
|  |    826 | current simpset right after they have been proved.  Those of a more specific
 | 
|  |    827 | nature (e.g.\ distributivity laws, which alter the structure of terms
 | 
|  |    828 | considerably) should only be added for specific proofs and deleted again
 | 
|  |    829 | afterwards.  Conversely, it may also happen that a generally useful rule
 | 
|  |    830 | needs to be removed for a certain proof and is added again afterwards.  The
 | 
|  |    831 | need of frequent temporary additions or deletions may indicate a badly
 | 
|  |    832 | designed simpset.
 | 
|  |    833 | \begin{warn}
 | 
|  |    834 |   Simplification may not terminate, for example if both $f(x) = g(x)$ and
 | 
|  |    835 |   $g(x) = f(x)$ are in the simpset. It is the user's responsibility not to
 | 
|  |    836 |   include rules that can lead to nontermination, either on their own or in
 | 
|  |    837 |   combination with other rules.
 | 
|  |    838 | \end{warn}
 | 
|  |    839 | 
 | 
|  |    840 | \subsubsection{Simplification tactics}
 | 
|  |    841 | 
 | 
|  |    842 | There are four main simplification tactics:
 | 
|  |    843 | \begin{ttdescription}
 | 
|  |    844 | \item[\ttindexbold{Simp_tac} $i$] simplifies the conclusion of subgoal~$i$
 | 
|  |    845 |   using the theory's simpset.  It may solve the subgoal completely if it has
 | 
|  |    846 |   become trivial. For example:
 | 
|  |    847 | \begin{ttbox}\makeatother
 | 
|  |    848 | {\out 1. [] @ [] = []}
 | 
|  |    849 | by(Simp_tac 1);
 | 
|  |    850 | {\out No subgoals!}
 | 
|  |    851 | \end{ttbox}
 | 
|  |    852 | 
 | 
|  |    853 | \item[\ttindexbold{Asm_simp_tac}]
 | 
|  |    854 |   is like \verb$Simp_tac$, but extracts additional rewrite rules from
 | 
|  |    855 |   the assumptions of the subgoal. For example, it solves
 | 
|  |    856 | \begin{ttbox}\makeatother
 | 
|  |    857 | {\out 1. xs = [] ==> xs @ xs = xs}
 | 
|  |    858 | \end{ttbox}
 | 
|  |    859 | which \texttt{Simp_tac} does not do.
 | 
|  |    860 |   
 | 
|  |    861 | \item[\ttindexbold{Full_simp_tac}] is like \verb$Simp_tac$, but also
 | 
|  |    862 |   simplifies the assumptions (without using the assumptions to
 | 
|  |    863 |   simplify each other or the actual goal).
 | 
|  |    864 | 
 | 
|  |    865 | \item[\ttindexbold{Asm_full_simp_tac}] is like \verb$Asm_simp_tac$,
 | 
|  |    866 |   but also simplifies the assumptions. In particular, assumptions can
 | 
|  |    867 |   simplify each other. For example:
 | 
|  |    868 | \begin{ttbox}\makeatother
 | 
|  |    869 | \out{ 1. [| xs @ zs = ys @ xs; [] @ xs = [] @ [] |] ==> ys = zs}
 | 
|  |    870 | by(Asm_full_simp_tac 1);
 | 
|  |    871 | {\out No subgoals!}
 | 
|  |    872 | \end{ttbox}
 | 
|  |    873 | The second assumption simplifies to \texttt{xs = []}, which in turn
 | 
|  |    874 | simplifies the first assumption to \texttt{zs = ys}, thus reducing the
 | 
|  |    875 | conclusion to \texttt{ys = ys} and hence to \texttt{True}.
 | 
|  |    876 | (See also the paragraph on tracing below.)
 | 
|  |    877 | \end{ttdescription}
 | 
|  |    878 | \texttt{Asm_full_simp_tac} is the most powerful of this quartet of
 | 
|  |    879 | tactics. In fact, \texttt{Auto_tac} starts by applying
 | 
|  |    880 | \texttt{Asm_full_simp_tac} to all subgoals. The only reason for the existence
 | 
|  |    881 | of the other three tactics is that sometimes one wants to limit the amount of
 | 
|  |    882 | simplification, for example to avoid nontermination:
 | 
|  |    883 | \begin{ttbox}\makeatother
 | 
|  |    884 | {\out  1. ! x. f x = g (f (g x)) ==> f [] = f [] @ []}
 | 
|  |    885 | \end{ttbox}
 | 
|  |    886 | is solved by \texttt{Simp_tac}, but \texttt{Asm_simp_tac} and
 | 
|  |    887 | \texttt{Asm_full_simp_tac} loop because the rewrite rule \texttt{f x = g(f(g
 | 
|  |    888 | x))} extracted from the assumption does not terminate.  Isabelle notices
 | 
|  |    889 | certain simple forms of nontermination, but not this one.
 | 
|  |    890 |  
 | 
|  |    891 | \subsubsection{Modifying simpsets locally}
 | 
|  |    892 | 
 | 
|  |    893 | If a certain theorem is merely needed in one proof by simplification, the
 | 
|  |    894 | pattern
 | 
|  |    895 | \begin{ttbox}
 | 
|  |    896 | Addsimps [\(rare_theorem\)];
 | 
|  |    897 | by(Simp_tac 1);
 | 
|  |    898 | Delsimps [\(rare_theorem\)];
 | 
|  |    899 | \end{ttbox}
 | 
|  |    900 | is awkward. Therefore there are lower-case versions of the simplification
 | 
|  |    901 | tactics (\ttindexbold{simp_tac}, \ttindexbold{asm_simp_tac},
 | 
|  |    902 | \ttindexbold{full_simp_tac}, \ttindexbold{asm_full_simp_tac}) and of the
 | 
|  |    903 | simpset modifiers (\ttindexbold{addsimps}, \ttindexbold{delsimps})
 | 
|  |    904 | that do not access or modify the implicit simpset but explicitly take a
 | 
|  |    905 | simpset as an argument. For example, the above three lines become
 | 
|  |    906 | \begin{ttbox}
 | 
|  |    907 | by(simp_tac (simpset() addsimps [\(rare_theorem\)]) 1);
 | 
|  |    908 | \end{ttbox}
 | 
|  |    909 | where the result of the function call \texttt{simpset()} is the simpset of
 | 
|  |    910 | the current theory and \texttt{addsimps} is an infix function. The implicit
 | 
|  |    911 | simpset is read once but not modified.
 | 
|  |    912 | This is far preferable to pairs of \texttt{Addsimps} and \texttt{Delsimps}.
 | 
|  |    913 | Local modifications can be stacked as in
 | 
|  |    914 | \begin{ttbox}
 | 
|  |    915 | by(simp_tac (simpset() addsimps [\(rare_theorem\)] delsimps [\(some_thm\)]) 1);
 | 
|  |    916 | \end{ttbox}
 | 
|  |    917 | 
 | 
|  |    918 | \subsubsection{Rewriting with definitions}
 | 
|  |    919 | 
 | 
|  |    920 | Constant definitions (\S\ref{sec:ConstDefinitions}) are not automatically
 | 
|  |    921 | included in the simpset of a theory. Hence such definitions are not expanded
 | 
|  |    922 | automatically either, just as it should be: definitions are introduced for
 | 
|  |    923 | the purpose of abbreviating complex concepts. Of course we need to expand the
 | 
|  |    924 | definitions initially to derive enough lemmas that characterize the concept
 | 
|  |    925 | sufficiently for us to forget the original definition completely. For
 | 
|  |    926 | example, given the theory
 | 
|  |    927 | \begin{ttbox}
 | 
|  |    928 | \input{Misc/Exor.thy}\end{ttbox}
 | 
|  |    929 | we may want to prove \verb$exor A (~A)$. Instead of \texttt{Goal} we use
 | 
|  |    930 | \begin{ttbox}
 | 
|  |    931 | \input{Misc/exorgoal.ML}\end{ttbox}
 | 
|  |    932 | which tells Isabelle to expand the definition of \texttt{exor}---the first
 | 
|  |    933 | argument of \texttt{Goalw} can be a list of definitions---in the initial goal:
 | 
|  |    934 | \begin{ttbox}
 | 
|  |    935 | {\out exor A (~ A)}
 | 
|  |    936 | {\out  1. A & ~ ~ A | ~ A & ~ A}
 | 
|  |    937 | \end{ttbox}
 | 
|  |    938 | In this simple example, the goal is proved by \texttt{Simp_tac}.
 | 
|  |    939 | Of course the resulting theorem is insufficient to characterize \texttt{exor}
 | 
|  |    940 | completely.
 | 
|  |    941 | 
 | 
|  |    942 | In case we want to expand a definition in the middle of a proof, we can
 | 
|  |    943 | simply add the definition locally to the simpset:
 | 
|  |    944 | \begin{ttbox}
 | 
|  |    945 | \input{Misc/exorproof.ML}\end{ttbox}
 | 
|  |    946 | You should normally not add the definition permanently to the simpset
 | 
|  |    947 | using \texttt{Addsimps} because this defeats the whole purpose of an
 | 
|  |    948 | abbreviation.
 | 
|  |    949 | 
 | 
|  |    950 | \begin{warn}
 | 
|  |    951 | If you have defined $f\,x\,y$\texttt{~==~}$t$ then you can only expand
 | 
|  |    952 | occurrences of $f$ with at least two arguments. Thus it is safer to define
 | 
|  |    953 | $f$\texttt{~==~\%$x\,y$.}$\;t$.
 | 
|  |    954 | \end{warn}
 | 
|  |    955 | 
 | 
|  |    956 | \subsubsection{Simplifying \texttt{let}-expressions}
 | 
|  |    957 | 
 | 
|  |    958 | Proving a goal containing \ttindex{let}-expressions invariably requires the
 | 
|  |    959 | \texttt{let}-constructs to be expanded at some point. Since
 | 
|  |    960 | \texttt{let}-\texttt{in} is just syntactic sugar for a defined constant
 | 
|  |    961 | (called \texttt{Let}), expanding \texttt{let}-constructs means rewriting with
 | 
|  |    962 | \texttt{Let_def}:
 | 
|  |    963 | %context List.thy;
 | 
|  |    964 | %Goal "(let xs = [] in xs@xs) = ys";
 | 
|  |    965 | \begin{ttbox}\makeatother
 | 
|  |    966 | {\out  1. (let xs = [] in xs @ xs) = ys}
 | 
|  |    967 | by(simp_tac (simpset() addsimps [Let_def]) 1);
 | 
|  |    968 | {\out  1. [] = ys}
 | 
|  |    969 | \end{ttbox}
 | 
|  |    970 | If, in a particular context, there is no danger of a combinatorial explosion
 | 
|  |    971 | of nested \texttt{let}s one could even add \texttt{Let_def} permanently via
 | 
|  |    972 | \texttt{Addsimps}.
 | 
|  |    973 | 
 | 
|  |    974 | \subsubsection{Conditional equations}
 | 
|  |    975 | 
 | 
|  |    976 | So far all examples of rewrite rules were equations. The simplifier also
 | 
|  |    977 | accepts {\em conditional\/} equations, for example
 | 
|  |    978 | \begin{ttbox}
 | 
|  |    979 | xs ~= []  ==>  hd xs # tl xs = xs \hfill \((*)\)
 | 
|  |    980 | \end{ttbox}
 | 
|  |    981 | (which is proved by \texttt{exhaust_tac} on \texttt{xs} followed by
 | 
|  |    982 | \texttt{Asm_full_simp_tac} twice). Assuming that this theorem together with
 | 
|  |    983 | %\begin{ttbox}\makeatother
 | 
|  |    984 | \texttt{(rev xs = []) = (xs = [])}
 | 
|  |    985 | %\end{ttbox}
 | 
|  |    986 | are part of the simpset, the subgoal
 | 
|  |    987 | \begin{ttbox}\makeatother
 | 
|  |    988 | {\out 1. xs ~= [] ==> hd(rev xs) # tl(rev xs) = rev xs}
 | 
|  |    989 | \end{ttbox}
 | 
|  |    990 | is proved by simplification:
 | 
|  |    991 | the conditional equation $(*)$ above
 | 
|  |    992 | can simplify \texttt{hd(rev~xs)~\#~tl(rev~xs)} to \texttt{rev xs}
 | 
|  |    993 | because the corresponding precondition \verb$rev xs ~= []$ simplifies to
 | 
|  |    994 | \verb$xs ~= []$, which is exactly the local assumption of the subgoal.
 | 
|  |    995 | 
 | 
|  |    996 | 
 | 
|  |    997 | \subsubsection{Automatic case splits}
 | 
|  |    998 | 
 | 
|  |    999 | Goals containing \ttindex{if}-expressions are usually proved by case
 | 
|  |   1000 | distinction on the condition of the \texttt{if}. For example the goal
 | 
|  |   1001 | \begin{ttbox}
 | 
|  |   1002 | {\out 1. ! xs. if xs = [] then rev xs = [] else rev xs ~= []}
 | 
|  |   1003 | \end{ttbox}
 | 
|  |   1004 | can be split into
 | 
|  |   1005 | \begin{ttbox}
 | 
|  |   1006 | {\out 1. ! xs. (xs = [] --> rev xs = []) \& (xs ~= [] --> rev xs ~= [])}
 | 
|  |   1007 | \end{ttbox}
 | 
|  |   1008 | by typing
 | 
|  |   1009 | \begin{ttbox}
 | 
|  |   1010 | \input{Misc/splitif.ML}\end{ttbox}
 | 
|  |   1011 | Because this is almost always the right proof strategy, the simplifier
 | 
|  |   1012 | performs case-splitting on \texttt{if}s automatically. Try \texttt{Simp_tac}
 | 
|  |   1013 | on the initial goal above.
 | 
|  |   1014 | 
 | 
|  |   1015 | This splitting idea generalizes from \texttt{if} to \ttindex{case}:
 | 
|  |   1016 | \begin{ttbox}\makeatother
 | 
|  |   1017 | {\out 1. (case xs of [] => zs | y#ys => y#(ys@zs)) = xs@zs}
 | 
|  |   1018 | \end{ttbox}
 | 
|  |   1019 | becomes
 | 
|  |   1020 | \begin{ttbox}\makeatother
 | 
|  |   1021 | {\out 1. (xs = [] --> zs = xs @ zs) &}
 | 
|  |   1022 | {\out    (! a list. xs = a # list --> a # list @ zs = xs @ zs)}
 | 
|  |   1023 | \end{ttbox}
 | 
|  |   1024 | by typing
 | 
|  |   1025 | \begin{ttbox}
 | 
|  |   1026 | \input{Misc/splitlist.ML}\end{ttbox}
 | 
|  |   1027 | In contrast to \texttt{if}-expressions, the simplifier does not split
 | 
|  |   1028 | \texttt{case}-expressions by default because this can lead to nontermination
 | 
|  |   1029 | in case of recursive datatypes.
 | 
|  |   1030 | Nevertheless the simplifier can be instructed to perform \texttt{case}-splits
 | 
|  |   1031 | by adding the appropriate rule to the simpset:
 | 
|  |   1032 | \begin{ttbox}
 | 
|  |   1033 | by(simp_tac (simpset() addsplits [split_list_case]) 1);
 | 
|  |   1034 | \end{ttbox}\indexbold{*addsplits}
 | 
|  |   1035 | solves the initial goal outright, which \texttt{Simp_tac} alone will not do.
 | 
|  |   1036 | 
 | 
|  |   1037 | In general, every datatype $t$ comes with a rule
 | 
|  |   1038 | \texttt{$t$.split} that can be added to the simpset either
 | 
|  |   1039 | locally via \texttt{addsplits} (see above), or permanently via
 | 
|  |   1040 | \begin{ttbox}
 | 
|  |   1041 | Addsplits [\(t\).split];
 | 
|  |   1042 | \end{ttbox}\indexbold{*Addsplits}
 | 
|  |   1043 | Split-rules can be removed globally via \ttindexbold{Delsplits} and locally
 | 
|  |   1044 | via \ttindexbold{delsplits} as, for example, in
 | 
|  |   1045 | \begin{ttbox}
 | 
|  |   1046 | by(simp_tac (simpset() addsimps [\(\dots\)] delsplits [split_if]) 1);
 | 
|  |   1047 | \end{ttbox}
 | 
|  |   1048 | 
 | 
|  |   1049 | 
 | 
| 6577 |   1050 | \subsubsection{Arithmetic}
 | 
|  |   1051 | \index{arithmetic}
 | 
|  |   1052 | 
 | 
|  |   1053 | The simplifier routinely solves a small class of linear arithmetic formulae
 | 
|  |   1054 | (over types \texttt{nat} and \texttt{int}): it only takes into account
 | 
|  |   1055 | assumptions and conclusions that are (possibly negated) (in)equalities
 | 
|  |   1056 | (\texttt{=}, \texttt{<=}, \texttt{<}) and it only knows about addition. Thus
 | 
|  |   1057 | \begin{ttbox}
 | 
|  |   1058 | \input{Misc/arith1.ML}\end{ttbox}
 | 
|  |   1059 | is proved by simplification, whereas the only slightly more complex
 | 
|  |   1060 | \begin{ttbox}
 | 
|  |   1061 | \input{Misc/arith3.ML}\end{ttbox}
 | 
|  |   1062 | is not proved by simplification and requires \texttt{arith_tac}.
 | 
|  |   1063 | 
 | 
| 5375 |   1064 | \subsubsection{Permutative rewrite rules}
 | 
|  |   1065 | 
 | 
|  |   1066 | A rewrite rule is {\bf permutative} if the left-hand side and right-hand side
 | 
|  |   1067 | are the same up to renaming of variables.  The most common permutative rule
 | 
|  |   1068 | is commutativity: $x+y = y+x$.  Another example is $(x-y)-z = (x-z)-y$.  Such
 | 
|  |   1069 | rules are problematic because once they apply, they can be used forever.
 | 
|  |   1070 | The simplifier is aware of this danger and treats permutative rules
 | 
| 6606 |   1071 | separately. For details see~\cite{isabelle-ref}.
 | 
| 5375 |   1072 | 
 | 
|  |   1073 | \subsubsection{Tracing}
 | 
|  |   1074 | \indexbold{tracing the simplifier}
 | 
|  |   1075 | 
 | 
|  |   1076 | Using the simplifier effectively may take a bit of experimentation.  Set the
 | 
|  |   1077 | \verb$trace_simp$ flag to get a better idea of what is going on:
 | 
|  |   1078 | \begin{ttbox}\makeatother
 | 
|  |   1079 | {\out  1. rev [x] = []}
 | 
|  |   1080 | \ttbreak
 | 
|  |   1081 | set trace_simp;
 | 
|  |   1082 | by(Simp_tac 1);
 | 
|  |   1083 | \ttbreak\makeatother
 | 
|  |   1084 | {\out Applying instance of rewrite rule:}
 | 
|  |   1085 | {\out rev (?x # ?xs) == rev ?xs @ [?x]}
 | 
|  |   1086 | {\out Rewriting:}
 | 
|  |   1087 | {\out rev [x] == rev [] @ [x]}
 | 
|  |   1088 | \ttbreak
 | 
|  |   1089 | {\out Applying instance of rewrite rule:}
 | 
|  |   1090 | {\out rev [] == []}
 | 
|  |   1091 | {\out Rewriting:}
 | 
|  |   1092 | {\out rev [] == []}
 | 
|  |   1093 | \ttbreak\makeatother
 | 
|  |   1094 | {\out Applying instance of rewrite rule:}
 | 
|  |   1095 | {\out [] @ ?y == ?y}
 | 
|  |   1096 | {\out Rewriting:}
 | 
|  |   1097 | {\out [] @ [x] == [x]}
 | 
|  |   1098 | \ttbreak
 | 
|  |   1099 | {\out Applying instance of rewrite rule:}
 | 
|  |   1100 | {\out ?x # ?t = ?t == False}
 | 
|  |   1101 | {\out Rewriting:}
 | 
|  |   1102 | {\out [x] = [] == False}
 | 
|  |   1103 | \ttbreak
 | 
|  |   1104 | {\out Level 1}
 | 
|  |   1105 | {\out rev [x] = []}
 | 
|  |   1106 | {\out  1. False}
 | 
|  |   1107 | \end{ttbox}
 | 
|  |   1108 | In more complicated cases, the trace can be enormous, especially since
 | 
|  |   1109 | invocations of the simplifier are often nested (e.g.\ when solving conditions
 | 
|  |   1110 | of rewrite rules).
 | 
|  |   1111 | 
 | 
|  |   1112 | \subsection{How it works}
 | 
|  |   1113 | \label{sec:SimpHow}
 | 
|  |   1114 | 
 | 
|  |   1115 | \subsubsection{Higher-order patterns}
 | 
|  |   1116 | 
 | 
|  |   1117 | \subsubsection{Local assumptions}
 | 
|  |   1118 | 
 | 
|  |   1119 | \subsubsection{The preprocessor}
 | 
|  |   1120 | 
 | 
|  |   1121 | \section{Induction heuristics}
 | 
|  |   1122 | \label{sec:InductionHeuristics}
 | 
|  |   1123 | 
 | 
|  |   1124 | The purpose of this section is to illustrate some simple heuristics for
 | 
|  |   1125 | inductive proofs. The first one we have already mentioned in our initial
 | 
|  |   1126 | example:
 | 
|  |   1127 | \begin{quote}
 | 
|  |   1128 | {\em 1. Theorems about recursive functions are proved by induction.}
 | 
|  |   1129 | \end{quote}
 | 
|  |   1130 | In case the function has more than one argument
 | 
|  |   1131 | \begin{quote}
 | 
|  |   1132 | {\em 2. Do induction on argument number $i$ if the function is defined by
 | 
|  |   1133 | recursion in argument number $i$.}
 | 
|  |   1134 | \end{quote}
 | 
|  |   1135 | When we look at the proof of
 | 
|  |   1136 | \begin{ttbox}\makeatother
 | 
|  |   1137 | (xs @ ys) @ zs = xs @ (ys @ zs)
 | 
|  |   1138 | \end{ttbox}
 | 
|  |   1139 | in \S\ref{sec:intro-proof} we find (a) \texttt{\at} is recursive in the first
 | 
|  |   1140 | argument, (b) \texttt{xs} occurs only as the first argument of \texttt{\at},
 | 
|  |   1141 | and (c) both \texttt{ys} and \texttt{zs} occur at least once as the second
 | 
|  |   1142 | argument of \texttt{\at}. Hence it is natural to perform induction on
 | 
|  |   1143 | \texttt{xs}.
 | 
|  |   1144 | 
 | 
|  |   1145 | The key heuristic, and the main point of this section, is to
 | 
|  |   1146 | generalize the goal before induction. The reason is simple: if the goal is
 | 
|  |   1147 | too specific, the induction hypothesis is too weak to allow the induction
 | 
|  |   1148 | step to go through. Let us now illustrate the idea with an example.
 | 
|  |   1149 | 
 | 
|  |   1150 | We define a tail-recursive version of list-reversal,
 | 
|  |   1151 | i.e.\ one that can be compiled into a loop:
 | 
|  |   1152 | \begin{ttbox}
 | 
|  |   1153 | \input{Misc/Itrev.thy}\end{ttbox}
 | 
|  |   1154 | The behaviour of \texttt{itrev} is simple: it reverses its first argument by
 | 
|  |   1155 | stacking its elements onto the second argument, and returning that second
 | 
|  |   1156 | argument when the first one becomes empty.
 | 
|  |   1157 | We need to show that \texttt{itrev} does indeed reverse its first argument
 | 
|  |   1158 | provided the second one is empty:
 | 
|  |   1159 | \begin{ttbox}
 | 
|  |   1160 | \input{Misc/itrev1.ML}\end{ttbox}
 | 
|  |   1161 | There is no choice as to the induction variable, and we immediately simplify:
 | 
|  |   1162 | \begin{ttbox}
 | 
|  |   1163 | \input{Misc/induct_auto.ML}\ttbreak\makeatother
 | 
|  |   1164 | {\out1. !!a list. itrev list [] = rev list\(\;\)==> itrev list [a] = rev list @ [a]}
 | 
|  |   1165 | \end{ttbox}
 | 
|  |   1166 | Just as predicted above, the overall goal, and hence the induction
 | 
|  |   1167 | hypothesis, is too weak to solve the induction step because of the fixed
 | 
|  |   1168 | \texttt{[]}. The corresponding heuristic:
 | 
|  |   1169 | \begin{quote}
 | 
|  |   1170 | {\em 3. Generalize goals for induction by replacing constants by variables.}
 | 
|  |   1171 | \end{quote}
 | 
|  |   1172 | Of course one cannot do this na\"{\i}vely: \texttt{itrev xs ys = rev xs} is
 | 
|  |   1173 | just not true --- the correct generalization is
 | 
|  |   1174 | \begin{ttbox}\makeatother
 | 
|  |   1175 | \input{Misc/itrev2.ML}\end{ttbox}
 | 
|  |   1176 | If \texttt{ys} is replaced by \texttt{[]}, the right-hand side simplifies to
 | 
|  |   1177 | \texttt{rev xs}, just as required.
 | 
|  |   1178 | 
 | 
|  |   1179 | In this particular instance it is easy to guess the right generalization,
 | 
|  |   1180 | but in more complex situations a good deal of creativity is needed. This is
 | 
|  |   1181 | the main source of complications in inductive proofs.
 | 
|  |   1182 | 
 | 
|  |   1183 | Although we now have two variables, only \texttt{xs} is suitable for
 | 
|  |   1184 | induction, and we repeat our above proof attempt. Unfortunately, we are still
 | 
|  |   1185 | not there:
 | 
|  |   1186 | \begin{ttbox}\makeatother
 | 
|  |   1187 | {\out 1. !!a list.}
 | 
|  |   1188 | {\out       itrev list ys = rev list @ ys}
 | 
|  |   1189 | {\out       ==> itrev list (a # ys) = rev list @ a # ys}
 | 
|  |   1190 | \end{ttbox}
 | 
|  |   1191 | The induction hypothesis is still too weak, but this time it takes no
 | 
|  |   1192 | intuition to generalize: the problem is that \texttt{ys} is fixed throughout
 | 
|  |   1193 | the subgoal, but the induction hypothesis needs to be applied with
 | 
|  |   1194 | \texttt{a \# ys} instead of \texttt{ys}. Hence we prove the theorem
 | 
|  |   1195 | for all \texttt{ys} instead of a fixed one:
 | 
|  |   1196 | \begin{ttbox}\makeatother
 | 
|  |   1197 | \input{Misc/itrev3.ML}\end{ttbox}
 | 
|  |   1198 | This time induction on \texttt{xs} followed by simplification succeeds. This
 | 
|  |   1199 | leads to another heuristic for generalization:
 | 
|  |   1200 | \begin{quote}
 | 
|  |   1201 | {\em 4. Generalize goals for induction by universally quantifying all free
 | 
|  |   1202 | variables {\em(except the induction variable itself!)}.}
 | 
|  |   1203 | \end{quote}
 | 
|  |   1204 | This prevents trivial failures like the above and does not change the
 | 
|  |   1205 | provability of the goal. Because it is not always required, and may even
 | 
|  |   1206 | complicate matters in some cases, this heuristic is often not
 | 
|  |   1207 | applied blindly.
 | 
|  |   1208 | 
 | 
|  |   1209 | A final point worth mentioning is the orientation of the equation we just
 | 
|  |   1210 | proved: the more complex notion (\texttt{itrev}) is on the left-hand
 | 
|  |   1211 | side, the simpler \texttt{rev} on the right-hand side. This constitutes
 | 
|  |   1212 | another, albeit weak heuristic that is not restricted to induction:
 | 
|  |   1213 | \begin{quote}
 | 
|  |   1214 |   {\em 5. The right-hand side of an equation should (in some sense) be
 | 
|  |   1215 |     simpler than the left-hand side.}
 | 
|  |   1216 | \end{quote}
 | 
|  |   1217 | The heuristic is tricky to apply because it is not obvious that
 | 
|  |   1218 | \texttt{rev xs \at\ ys} is simpler than \texttt{itrev xs ys}. But see what
 | 
|  |   1219 | happens if you try to prove the symmetric equation!
 | 
|  |   1220 | 
 | 
|  |   1221 | 
 | 
|  |   1222 | \section{Case study: compiling expressions}
 | 
|  |   1223 | \label{sec:ExprCompiler}
 | 
|  |   1224 | 
 | 
|  |   1225 | The task is to develop a compiler from a generic type of expressions (built
 | 
|  |   1226 | up from variables, constants and binary operations) to a stack machine.  This
 | 
|  |   1227 | generic type of expressions is a generalization of the boolean expressions in
 | 
|  |   1228 | \S\ref{sec:boolex}.  This time we do not commit ourselves to a particular
 | 
|  |   1229 | type of variables or values but make them type parameters.  Neither is there
 | 
|  |   1230 | a fixed set of binary operations: instead the expression contains the
 | 
|  |   1231 | appropriate function itself.
 | 
|  |   1232 | \begin{ttbox}
 | 
|  |   1233 | \input{CodeGen/expr}\end{ttbox}
 | 
|  |   1234 | The three constructors represent constants, variables and the combination of
 | 
|  |   1235 | two subexpressions with a binary operation.
 | 
|  |   1236 | 
 | 
|  |   1237 | The value of an expression w.r.t.\ an environment that maps variables to
 | 
|  |   1238 | values is easily defined:
 | 
|  |   1239 | \begin{ttbox}
 | 
|  |   1240 | \input{CodeGen/value}\end{ttbox}
 | 
|  |   1241 | 
 | 
|  |   1242 | The stack machine has three instructions: load a constant value onto the
 | 
|  |   1243 | stack, load the contents of a certain address onto the stack, and apply a
 | 
|  |   1244 | binary operation to the two topmost elements of the stack, replacing them by
 | 
|  |   1245 | the result. As for \texttt{expr}, addresses and values are type parameters:
 | 
|  |   1246 | \begin{ttbox}
 | 
|  |   1247 | \input{CodeGen/instr}\end{ttbox}
 | 
|  |   1248 | 
 | 
|  |   1249 | The execution of the stack machine is modelled by a function \texttt{exec}
 | 
|  |   1250 | that takes a store (modelled as a function from addresses to values, just
 | 
|  |   1251 | like the environment for evaluating expressions), a stack (modelled as a
 | 
|  |   1252 | list) of values and a list of instructions and returns the stack at the end
 | 
|  |   1253 | of the execution --- the store remains unchanged:
 | 
|  |   1254 | \begin{ttbox}
 | 
|  |   1255 | \input{CodeGen/exec}\end{ttbox}
 | 
|  |   1256 | Recall that \texttt{hd} and \texttt{tl}
 | 
|  |   1257 | return the first element and the remainder of a list.
 | 
|  |   1258 | 
 | 
|  |   1259 | Because all functions are total, \texttt{hd} is defined even for the empty
 | 
|  |   1260 | list, although we do not know what the result is. Thus our model of the
 | 
|  |   1261 | machine always terminates properly, although the above definition does not
 | 
|  |   1262 | tell us much about the result in situations where \texttt{Apply} was executed
 | 
|  |   1263 | with fewer than two elements on the stack.
 | 
|  |   1264 | 
 | 
|  |   1265 | The compiler is a function from expressions to a list of instructions. Its
 | 
|  |   1266 | definition is pretty much obvious:
 | 
|  |   1267 | \begin{ttbox}\makeatother
 | 
|  |   1268 | \input{CodeGen/comp}\end{ttbox}
 | 
|  |   1269 | 
 | 
|  |   1270 | Now we have to prove the correctness of the compiler, i.e.\ that the
 | 
|  |   1271 | execution of a compiled expression results in the value of the expression:
 | 
|  |   1272 | \begin{ttbox}
 | 
|  |   1273 | exec s [] (comp e) = [value s e]
 | 
|  |   1274 | \end{ttbox}
 | 
|  |   1275 | This is generalized to
 | 
|  |   1276 | \begin{ttbox}
 | 
|  |   1277 | \input{CodeGen/goal2.ML}\end{ttbox}
 | 
|  |   1278 | and proved by induction on \texttt{e} followed by simplification, once we
 | 
|  |   1279 | have the following lemma about executing the concatenation of two instruction
 | 
|  |   1280 | sequences:
 | 
|  |   1281 | \begin{ttbox}\makeatother
 | 
| 6099 |   1282 | \input{CodeGen/goal1.ML}\end{ttbox}
 | 
| 5375 |   1283 | This requires induction on \texttt{xs} and ordinary simplification for the
 | 
|  |   1284 | base cases. In the induction step, simplification leaves us with a formula
 | 
|  |   1285 | that contains two \texttt{case}-expressions over instructions. Thus we add
 | 
|  |   1286 | automatic case splitting as well, which finishes the proof:
 | 
|  |   1287 | \begin{ttbox}
 | 
|  |   1288 | \input{CodeGen/simpsplit.ML}\end{ttbox}
 | 
|  |   1289 | 
 | 
|  |   1290 | We could now go back and prove \texttt{exec s [] (comp e) = [value s e]}
 | 
|  |   1291 | merely by simplification with the generalized version we just proved.
 | 
|  |   1292 | However, this is unnecessary because the generalized version fully subsumes
 | 
|  |   1293 | its instance.
 | 
|  |   1294 | 
 | 
| 5850 |   1295 | 
 | 
|  |   1296 | \section{Advanced datatypes}
 | 
|  |   1297 | \index{*datatype|(}
 | 
|  |   1298 | \index{*primrec|(}
 | 
|  |   1299 | 
 | 
| 6099 |   1300 | This section presents advanced forms of \texttt{datatype}s and (in the near
 | 
|  |   1301 | future!) records.
 | 
|  |   1302 | 
 | 
| 5850 |   1303 | \subsection{Mutual recursion}
 | 
|  |   1304 | 
 | 
|  |   1305 | Sometimes it is necessary to define two datatypes that depend on each
 | 
|  |   1306 | other. This is called \textbf{mutual recursion}. As an example consider a
 | 
|  |   1307 | language of arithmetic and boolean expressions where
 | 
|  |   1308 | \begin{itemize}
 | 
|  |   1309 | \item arithmetic expressions contain boolean expressions because there are
 | 
|  |   1310 |   conditional arithmetic expressions like ``if $m<n$ then $n-m$ else $m-n$'',
 | 
|  |   1311 |   and
 | 
|  |   1312 | \item boolean expressions contain arithmetic expressions because of
 | 
|  |   1313 |   comparisons like ``$m<n$''.
 | 
|  |   1314 | \end{itemize}
 | 
|  |   1315 | In Isabelle this becomes
 | 
|  |   1316 | \begin{ttbox}
 | 
|  |   1317 | \input{Datatype/abdata}\end{ttbox}\indexbold{*and}
 | 
|  |   1318 | Type \texttt{aexp} is similar to \texttt{expr} in \S\ref{sec:ExprCompiler},
 | 
|  |   1319 | except that we have fixed the values to be of type \texttt{nat} and that we
 | 
|  |   1320 | have fixed the two binary operations \texttt{Sum} and \texttt{Diff}. Boolean
 | 
|  |   1321 | expressions can be arithmetic comparisons, conjunctions and negations.
 | 
|  |   1322 | The semantics is fixed via two evaluation functions
 | 
|  |   1323 | \begin{ttbox}
 | 
|  |   1324 | \input{Datatype/abconstseval}\end{ttbox}
 | 
|  |   1325 | that take an environment (a mapping from variables \texttt{'a} to values
 | 
|  |   1326 | \texttt{nat}) and an expression and return its arithmetic/boolean
 | 
|  |   1327 | value. Since the datatypes are mutually recursive, so are functions that
 | 
|  |   1328 | operate on them. Hence they need to be defined in a single \texttt{primrec}
 | 
|  |   1329 | section:
 | 
|  |   1330 | \begin{ttbox}
 | 
|  |   1331 | \input{Datatype/abevala}
 | 
|  |   1332 | \input{Datatype/abevalb}\end{ttbox}
 | 
|  |   1333 | 
 | 
|  |   1334 | %In general, given $n$ mutually recursive datatypes, whenever you define a
 | 
|  |   1335 | %\texttt{primrec} function on one of them, Isabelle expects you to define $n$
 | 
|  |   1336 | %(possibly mutually recursive) functions simultaneously.
 | 
|  |   1337 | 
 | 
|  |   1338 | In the same fashion we also define two functions that perform substitution:
 | 
|  |   1339 | \begin{ttbox}
 | 
|  |   1340 | \input{Datatype/abconstssubst}\end{ttbox}
 | 
|  |   1341 | The first argument is a function mapping variables to expressions, the
 | 
|  |   1342 | substitution. It is applied to all variables in the second argument. As a
 | 
|  |   1343 | result, the type of variables in the expression may change from \texttt{'a}
 | 
|  |   1344 | to \texttt{'b}. Note that there are only arithmetic and no boolean variables.
 | 
|  |   1345 | \begin{ttbox}
 | 
|  |   1346 | \input{Datatype/absubsta}
 | 
|  |   1347 | \input{Datatype/absubstb}\end{ttbox}
 | 
|  |   1348 | 
 | 
|  |   1349 | Now we can prove a fundamental theorem about the interaction between
 | 
|  |   1350 | evaluation and substitution: applying a substitution $s$ to an expression $a$
 | 
|  |   1351 | and evaluating the result in an environment $env$ yields the same result as
 | 
|  |   1352 | evaluation $a$ in the environment that maps every variable $x$ to the value
 | 
|  |   1353 | of $s(x)$ under $env$. If you try to prove this separately for arithmetic or
 | 
|  |   1354 | boolean expressions (by induction), you find that you always need the other
 | 
|  |   1355 | theorem in the induction step. Therefore you need to state and prove both
 | 
|  |   1356 | theorems simultaneously:
 | 
|  |   1357 | \begin{quote}\small
 | 
|  |   1358 | \verbatiminput{Datatype/abgoalind.ML}
 | 
|  |   1359 | \end{quote}\indexbold{*mutual_induct_tac}
 | 
|  |   1360 | The resulting 8 goals (one for each constructor) are proved in one fell swoop
 | 
|  |   1361 | \texttt{by Auto_tac;}.
 | 
|  |   1362 | 
 | 
|  |   1363 | In general, given $n$ mutually recursive datatypes $\tau@1$, \dots, $\tau@n$,
 | 
|  |   1364 | Isabelle expects an inductive proof to start with a goal of the form
 | 
|  |   1365 | \[ P@1(x@1)\ \texttt{\&}\ \dots\ \texttt{\&}\ P@n(x@n) \]
 | 
|  |   1366 | where each variable $x@i$ is of type $\tau@i$. Induction is started by
 | 
|  |   1367 | \begin{ttbox}
 | 
|  |   1368 | by(mutual_induct_tac ["\(x@1\)",\(\dots\),"\(x@n\)"] \(k\));
 | 
|  |   1369 | \end{ttbox}
 | 
|  |   1370 | 
 | 
|  |   1371 | \begin{exercise}
 | 
|  |   1372 |   Define a function \texttt{norma} of type \texttt{'a aexp => 'a aexp} that
 | 
|  |   1373 |   replaces \texttt{IF}s with complex boolean conditions by nested
 | 
|  |   1374 |   \texttt{IF}s where each condition is a \texttt{Less} --- \texttt{And} and
 | 
|  |   1375 |   \texttt{Neg} should be eliminated completely. Prove that \texttt{norma}
 | 
|  |   1376 |   preserves the value of an expression and that the result of \texttt{norma}
 | 
|  |   1377 |   is really normal, i.e.\ no more \texttt{And}s and \texttt{Neg}s occur in
 | 
|  |   1378 |   it.  ({\em Hint:} proceed as in \S\ref{sec:boolex}).
 | 
|  |   1379 | \end{exercise}
 | 
|  |   1380 | 
 | 
|  |   1381 | \subsection{Nested recursion}
 | 
|  |   1382 | 
 | 
|  |   1383 | So far, all datatypes had the property that on the right-hand side of their
 | 
|  |   1384 | definition they occurred only at the top-level, i.e.\ directly below a
 | 
|  |   1385 | constructor. This is not the case any longer for the following model of terms
 | 
|  |   1386 | where function symbols can be applied to a list of arguments:
 | 
|  |   1387 | \begin{ttbox}
 | 
|  |   1388 | \input{Datatype/tdata}\end{ttbox}
 | 
|  |   1389 | Parameter \texttt{'a} is the type of variables and \texttt{'b} the type of
 | 
|  |   1390 | function symbols.
 | 
|  |   1391 | A mathematical term like $f(x,g(y))$ becomes \texttt{App f [Var x, App g
 | 
|  |   1392 |   [Var y]]}, where \texttt{f}, \texttt{g}, \texttt{x}, \texttt{y} are
 | 
|  |   1393 | suitable values, e.g.\ numbers or strings.
 | 
|  |   1394 | 
 | 
|  |   1395 | What complicates the definition of \texttt{term} is the nested occurrence of
 | 
|  |   1396 | \texttt{term} inside \texttt{list} on the right-hand side. In principle,
 | 
|  |   1397 | nested recursion can be eliminated in favour of mutual recursion by unfolding
 | 
|  |   1398 | the offending datatypes, here \texttt{list}. The result for \texttt{term}
 | 
|  |   1399 | would be something like
 | 
|  |   1400 | \begin{ttbox}
 | 
|  |   1401 | \input{Datatype/tunfoldeddata}\end{ttbox}
 | 
|  |   1402 | Although we do not recommend this unfolding to the user, this is how Isabelle
 | 
|  |   1403 | deals with nested recursion internally. Strictly speaking, this information
 | 
|  |   1404 | is irrelevant at the user level (and might change in the future), but it
 | 
| 6099 |   1405 | motivates why \texttt{primrec} and induction work for nested types the way
 | 
| 5850 |   1406 | they do. We now return to the initial definition of \texttt{term} using
 | 
|  |   1407 | nested recursion.
 | 
|  |   1408 | 
 | 
|  |   1409 | Let us define a substitution function on terms. Because terms involve term
 | 
|  |   1410 | lists, we need to define two substitution functions simultaneously:
 | 
|  |   1411 | \begin{ttbox}
 | 
|  |   1412 | \input{Datatype/tconstssubst}
 | 
|  |   1413 | \input{Datatype/tsubst}
 | 
|  |   1414 | \input{Datatype/tsubsts}\end{ttbox}
 | 
|  |   1415 | Similarly, when proving a statement about terms inductively, we need
 | 
|  |   1416 | to prove a related statement about term lists simultaneously. For example,
 | 
|  |   1417 | the fact that the identity substitution does not change a term needs to be
 | 
|  |   1418 | strengthened and proved as follows:
 | 
|  |   1419 | \begin{quote}\small
 | 
|  |   1420 | \verbatiminput{Datatype/tidproof.ML}
 | 
|  |   1421 | \end{quote}
 | 
|  |   1422 | Note that \texttt{Var} is the identity substitution because by definition it
 | 
|  |   1423 | leaves variables unchanged: \texttt{subst Var (Var $x$) = Var $x$}. Note also
 | 
|  |   1424 | that the type annotations are necessary because otherwise there is nothing in
 | 
| 6099 |   1425 | the goal to enforce that both halves of the goal talk about the same type
 | 
| 5850 |   1426 | parameters \texttt{('a,'b)}. As a result, induction would fail
 | 
| 6099 |   1427 | because the two halves of the goal would be unrelated.
 | 
| 5850 |   1428 | 
 | 
|  |   1429 | \begin{exercise}
 | 
| 6099 |   1430 | The fact that substitution distributes over composition can be expressed
 | 
|  |   1431 | roughly as follows:
 | 
| 5850 |   1432 | \begin{ttbox}
 | 
|  |   1433 | subst (f o g) t = subst f (subst g t)
 | 
|  |   1434 | \end{ttbox}
 | 
|  |   1435 | Correct this statement (you will find that it does not type-check),
 | 
|  |   1436 | strengthen it and prove it. (Note: \texttt{o} is function composition; its
 | 
|  |   1437 | definition is found in the theorem \texttt{o_def}).
 | 
|  |   1438 | \end{exercise}
 | 
|  |   1439 | 
 | 
|  |   1440 | If you feel that the \texttt{App}-equation in the definition of substitution
 | 
|  |   1441 | is overly complicated, you are right: the simpler
 | 
|  |   1442 | \begin{ttbox}
 | 
|  |   1443 | \input{Datatype/appmap}\end{ttbox}
 | 
|  |   1444 | would have done the job. Unfortunately, Isabelle insists on the more verbose
 | 
|  |   1445 | equation given above. Nevertheless, we can easily {\em prove} that the
 | 
|  |   1446 | \texttt{map}-equation holds: simply by induction on \texttt{ts} followed by
 | 
|  |   1447 | \texttt{Auto_tac}.
 | 
|  |   1448 | 
 | 
|  |   1449 | %FIXME: forward pointer to section where better induction principles are
 | 
|  |   1450 | %derived?
 | 
|  |   1451 | 
 | 
|  |   1452 | \begin{exercise}
 | 
|  |   1453 |   Define a function \texttt{rev_term} of type \texttt{term -> term} that
 | 
|  |   1454 |   recursively reverses the order of arguments of all function symbols in a
 | 
|  |   1455 |   term. Prove that \texttt{rev_term(rev_term t) = t}.
 | 
|  |   1456 | \end{exercise}
 | 
|  |   1457 | 
 | 
|  |   1458 | Of course, you may also combine mutual and nested recursion as in the
 | 
|  |   1459 | following example
 | 
|  |   1460 | \begin{ttbox}
 | 
|  |   1461 | \input{Datatype/mutnested}\end{ttbox}
 | 
|  |   1462 | taken from an operational semantics of applied lambda terms. Note that double
 | 
|  |   1463 | quotes are required around the type involving \texttt{*}, as explained on
 | 
|  |   1464 | page~\pageref{startype}.
 | 
|  |   1465 | 
 | 
|  |   1466 | \subsection{The limits of nested recursion}
 | 
|  |   1467 | 
 | 
|  |   1468 | How far can we push nested recursion? By the unfolding argument above, we can
 | 
|  |   1469 | reduce nested to mutual recursion provided the nested recursion only involves
 | 
|  |   1470 | previously defined datatypes. Isabelle is a bit more generous and also permits
 | 
|  |   1471 | products as in the \texttt{data} example above.
 | 
|  |   1472 | However, functions are most emphatically not allowed:
 | 
|  |   1473 | \begin{ttbox}
 | 
|  |   1474 | datatype t = C (t => bool)
 | 
|  |   1475 | \end{ttbox}
 | 
|  |   1476 | is a real can of worms: in HOL it must be ruled out because it requires a
 | 
|  |   1477 | type \texttt{t} such that \texttt{t} and its power set \texttt{t => bool}
 | 
|  |   1478 | have the same cardinality---an impossibility.
 | 
|  |   1479 | In theory, we could allow limited forms of recursion involving function
 | 
|  |   1480 | spaces. For example
 | 
|  |   1481 | \begin{ttbox}
 | 
|  |   1482 | datatype t = C (nat => t) | D
 | 
|  |   1483 | \end{ttbox}
 | 
|  |   1484 | is unproblematic. However, Isabelle does not support recursion involving
 | 
|  |   1485 | \texttt{=>} at all at the moment.
 | 
|  |   1486 | 
 | 
|  |   1487 | For a theoretical analysis of what kinds of datatypes are feasible in HOL
 | 
|  |   1488 | see, for example,~\cite{Gunter-HOL92}. There are alternatives to pure HOL:
 | 
| 6606 |   1489 | LCF~\cite{paulson87} is a logic where types like
 | 
| 5850 |   1490 | \begin{ttbox}
 | 
|  |   1491 | datatype t = C (t -> t)
 | 
|  |   1492 | \end{ttbox}
 | 
| 6099 |   1493 | do indeed make sense (note the intentionally different arrow \texttt{->}!).
 | 
| 5850 |   1494 | There is even a version of LCF on top of HOL, called
 | 
| 6606 |   1495 | HOLCF~\cite{MuellerNvOS99}.
 | 
| 5850 |   1496 | 
 | 
|  |   1497 | \index{*primrec|)}
 | 
|  |   1498 | \index{*datatype|)}
 | 
|  |   1499 | 
 | 
|  |   1500 | \subsection{Case study: Tries}
 | 
|  |   1501 | 
 | 
|  |   1502 | Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
 | 
|  |   1503 | indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
 | 
|  |   1504 | trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
 | 
|  |   1505 | ``cat''.  When searching a string in a trie, the letters of the string are
 | 
|  |   1506 | examined sequentially. Each letter determines which subtrie to search next.
 | 
|  |   1507 | In this case study we model tries as a datatype, define a lookup and an
 | 
|  |   1508 | update function, and prove that they behave as expected.
 | 
|  |   1509 | 
 | 
|  |   1510 | \begin{figure}[htbp]
 | 
|  |   1511 | \begin{center}
 | 
|  |   1512 | \unitlength1mm
 | 
|  |   1513 | \begin{picture}(60,30)
 | 
|  |   1514 | \put( 5, 0){\makebox(0,0)[b]{l}}
 | 
|  |   1515 | \put(25, 0){\makebox(0,0)[b]{e}}
 | 
|  |   1516 | \put(35, 0){\makebox(0,0)[b]{n}}
 | 
|  |   1517 | \put(45, 0){\makebox(0,0)[b]{r}}
 | 
|  |   1518 | \put(55, 0){\makebox(0,0)[b]{t}}
 | 
|  |   1519 | %
 | 
|  |   1520 | \put( 5, 9){\line(0,-1){5}}
 | 
|  |   1521 | \put(25, 9){\line(0,-1){5}}
 | 
|  |   1522 | \put(44, 9){\line(-3,-2){9}}
 | 
|  |   1523 | \put(45, 9){\line(0,-1){5}}
 | 
|  |   1524 | \put(46, 9){\line(3,-2){9}}
 | 
|  |   1525 | %
 | 
|  |   1526 | \put( 5,10){\makebox(0,0)[b]{l}}
 | 
|  |   1527 | \put(15,10){\makebox(0,0)[b]{n}}
 | 
|  |   1528 | \put(25,10){\makebox(0,0)[b]{p}}
 | 
|  |   1529 | \put(45,10){\makebox(0,0)[b]{a}}
 | 
|  |   1530 | %
 | 
|  |   1531 | \put(14,19){\line(-3,-2){9}}
 | 
|  |   1532 | \put(15,19){\line(0,-1){5}}
 | 
|  |   1533 | \put(16,19){\line(3,-2){9}}
 | 
|  |   1534 | \put(45,19){\line(0,-1){5}}
 | 
|  |   1535 | %
 | 
|  |   1536 | \put(15,20){\makebox(0,0)[b]{a}}
 | 
|  |   1537 | \put(45,20){\makebox(0,0)[b]{c}}
 | 
|  |   1538 | %
 | 
|  |   1539 | \put(30,30){\line(-3,-2){13}}
 | 
|  |   1540 | \put(30,30){\line(3,-2){13}}
 | 
|  |   1541 | \end{picture}
 | 
|  |   1542 | \end{center}
 | 
|  |   1543 | \caption{A sample trie}
 | 
|  |   1544 | \label{fig:trie}
 | 
|  |   1545 | \end{figure}
 | 
|  |   1546 | 
 | 
| 6099 |   1547 | Proper tries associate some value with each string. Since the
 | 
| 5850 |   1548 | information is stored only in the final node associated with the string, many
 | 
|  |   1549 | nodes do not carry any value. This distinction is captured by the
 | 
|  |   1550 | following predefined datatype:
 | 
|  |   1551 | \begin{ttbox}
 | 
|  |   1552 | datatype 'a option = None | Some 'a
 | 
| 6099 |   1553 | \end{ttbox}\indexbold{*option}\indexbold{*None}\indexbold{*Some}
 | 
| 5850 |   1554 | 
 | 
|  |   1555 | To minimize running time, each node of a trie should contain an array that maps
 | 
| 6099 |   1556 | letters to subtries. We have chosen a more space efficient representation
 | 
| 5850 |   1557 | where the subtries are held in an association list, i.e.\ a list of
 | 
|  |   1558 | (letter,trie) pairs.  Abstracting over the alphabet \texttt{'a} and the
 | 
|  |   1559 | values \texttt{'v} we define a trie as follows:
 | 
|  |   1560 | \begin{ttbox}
 | 
|  |   1561 | \input{Datatype/trie}\end{ttbox}
 | 
|  |   1562 | The first component is the optional value, the second component the
 | 
|  |   1563 | association list of subtries. We define two corresponding selector functions:
 | 
|  |   1564 | \begin{ttbox}
 | 
|  |   1565 | \input{Datatype/triesels}\end{ttbox}
 | 
|  |   1566 | Association lists come with a generic lookup function:
 | 
|  |   1567 | \begin{ttbox}
 | 
|  |   1568 | \input{Datatype/assoc}\end{ttbox}
 | 
|  |   1569 | 
 | 
|  |   1570 | Now we can define the lookup function for tries. It descends into the trie
 | 
|  |   1571 | examining the letters of the search string one by one. As
 | 
|  |   1572 | recursion on lists is simpler than on tries, let us express this as primitive
 | 
|  |   1573 | recursion on the search string argument:
 | 
|  |   1574 | \begin{ttbox}
 | 
|  |   1575 | \input{Datatype/lookup}\end{ttbox}
 | 
|  |   1576 | As a first simple property we prove that looking up a string in the empty
 | 
|  |   1577 | trie \texttt{Trie~None~[]} always returns \texttt{None}. The proof merely
 | 
|  |   1578 | distinguishes the two cases whether the search string is empty or not:
 | 
|  |   1579 | \begin{ttbox}
 | 
|  |   1580 | \input{Datatype/lookupempty.ML}\end{ttbox}
 | 
|  |   1581 | This lemma is added to the simpset as usual.
 | 
|  |   1582 | 
 | 
|  |   1583 | Things begin to get interesting with the definition of an update function
 | 
|  |   1584 | that adds a new (string,value) pair to a trie, overwriting the old value
 | 
|  |   1585 | associated with that string:
 | 
|  |   1586 | \begin{ttbox}
 | 
|  |   1587 | \input{Datatype/update}\end{ttbox}
 | 
|  |   1588 | The base case is obvious. In the recursive case the subtrie
 | 
|  |   1589 | \texttt{tt} associated with the first letter \texttt{a} is extracted,
 | 
|  |   1590 | recursively updated, and then placed in front of the association list.
 | 
|  |   1591 | The old subtrie associated with \texttt{a} is still in the association list
 | 
|  |   1592 | but no longer accessible via \texttt{assoc}. Clearly, there is room here for
 | 
|  |   1593 | optimizations!
 | 
|  |   1594 | 
 | 
|  |   1595 | Our main goal is to prove the correct interaction of \texttt{update} and
 | 
|  |   1596 | \texttt{lookup}:
 | 
|  |   1597 | \begin{quote}\small
 | 
|  |   1598 | \verbatiminput{Datatype/triemain.ML}
 | 
|  |   1599 | \end{quote}
 | 
|  |   1600 | Our plan will be to induct on \texttt{as}; hence the remaining variables are
 | 
|  |   1601 | quantified. From the definitions it is clear that induction on either
 | 
|  |   1602 | \texttt{as} or \texttt{bs} is required. The choice of \texttt{as} is merely
 | 
|  |   1603 | guided by the intuition that simplification of \texttt{lookup} might be easier
 | 
|  |   1604 | if \texttt{update} has already been simplified, which can only happen if
 | 
|  |   1605 | \texttt{as} is instantiated.
 | 
|  |   1606 | The start of the proof is completely conventional:
 | 
|  |   1607 | \begin{ttbox}
 | 
|  |   1608 | \input{Datatype/trieinduct.ML}\end{ttbox}
 | 
|  |   1609 | Unfortunately, this time we are left with three intimidating looking subgoals:
 | 
|  |   1610 | \begin{ttbox}
 | 
|  |   1611 | {\out 1. ... ==> ... lookup (...) bs = lookup t bs}
 | 
|  |   1612 | {\out 2. ... ==> ... lookup (...) bs = lookup t bs}
 | 
|  |   1613 | {\out 3. ... ==> ... lookup (...) bs = lookup t bs}
 | 
|  |   1614 | \end{ttbox}
 | 
|  |   1615 | Clearly, if we want to make headway we have to instantiate \texttt{bs} as
 | 
|  |   1616 | well now. It turns out that instead of induction, case distinction
 | 
|  |   1617 | suffices:
 | 
|  |   1618 | \begin{ttbox}
 | 
|  |   1619 | \input{Datatype/trieexhaust.ML}\end{ttbox}
 | 
|  |   1620 | The {\em tactical} \texttt{ALLGOALS} merely applies the tactic following it
 | 
|  |   1621 | to all subgoals, which results in a total of six subgoals; \texttt{Auto_tac}
 | 
|  |   1622 | solves them all.
 | 
|  |   1623 | 
 | 
|  |   1624 | This proof may look surprisingly straightforward. The reason is that we
 | 
|  |   1625 | have told the simplifier (without telling the reader) to expand all
 | 
|  |   1626 | \texttt{let}s and to split all \texttt{case}-constructs over options before
 | 
|  |   1627 | we started on the main goal:
 | 
|  |   1628 | \begin{ttbox}
 | 
|  |   1629 | \input{Datatype/trieAdds.ML}\end{ttbox}
 | 
|  |   1630 | 
 | 
|  |   1631 | \begin{exercise}
 | 
|  |   1632 |   Write an improved version of \texttt{update} that does not suffer from the
 | 
|  |   1633 |   space leak in the version above. Prove the main theorem for your improved
 | 
|  |   1634 |   \texttt{update}.
 | 
|  |   1635 | \end{exercise}
 | 
|  |   1636 | 
 | 
|  |   1637 | \begin{exercise}
 | 
|  |   1638 |   Modify \texttt{update} so that it can also {\em delete} entries from a
 | 
|  |   1639 |   trie. It is up to you if you want to shrink tries if possible. Prove (a
 | 
|  |   1640 |   modified version of) the main theorem above.
 | 
|  |   1641 | \end{exercise}
 | 
|  |   1642 | 
 | 
| 5375 |   1643 | \section{Total recursive functions}
 | 
|  |   1644 | \label{sec:recdef}
 | 
|  |   1645 | \index{*recdef|(}
 | 
|  |   1646 | 
 | 
|  |   1647 | 
 | 
|  |   1648 | Although many total functions have a natural primitive recursive definition,
 | 
|  |   1649 | this is not always the case. Arbitrary total recursive functions can be
 | 
|  |   1650 | defined by means of \texttt{recdef}: you can use full pattern-matching,
 | 
|  |   1651 | recursion need not involve datatypes, and termination is proved by showing
 | 
|  |   1652 | that each recursive call makes the argument smaller in a suitable (user
 | 
|  |   1653 | supplied) sense.
 | 
|  |   1654 | 
 | 
|  |   1655 | \subsection{Defining recursive functions}
 | 
|  |   1656 | 
 | 
|  |   1657 | Here is a simple example, the Fibonacci function:
 | 
|  |   1658 | \begin{ttbox}
 | 
| 6099 |   1659 | \input{Recdef/fib}\end{ttbox}
 | 
| 5375 |   1660 | The definition of \texttt{fib} is accompanied by a \bfindex{measure function}
 | 
|  |   1661 | \texttt{\%n.$\;$n} that maps the argument of \texttt{fib} to a natural
 | 
|  |   1662 | number. The requirement is that in each equation the measure of the argument
 | 
|  |   1663 | on the left-hand side is strictly greater than the measure of the argument of
 | 
|  |   1664 | each recursive call. In the case of \texttt{fib} this is obviously true
 | 
|  |   1665 | because the measure function is the identity and \texttt{Suc(Suc~x)} is
 | 
|  |   1666 | strictly greater than both \texttt{x} and \texttt{Suc~x}.
 | 
|  |   1667 | 
 | 
|  |   1668 | Slightly more interesting is the insertion of a fixed element
 | 
|  |   1669 | between any two elements of a list:
 | 
|  |   1670 | \begin{ttbox}
 | 
| 6099 |   1671 | \input{Recdef/sep1}\end{ttbox}
 | 
| 5375 |   1672 | This time the measure is the length of the list, which decreases with the
 | 
|  |   1673 | recursive call; the first component of the argument tuple is irrelevant.
 | 
|  |   1674 | 
 | 
|  |   1675 | Pattern matching need not be exhaustive:
 | 
|  |   1676 | \begin{ttbox}
 | 
| 6099 |   1677 | \input{Recdef/last}\end{ttbox}
 | 
| 5375 |   1678 | 
 | 
|  |   1679 | Overlapping patterns are disambiguated by taking the order of equations into
 | 
|  |   1680 | account, just as in functional programming:
 | 
| 6099 |   1681 | \begin{ttbox}
 | 
|  |   1682 | \input{Recdef/sep1}\end{ttbox}
 | 
| 5375 |   1683 | This defines exactly the same function \texttt{sep} as further above.
 | 
|  |   1684 | 
 | 
|  |   1685 | \begin{warn}
 | 
| 6099 |   1686 |   Currently \texttt{recdef} only takes the first argument of a (curried)
 | 
|  |   1687 |   recursive function into account. This means both the termination measure
 | 
|  |   1688 |   and pattern matching can only use that first argument. In general, you will
 | 
|  |   1689 |   therefore have to combine several arguments into a tuple. In case only one
 | 
|  |   1690 |   argument is relevant for termination, you can also rearrange the order of
 | 
|  |   1691 |   arguments as in
 | 
|  |   1692 | \begin{ttbox}
 | 
|  |   1693 | \input{Recdef/sep2}\end{ttbox}
 | 
| 5375 |   1694 | \end{warn}
 | 
|  |   1695 | 
 | 
|  |   1696 | When loading a theory containing a \texttt{recdef} of a function $f$,
 | 
|  |   1697 | Isabelle proves the recursion equations and stores the result as a list of
 | 
|  |   1698 | theorems $f$.\texttt{rules}. It can be viewed by typing
 | 
|  |   1699 | \begin{ttbox}
 | 
|  |   1700 | prths \(f\).rules;
 | 
|  |   1701 | \end{ttbox}
 | 
|  |   1702 | All of the above examples are simple enough that Isabelle can determine
 | 
|  |   1703 | automatically that the measure of the argument goes down in each recursive
 | 
|  |   1704 | call. In that case $f$.\texttt{rules} contains precisely the defining
 | 
|  |   1705 | equations.
 | 
|  |   1706 | 
 | 
|  |   1707 | In general, Isabelle may not be able to prove all termination conditions
 | 
|  |   1708 | automatically. For example, termination of
 | 
|  |   1709 | \begin{ttbox}
 | 
| 6099 |   1710 | \input{Recdef/constsgcd}recdef gcd "measure ((\%(m,n).n))"
 | 
|  |   1711 |   "gcd (m, n) = (if n=0 then m else gcd(n, m mod n))"
 | 
| 5375 |   1712 | \end{ttbox}
 | 
|  |   1713 | relies on the lemma \texttt{mod_less_divisor}
 | 
|  |   1714 | \begin{ttbox}
 | 
|  |   1715 | 0 < n ==> m mod n < n
 | 
|  |   1716 | \end{ttbox}
 | 
|  |   1717 | that is not part of the default simpset. As a result, Isabelle prints a
 | 
|  |   1718 | warning and \texttt{gcd.rules} contains a precondition:
 | 
|  |   1719 | \begin{ttbox}
 | 
|  |   1720 | (! m n. 0 < n --> m mod n < n) ==> gcd (m, n) = (if n=0 \dots)
 | 
|  |   1721 | \end{ttbox}
 | 
|  |   1722 | We need to instruct \texttt{recdef} to use an extended simpset to prove the
 | 
|  |   1723 | termination condition:
 | 
|  |   1724 | \begin{ttbox}
 | 
| 6099 |   1725 | \input{Recdef/gcd}\end{ttbox}
 | 
| 5375 |   1726 | This time everything works fine and \texttt{gcd.rules} contains precisely the
 | 
|  |   1727 | stated recursion equation for \texttt{gcd}.
 | 
|  |   1728 | 
 | 
|  |   1729 | When defining some nontrivial total recursive function, the first attempt
 | 
|  |   1730 | will usually generate a number of termination conditions, some of which may
 | 
|  |   1731 | require new lemmas to be proved in some of the parent theories. Those lemmas
 | 
|  |   1732 | can then be added to the simpset used by \texttt{recdef} for its
 | 
|  |   1733 | proofs, as shown for \texttt{gcd}.
 | 
|  |   1734 | 
 | 
|  |   1735 | Although all the above examples employ measure functions, \texttt{recdef}
 | 
|  |   1736 | allows arbitrary wellfounded relations. For example, termination of
 | 
|  |   1737 | Ackermann's function requires the lexicographic product \texttt{**}:
 | 
|  |   1738 | \begin{ttbox}
 | 
| 6099 |   1739 | \input{Recdef/ack}\end{ttbox}
 | 
| 6606 |   1740 | For details see the manual~\cite{isabelle-HOL} and the examples in the
 | 
| 6099 |   1741 | library.
 | 
| 5375 |   1742 | 
 | 
|  |   1743 | 
 | 
|  |   1744 | \subsection{Deriving simplification rules}
 | 
|  |   1745 | 
 | 
| 6099 |   1746 | Once we have succeeded in proving all termination conditions, we can start to
 | 
| 5375 |   1747 | derive some theorems. In contrast to \texttt{primrec} definitions, which are
 | 
|  |   1748 | automatically added to the simpset, \texttt{recdef} rules must be included
 | 
|  |   1749 | explicitly, for example as in
 | 
|  |   1750 | \begin{ttbox}
 | 
|  |   1751 | Addsimps fib.rules;
 | 
|  |   1752 | \end{ttbox}
 | 
|  |   1753 | However, some care is necessary now, in contrast to \texttt{primrec}.
 | 
|  |   1754 | Although \texttt{gcd} is a total function, its defining equation leads to
 | 
|  |   1755 | nontermination of the simplifier, because the subterm \texttt{gcd(n, m mod
 | 
|  |   1756 |   n)} on the right-hand side can again be simplified by the same equation,
 | 
|  |   1757 | and so on. The reason: the simplifier rewrites the \texttt{then} and
 | 
|  |   1758 | \texttt{else} branches of a conditional if the condition simplifies to
 | 
|  |   1759 | neither \texttt{True} nor \texttt{False}.  Therefore it is recommended to
 | 
|  |   1760 | derive an alternative formulation that replaces case distinctions on the
 | 
|  |   1761 | right-hand side by conditional equations. For \texttt{gcd} it means we have
 | 
|  |   1762 | to prove
 | 
|  |   1763 | \begin{ttbox}
 | 
|  |   1764 |            gcd (m, 0) = m
 | 
|  |   1765 | n ~= 0 ==> gcd (m, n) = gcd(n, m mod n)
 | 
|  |   1766 | \end{ttbox}
 | 
|  |   1767 | To avoid nontermination during those proofs, we have to resort to some low
 | 
|  |   1768 | level tactics:
 | 
|  |   1769 | \begin{ttbox}
 | 
|  |   1770 | Goal "gcd(m,0) = m";
 | 
|  |   1771 | by(resolve_tac [trans] 1);
 | 
|  |   1772 | by(resolve_tac gcd.rules 1);
 | 
|  |   1773 | by(Simp_tac 1);
 | 
|  |   1774 | \end{ttbox}
 | 
|  |   1775 | At this point it is not necessary to understand what exactly
 | 
|  |   1776 | \texttt{resolve_tac} is doing. The main point is that the above proof works
 | 
|  |   1777 | not just for this one example but in general (except that we have to use
 | 
|  |   1778 | \texttt{Asm_simp_tac} and $f$\texttt{.rules} in general). Try the second
 | 
|  |   1779 | \texttt{gcd}-equation above!
 | 
|  |   1780 | 
 | 
|  |   1781 | \subsection{Induction}
 | 
|  |   1782 | 
 | 
|  |   1783 | Assuming we have added the recursion equations (or some suitable derived
 | 
|  |   1784 | equations) to the simpset, we might like to prove something about our
 | 
|  |   1785 | function. Since the function is recursive, the natural proof principle is
 | 
|  |   1786 | again induction. But this time the structural form of induction that comes
 | 
|  |   1787 | with datatypes is unlikely to work well---otherwise we could have defined the
 | 
|  |   1788 | function by \texttt{primrec}. Therefore \texttt{recdef} automatically proves
 | 
|  |   1789 | a suitable induction rule $f$\texttt{.induct} that follows the recursion
 | 
|  |   1790 | pattern of the particular function $f$. Roughly speaking, it requires you to
 | 
|  |   1791 | prove for each \texttt{recdef} equation that the property you are trying to
 | 
|  |   1792 | establish holds for the left-hand side provided it holds for all recursive
 | 
|  |   1793 | calls on the right-hand side. Applying $f$\texttt{.induct} requires its
 | 
|  |   1794 | explicit instantiation. See \S\ref{sec:explicit-inst} for details.
 | 
|  |   1795 | 
 | 
|  |   1796 | \index{*recdef|)}
 |