changeset 8743 3253c6046d57
child 8771 026f37a86ea7
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/TutorialI/fp.tex	Wed Apr 19 11:54:39 2000 +0200
@@ -0,0 +1,841 @@
+\chapter{Functional Programming in HOL}
+Although on the surface this chapter is mainly concerned with how to write
+functional programs in HOL and how to verify them, most of the
+constructs and proof procedures introduced are general purpose and recur in
+any specification or verification task.
+The dedicated functional programmer should be warned: HOL offers only {\em
+  total functional programming} --- all functions in HOL must be total; lazy
+data structures are not directly available. On the positive side, functions
+in HOL need not be computable: HOL is a specification language that goes well
+beyond what can be expressed as a program. However, for the time being we
+concentrate on the computable.
+\section{An introductory theory}
+Functional programming needs datatypes and functions. Both of them can be
+defined in a theory with a syntax reminiscent of languages like ML or
+Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.
+We will now examine it line by line.
+\caption{A theory of lists}
+The complete proof script is shown in figure~\ref{fig:ToyList-proofs}. The
+concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyList-proofs}
+constitutes the complete theory \texttt{ToyList} and should reside in file
+\texttt{ToyList.thy}. It is good practice to present all declarations and
+definitions at the beginning of a theory to facilitate browsing.
+\caption{Proofs about lists}
+This is the end of our toy proof. It should have familiarized you with
+\item the standard theorem proving procedure:
+state a goal (lemma or theorem); proceed with proof until a separate lemma is
+required; prove that lemma; come back to the original goal.
+\item a specific procedure that works well for functional programs:
+induction followed by all-out simplification via \isa{auto}.
+\item a basic repertoire of proof commands.
+\section{Some helpful commands}
+This section discusses a few basic commands for manipulating the proof state
+and can be skipped by casual readers.
+There are two kinds of commands used during a proof: the actual proof
+commands and auxiliary commands for examining the proof state and controlling
+the display. Simple proof commands are of the form
+\isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a
+synonym for ``theorem proving function''. Typical examples are
+\isa{induct_tac} and \isa{auto}. Further methods are introduced throughout
+the tutorial.  Unless stated otherwise you may assume that a method attacks
+merely the first subgoal. An exception is \isa{auto} which tries to solve all
+The most useful auxiliary commands are:
+\item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the
+  last command; \isacommand{undo} can be undone by
+  \isacommand{redo}\indexbold{*redo}.  Both are only needed at the shell
+  level and should not occur in the final theory.
+\item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays
+  the current proof state, for example when it has disappeared off the
+  screen.
+\item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to
+  print only the first $n$ subgoals from now on and redisplays the current
+  proof state. This is helpful when there are many subgoals.
+\item[Modifying the order of subgoals:]
+\isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and
+\isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.
+\item[Printing theorems:]
+  \isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$
+  prints the named theorems.
+\item[Displaying types:] We have already mentioned the flag
+  \ttindex{show_types} above. It can also be useful for detecting typos in
+  formulae early on. For example, if \texttt{show_types} is set and the goal
+  \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output
+which tells us that Isabelle has correctly inferred that
+\isa{xs} is a variable of list type. On the other hand, had we
+made a typo as in \isa{rev(re xs) = xs}, the response
+would have alerted us because of the unexpected variable \isa{re}.
+\item[Reading terms and types:] \isacommand{term}\indexbold{*term}
+  \textit{string} reads, type-checks and prints the given string as a term in
+  the current context; the inferred type is output as well.
+  \isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given
+  string as a type in the current context.
+\item[(Re)loading theories:] When you start your interaction you must open a
+  named theory with the line
+  \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:}. Isabelle automatically
+  loads all the required parent theories from their respective files (which
+  may take a moment, unless the theories are already loaded and the files
+  have not been modified). The only time when you need to load a theory by
+  hand is when you simply want to check if it loads successfully without
+  wanting to make use of the theory itself. This you can do by typing
+  \isacommand{use\_thy}\indexbold{*use_thy}~\texttt{"T"}.
+  If you suddenly discover that you need to modify a parent theory of your
+  current theory you must first abandon your current theory (at the shell
+  level type \isacommand{kill}\indexbold{*kill}). After the parent theory has
+  been modified you go back to your original theory. When its first line
+  \isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the
+  modified parent is reloaded automatically.
+Further commands are found in the Isabelle/Isar Reference Manual.
+Inductive datatypes are part of almost every non-trivial application of HOL.
+First we take another look at a very important example, the datatype of
+lists, before we turn to datatypes in general. The section closes with a
+case study.
+Lists are one of the essential datatypes in computing. Readers of this
+tutorial and users of HOL need to be familiar with their basic operations.
+Theory \texttt{ToyList} is only a small fragment of HOL's predefined theory
+The latter contains many further operations. For example, the functions
+\isaindexbold{hd} (`head') and \isaindexbold{tl} (`tail') return the first
+element and the remainder of a list. (However, pattern-matching is usually
+preferable to \isa{hd} and \isa{tl}.)  Theory \texttt{List} also contains
+more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates
+$x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we
+always use HOL's predefined lists.
+\subsection{The general format}
+The general HOL \isacommand{datatype} definition is of the form
+\isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
+C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
+where $\alpha@i$ are type variables (the parameters), $C@i$ are distinct
+constructor names and $\tau@{ij}$ are types; it is customary to capitalize
+the first letter in constructor names. There are a number of
+restrictions (such as that the type should not be empty) detailed
+elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
+Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and
+\isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically
+during proofs by simplification.  The same is true for the equations in
+primitive recursive function definitions.
+\subsection{Primitive recursion}
+Functions on datatypes are usually defined by recursion. In fact, most of the
+time they are defined by what is called \bfindex{primitive recursion}.
+The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of
+\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
+such that $C$ is a constructor of the datatype $t$ and all recursive calls of
+$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
+Isabelle immediately sees that $f$ terminates because one (fixed!) argument
+becomes smaller with every recursive call. There must be exactly one equation
+for each constructor.  Their order is immaterial.
+A more general method for defining total recursive functions is explained in
+\subsection{Case expressions}
+HOL also features \isaindexbold{case}-expressions for analyzing
+elements of a datatype. For example,
+% case xs of [] => 0 | y#ys => y
+evaluates to \isa{0} if \isa{xs} is \isa{[]} and to \isa{y} if 
+\isa{xs} is \isa{y\#ys}. (Since the result in both branches must be of
+the same type, it follows that \isa{y::nat} and hence
+In general, if $e$ is a term of the datatype $t$ defined in
+\S\ref{sec:general-datatype} above, the corresponding
+\isa{case}-expression analyzing $e$ is
+\isa{case}~e~\isa{of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\
+                           \vdots \\
+                           \mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m
+{\em All} constructors must be present, their order is fixed, and nested
+patterns are not supported.  Violating these restrictions results in strange
+error messages.
+Nested patterns can be simulated by nested \isa{case}-expressions: instead
+% case xs of [] => 0 | [x] => x | x#(y#zs) => y
+% term(latex xsymbols symbols)"case xs of [] => 0 | x#ys => (case ys of [] => x | y#zs => y)";
+Note that \isa{case}-expressions should be enclosed in parentheses to
+indicate their scope.
+\subsection{Structural induction and case distinction}
+\indexbold{structural induction}
+\indexbold{case distinction}
+Almost all the basic laws about a datatype are applied automatically during
+simplification. Only induction is invoked by hand via \isaindex{induct_tac},
+which works for any datatype. In some cases, induction is overkill and a case
+distinction over all constructors of the datatype suffices. This is performed
+by \isaindexbold{case_tac}. A trivial example:
+Note that we do not need to give a lemma a name if we do not intend to refer
+to it explicitly in the future.
+  Induction is only allowed on free (or \isasymAnd-bound) variables that
+  should not occur among the assumptions of the subgoal.  Case distinction
+  (\isa{case_tac}) works for arbitrary terms, which need to be
+  quoted if they are non-atomic.
+\subsection{Case study: boolean expressions}
+The aim of this case study is twofold: it shows how to model boolean
+expressions and some algorithms for manipulating them, and it demonstrates
+the constructs introduced above.
+How does one come up with the required lemmas? Try to prove the main theorems
+without them and study carefully what \isa{auto} leaves unproved. This has
+to provide the clue.  The necessity of universal quantification
+(\isa{\isasymforall{}t e}) in the two lemmas is explained in
+  We strengthen the definition of a \isa{normal} If-expression as follows:
+  the first argument of all \isa{IF}s must be a variable. Adapt the above
+  development to this changed requirement. (Hint: you may need to formulate
+  some of the goals as implications (\isasymimp) rather than equalities
+  (\isa{=}).)
+\section{Some basic types}
+\subsection{Natural numbers}
+The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun},
+\ttindexboldpos{-}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},
+\isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and
+\isaindexbold{max} are predefined, as are the relations
+\indexboldpos{\isasymle}{$HOL2arithrel} and
+\ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation
+\isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although
+Isabelle does not prove this completely automatically. Note that \isa{1} and
+\isa{2} are available as abbreviations for the corresponding
+\isa{Suc}-expressions. If you need the full set of numerals,
+  The operations \ttindexboldpos{+}{$HOL2arithfun},
+  \ttindexboldpos{-}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},
+  \isaindexbold{min}, \isaindexbold{max},
+  \indexboldpos{\isasymle}{$HOL2arithrel} and
+  \ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available
+  not just for natural numbers but at other types as well (see
+  \S\ref{sec:TypeClasses}). For example, given the goal \isa{x+y = y+x},
+  there is nothing to indicate that you are talking about natural numbers.
+  Hence Isabelle can only infer that \isa{x} and \isa{y} are of some
+  arbitrary type where \isa{+} is declared. As a consequence, you will be
+  unable to prove the goal (although it may take you some time to realize
+  what has happened if \texttt{show_types} is not set).  In this particular
+  example, you need to include an explicit type constraint, for example
+  \isa{x+y = y+(x::nat)}.  If there is enough contextual information this may
+  not be necessary: \isa{x+0 = x} automatically implies \isa{x::nat}.
+Simple arithmetic goals are proved automatically by both \isa{auto}
+and the simplification methods introduced in \S\ref{sec:Simplification}.  For
+is proved automatically. The main restriction is that only addition is taken
+into account; other arithmetic operations and quantified formulae are ignored.
+For more complex goals, there is the special method
+\isaindexbold{arith} (which attacks the first subgoal). It proves
+arithmetic goals involving the usual logical connectives (\isasymnot,
+\isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and
+the operations \isa{+}, \isa{-}, \isa{min} and \isa{max}. For example,
+succeeds because \isa{k*k} can be treated as atomic.
+In contrast,
+is not even proved by \isa{arith} because the proof relies essentially
+on properties of multiplication.
+  The running time of \isa{arith} is exponential in the number of occurrences
+  of \ttindexboldpos{-}{$HOL2arithfun}, \isaindexbold{min} and
+  \isaindexbold{max} because they are first eliminated by case distinctions.
+  \isa{arith} is incomplete even for the restricted class of formulae
+  described above (known as ``linear arithmetic''). If divisibility plays a
+  role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$.
+  Fortunately, such examples are rare in practice.
+HOL also has pairs: \isa{($a@1$,$a@2$)} is of type \texttt{$\tau@1$ *
+  $\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair
+are extracted by \isa{fst} and \isa{snd}: \isa{fst($x$,$y$) = $x$} and
+\isa{snd($x$,$y$) = $y$}. Tuples are simulated by pairs nested to the right:
+\isa{($a@1$,$a@2$,$a@3$)} stands for \isa{($a@1$,($a@2$,$a@3$))} and
+\isa{$\tau@1$ * $\tau@2$ * $\tau@3$} for \isa{$\tau@1$ * ($\tau@2$ *
+  $\tau@3$)}. Therefore we have \isa{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.
+It is possible to use (nested) tuples as patterns in abstractions, for
+example \isa{\isasymlambda(x,y,z).x+y+z} and
+In addition to explicit $\lambda$-abstractions, tuple patterns can be used in
+most variable binding constructs. Typical examples are
+Further important examples are quantifiers and sets (see~\S\ref{quant-pats}).
+%FIXME move stuff below into section on proofs about products?
+  Abstraction over pairs and tuples is merely a convenient shorthand for a
+  more complex internal representation.  Thus the internal and external form
+  of a term may differ, which can affect proofs. If you want to avoid this
+  complication, use \isa{fst} and \isa{snd}, i.e.\ write
+  \isa{\isasymlambda{}p.~fst p + snd p} instead of
+  \isa{\isasymlambda(x,y).~x + y}.  See~\S\ref{proofs-about-products} for
+  theorem proving with tuple patterns.
+Finally note that products, like natural numbers, are datatypes, which means
+in particular that \isa{induct_tac} and \isa{case_tac} are applicable to
+products (see \S\ref{proofs-about-products}).
+A definition is simply an abbreviation, i.e.\ a new name for an existing
+construction. In particular, definitions cannot be recursive. Isabelle offers
+definitions on the level of types and terms. Those on the type level are
+called type synonyms, those on the term level are called (constant)
+\subsection{Type synonyms}
+\indexbold{type synonyms}
+Type synonyms are similar to those found in ML. Their syntax is fairly self
+Note that pattern-matching is not allowed, i.e.\ each definition must be of
+the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$.
+Section~\S\ref{sec:Simplification} explains how definitions are used
+in proofs.
+A common mistake when writing definitions is to introduce extra free
+variables on the right-hand side as in the following fictitious definition:
+\chapter{More Functional Programming}
+The purpose of this chapter is to deepen the reader's understanding of the
+concepts encountered so far and to introduce an advanced forms of datatypes
+and recursive functions. The first two sections give a structured
+presentation of theorem proving by simplification
+(\S\ref{sec:Simplification}) and discuss important heuristics for induction
+(\S\ref{sec:InductionHeuristics}). They can be skipped by readers less
+interested in proofs. They are followed by a case study, a compiler for
+expressions (\S\ref{sec:ExprCompiler}). Advanced datatypes, including those
+involving function spaces, are covered in \S\ref{sec:advanced-datatypes},
+which closes with another case study, search trees (``tries'').  Finally we
+introduce a very general form of recursive function definition which goes
+well beyond what \isacommand{primrec} allows (\S\ref{sec:recdef}).
+So far we have proved our theorems by \isa{auto}, which ``simplifies'' all
+subgoals. In fact, \isa{auto} can do much more than that, except that it
+did not need to so far. However, when you go beyond toy examples, you need to
+understand the ingredients of \isa{auto}.  This section covers the method
+that \isa{auto} always applies first, namely simplification.
+Simplification is one of the central theorem proving tools in Isabelle and
+many other systems. The tool itself is called the \bfindex{simplifier}. The
+purpose of this section is twofold: to introduce the many features of the
+simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the
+simplifier works (\S\ref{sec:SimpHow}).  Anybody intending to use HOL should
+read \S\ref{sec:SimpFeatures}, and the serious student should read
+\S\ref{sec:SimpHow} as well in order to understand what happened in case
+things do not simplify as expected.
+\subsection{Using the simplifier}
+In its most basic form, simplification means repeated application of
+equations from left to right. For example, taking the rules for \isa{\at}
+and applying them to the term \isa{[0,1] \at\ []} results in a sequence of
+simplification steps:
+(0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
+This is also known as \bfindex{term rewriting} and the equations are referred
+to as \bfindex{rewrite rules}. This is more honest than ``simplification''
+because the terms do not necessarily become simpler in the process.
+\subsubsection{Simplification rules}
+\indexbold{simplification rules}
+To facilitate simplification, theorems can be declared to be simplification
+rules (with the help of the attribute \isa{[simp]}\index{*simp
+  (attribute)}), in which case proofs by simplification make use of these
+rules by default. In addition the constructs \isacommand{datatype} and
+\isacommand{primrec} (and a few others) invisibly declare useful
+simplification rules. Explicit definitions are \emph{not} declared
+simplification rules automatically!
+Not merely equations but pretty much any theorem can become a simplification
+rule. The simplifier will try to make sense of it.  For example, a theorem
+\isasymnot$P$ is automatically turned into \isa{$P$ = False}. The details
+are explained in \S\ref{sec:SimpHow}.
+The simplification attribute of theorems can be turned on and off as follows:
+theorems [simp] = \(list of theorem names\);
+theorems [simp del] = \(list of theorem names\);
+As a rule of thumb, rules that really simplify (like \isa{rev(rev xs) =
+  xs} and \mbox{\isa{xs \at\ [] = xs}}) should be made simplification
+rules.  Those of a more specific nature (e.g.\ distributivity laws, which
+alter the structure of terms considerably) should only be used selectively,
+i.e.\ they should not be default simplification rules.  Conversely, it may
+also happen that a simplification rule needs to be disabled in certain
+proofs.  Frequent changes in the simplification status of a theorem may
+indicates a badly designed theory.
+  Simplification may not terminate, for example if both $f(x) = g(x)$ and
+  $g(x) = f(x)$ are simplification rules. It is the user's responsibility not
+  to include simplification rules that can lead to nontermination, either on
+  their own or in combination with other simplification rules.
+\subsubsection{The simplification method}
+\index{*simp (method)|bold}
+The general format of the simplification method is
+simp \(list of modifiers\)
+where the list of \emph{modifiers} helps to fine tune the behaviour and may
+be empty. Most if not all of the proofs seen so far could have been performed
+with \isa{simp} instead of \isa{auto}, except that \isa{simp} attacks
+only the first subgoal and may thus need to be repeated.
+Note that \isa{simp} fails if nothing changes.
+\subsubsection{Adding and deleting simplification rules}
+If a certain theorem is merely needed in a few proofs by simplification,
+we do not need to make it a global simplification rule. Instead we can modify
+the set of simplification rules used in a simplification step by adding rules
+to it and/or deleting rules from it. The two modifiers for this are
+add: \(list of theorem names\)
+del: \(list of theorem names\)
+In case you want to use only a specific list of theorems and ignore all
+only: \(list of theorem names\)
+\index{simplification!with/of assumptions}
+\subsubsection{Rewriting with definitions}
+\index{simplification!with definitions}
+  If you have defined $f\,x\,y~\isasymequiv~t$ then you can only expand
+  occurrences of $f$ with at least two arguments. Thus it is safer to define
+  $f$~\isasymequiv~\isasymlambda$x\,y.\;t$.
+\subsubsection{Simplifying let-expressions}
+\index{simplification!of let-expressions}
+Proving a goal containing \isaindex{let}-expressions almost invariably
+requires the \isa{let}-con\-structs to be expanded at some point. Since
+\isa{let}-\isa{in} is just syntactic sugar for a defined constant (called
+\isa{Let}), expanding \isa{let}-constructs means rewriting with
+\subsubsection{Conditional equations}
+\subsubsection{Automatic case splits}
+\indexbold{case splits}
+  The simplifier merely simplifies the condition of an \isa{if} but not the
+  \isa{then} or \isa{else} parts. The latter are simplified only after the
+  condition reduces to \isa{True} or \isa{False}, or after splitting. The
+  same is true for \isaindex{case}-expressions: only the selector is
+  simplified at first, until either the expression reduces to one of the
+  cases or it is split.
+The simplifier routinely solves a small class of linear arithmetic formulae
+(over types \isa{nat} and \isa{int}): it only takes into account
+assumptions and conclusions that are (possibly negated) (in)equalities
+(\isa{=}, \isasymle, \isa{<}) and it only knows about addition. Thus
+is proved by simplification, whereas the only slightly more complex
+is not proved by simplification and requires \isa{arith}.
+\subsubsection{Permutative rewrite rules}
+A rewrite rule is {\bf permutative} if the left-hand side and right-hand side
+are the same up to renaming of variables.  The most common permutative rule
+is commutativity: $x+y = y+x$.  Another example is $(x-y)-z = (x-z)-y$.  Such
+rules are problematic because once they apply, they can be used forever.
+The simplifier is aware of this danger and treats permutative rules
+separately. For details see~\cite{isabelle-ref}.
+\indexbold{tracing the simplifier}
+\subsection{How it works}
+\subsubsection{Higher-order patterns}
+\subsubsection{Local assumptions}
+\subsubsection{The preprocessor}
+\section{Induction heuristics}
+The purpose of this section is to illustrate some simple heuristics for
+inductive proofs. The first one we have already mentioned in our initial
+{\em 1. Theorems about recursive functions are proved by induction.}
+In case the function has more than one argument
+{\em 2. Do induction on argument number $i$ if the function is defined by
+recursion in argument number $i$.}
+When we look at the proof of {\makeatother\isa{(xs @ ys) @ zs = xs @ (ys @
+  zs)}} in \S\ref{sec:intro-proof} we find (a) \isa{\at} is recursive in
+the first argument, (b) \isa{xs} occurs only as the first argument of
+\isa{\at}, and (c) both \isa{ys} and \isa{zs} occur at least once as
+the second argument of \isa{\at}. Hence it is natural to perform induction
+on \isa{xs}.
+The key heuristic, and the main point of this section, is to
+generalize the goal before induction. The reason is simple: if the goal is
+too specific, the induction hypothesis is too weak to allow the induction
+step to go through. Let us now illustrate the idea with an example.
+A final point worth mentioning is the orientation of the equation we just
+proved: the more complex notion (\isa{itrev}) is on the left-hand
+side, the simpler \isa{rev} on the right-hand side. This constitutes
+another, albeit weak heuristic that is not restricted to induction:
+  {\em 5. The right-hand side of an equation should (in some sense) be
+    simpler than the left-hand side.}
+The heuristic is tricky to apply because it is not obvious that
+\isa{rev xs \at\ ys} is simpler than \isa{itrev xs ys}. But see what
+happens if you try to prove the symmetric equation!
+\section{Case study: compiling expressions}
+\section{Advanced datatypes}
+This section presents advanced forms of \isacommand{datatype}s.
+\subsection{Mutual recursion}
+\subsection{Nested recursion}
+\subsection{The limits of nested recursion}
+How far can we push nested recursion? By the unfolding argument above, we can
+reduce nested to mutual recursion provided the nested recursion only involves
+previously defined datatypes. This does not include functions:
+datatype t = C (t => bool)
+is a real can of worms: in HOL it must be ruled out because it requires a type
+\isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the
+same cardinality---an impossibility. For the same reason it is not possible
+to allow recursion involving the type \isa{set}, which is isomorphic to
+\isa{t \isasymFun\ bool}.
+Fortunately, a limited form of recursion
+involving function spaces is permitted: the recursive type may occur on the
+right of a function arrow, but never on the left. Hence the above can of worms
+is ruled out but the following example of a potentially infinitely branching tree is
+If you need nested recursion on the left of a function arrow,
+there are alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like
+datatype lam = C (lam -> lam)
+do indeed make sense (note the intentionally different arrow \isa{->}!).
+There is even a version of LCF on top of HOL, called
+\subsection{Case study: Tries}
+Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
+indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
+trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
+``cat''.  When searching a string in a trie, the letters of the string are
+examined sequentially. Each letter determines which subtrie to search next.
+In this case study we model tries as a datatype, define a lookup and an
+update function, and prove that they behave as expected.
+\put( 5, 0){\makebox(0,0)[b]{l}}
+\put(25, 0){\makebox(0,0)[b]{e}}
+\put(35, 0){\makebox(0,0)[b]{n}}
+\put(45, 0){\makebox(0,0)[b]{r}}
+\put(55, 0){\makebox(0,0)[b]{t}}
+\put( 5, 9){\line(0,-1){5}}
+\put(25, 9){\line(0,-1){5}}
+\put(44, 9){\line(-3,-2){9}}
+\put(45, 9){\line(0,-1){5}}
+\put(46, 9){\line(3,-2){9}}
+\put( 5,10){\makebox(0,0)[b]{l}}
+\caption{A sample trie}
+Proper tries associate some value with each string. Since the
+information is stored only in the final node associated with the string, many
+nodes do not carry any value. This distinction is captured by the
+following predefined datatype (from theory \texttt{Option}, which is a parent
+of \texttt{Main}):
+  Write an improved version of \isa{update} that does not suffer from the
+  space leak in the version above. Prove the main theorem for your improved
+  \isa{update}.
+  Write a function to \emph{delete} entries from a trie. An easy solution is
+  a clever modification of \isa{update} which allows both insertion and
+  deletion with a single function.  Prove (a modified version of) the main
+  theorem above. Optimize you function such that it shrinks tries after
+  deletion, if possible.
+\section{Total recursive functions}
+Although many total functions have a natural primitive recursive definition,
+this is not always the case. Arbitrary total recursive functions can be
+defined by means of \isacommand{recdef}: you can use full pattern-matching,
+recursion need not involve datatypes, and termination is proved by showing
+that the arguments of all recursive calls are smaller in a suitable (user
+supplied) sense.
+\subsection{Defining recursive functions}
+\subsection{Proving termination}
+\subsection{Simplification with recdef}
+\index{recursion induction|(}
+\index{recursion induction|)}