| 11419 |      1 | \chapter{Functional Programming in HOL}
 | 
|  |      2 | 
 | 
| 11450 |      3 | This chapter describes how to write
 | 
|  |      4 | functional programs in HOL and how to verify them.  However, 
 | 
|  |      5 | most of the constructs and
 | 
|  |      6 | proof procedures introduced are general and recur in any specification
 | 
|  |      7 | or verification task.  We really should speak of functional
 | 
|  |      8 | \emph{modelling} rather than functional \emph{programming}: 
 | 
|  |      9 | our primary aim is not
 | 
| 11419 |     10 | to write programs but to design abstract models of systems.  HOL is
 | 
|  |     11 | a specification language that goes well beyond what can be expressed as a
 | 
|  |     12 | program. However, for the time being we concentrate on the computable.
 | 
|  |     13 | 
 | 
| 11450 |     14 | If you are a purist functional programmer, please note that all functions
 | 
|  |     15 | in HOL must be total:
 | 
|  |     16 | they must terminate for all inputs.  Lazy data structures are not
 | 
| 11419 |     17 | directly available.
 | 
|  |     18 | 
 | 
|  |     19 | \section{An Introductory Theory}
 | 
|  |     20 | \label{sec:intro-theory}
 | 
|  |     21 | 
 | 
|  |     22 | Functional programming needs datatypes and functions. Both of them can be
 | 
|  |     23 | defined in a theory with a syntax reminiscent of languages like ML or
 | 
|  |     24 | Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.
 | 
|  |     25 | We will now examine it line by line.
 | 
|  |     26 | 
 | 
|  |     27 | \begin{figure}[htbp]
 | 
|  |     28 | \begin{ttbox}\makeatother
 | 
|  |     29 | \input{ToyList2/ToyList1}\end{ttbox}
 | 
| 11450 |     30 | \caption{A Theory of Lists}
 | 
| 11419 |     31 | \label{fig:ToyList}
 | 
|  |     32 | \end{figure}
 | 
|  |     33 | 
 | 
| 11457 |     34 | \index{*ToyList example|(}
 | 
| 11419 |     35 | {\makeatother\input{ToyList/document/ToyList.tex}}
 | 
|  |     36 | 
 | 
|  |     37 | The complete proof script is shown in Fig.\ts\ref{fig:ToyList-proofs}. The
 | 
|  |     38 | concatenation of Figs.\ts\ref{fig:ToyList} and~\ref{fig:ToyList-proofs}
 | 
|  |     39 | constitutes the complete theory \texttt{ToyList} and should reside in file
 | 
| 12327 |     40 | \texttt{ToyList.thy}.
 | 
|  |     41 | % It is good practice to present all declarations and
 | 
|  |     42 | %definitions at the beginning of a theory to facilitate browsing.%
 | 
| 11457 |     43 | \index{*ToyList example|)}
 | 
| 11419 |     44 | 
 | 
|  |     45 | \begin{figure}[htbp]
 | 
|  |     46 | \begin{ttbox}\makeatother
 | 
|  |     47 | \input{ToyList2/ToyList2}\end{ttbox}
 | 
| 11450 |     48 | \caption{Proofs about Lists}
 | 
| 11419 |     49 | \label{fig:ToyList-proofs}
 | 
|  |     50 | \end{figure}
 | 
|  |     51 | 
 | 
|  |     52 | \subsubsection*{Review}
 | 
|  |     53 | 
 | 
|  |     54 | This is the end of our toy proof. It should have familiarized you with
 | 
|  |     55 | \begin{itemize}
 | 
|  |     56 | \item the standard theorem proving procedure:
 | 
| 8743 |     57 | state a goal (lemma or theorem); proceed with proof until a separate lemma is
 | 
|  |     58 | required; prove that lemma; come back to the original goal.
 | 
|  |     59 | \item a specific procedure that works well for functional programs:
 | 
|  |     60 | induction followed by all-out simplification via \isa{auto}.
 | 
|  |     61 | \item a basic repertoire of proof commands.
 | 
|  |     62 | \end{itemize}
 | 
|  |     63 | 
 | 
| 12332 |     64 | \begin{warn}
 | 
|  |     65 | It is tempting to think that all lemmas should have the \isa{simp} attribute
 | 
|  |     66 | just because this was the case in the example above. However, in that example
 | 
|  |     67 | all lemmas were equations, and the right-hand side was simpler than the
 | 
|  |     68 | left-hand side --- an ideal situation for simplification purposes. Unless
 | 
|  |     69 | this is clearly the case, novices should refrain from awarding a lemma the
 | 
|  |     70 | \isa{simp} attribute, which has a global effect. Instead, lemmas can be
 | 
|  |     71 | applied locally where they are needed, which is discussed in the following
 | 
|  |     72 | chapter.
 | 
|  |     73 | \end{warn}
 | 
| 8743 |     74 | 
 | 
| 10885 |     75 | \section{Some Helpful Commands}
 | 
| 8743 |     76 | \label{sec:commands-and-hints}
 | 
|  |     77 | 
 | 
|  |     78 | This section discusses a few basic commands for manipulating the proof state
 | 
|  |     79 | and can be skipped by casual readers.
 | 
|  |     80 | 
 | 
|  |     81 | There are two kinds of commands used during a proof: the actual proof
 | 
|  |     82 | commands and auxiliary commands for examining the proof state and controlling
 | 
|  |     83 | the display. Simple proof commands are of the form
 | 
| 12327 |     84 | \commdx{apply}(\textit{method}), where \textit{method} is typically 
 | 
| 11419 |     85 | \isa{induct_tac} or \isa{auto}.  All such theorem proving operations
 | 
|  |     86 | are referred to as \bfindex{methods}, and further ones are
 | 
|  |     87 | introduced throughout the tutorial.  Unless stated otherwise, you may
 | 
|  |     88 | assume that a method attacks merely the first subgoal. An exception is
 | 
|  |     89 | \isa{auto}, which tries to solve all subgoals.
 | 
| 8743 |     90 | 
 | 
| 11419 |     91 | The most useful auxiliary commands are as follows:
 | 
| 8743 |     92 | \begin{description}
 | 
| 11419 |     93 | \item[Undoing:] \commdx{undo} undoes the effect of
 | 
|  |     94 | the
 | 
| 8743 |     95 |   last command; \isacommand{undo} can be undone by
 | 
| 11419 |     96 |   \commdx{redo}.  Both are only needed at the shell
 | 
| 8743 |     97 |   level and should not occur in the final theory.
 | 
| 11419 |     98 | \item[Printing the current state:] \commdx{pr}
 | 
|  |     99 | redisplays
 | 
| 11302 |    100 |   the current proof state, for example when it has scrolled past the top of
 | 
|  |    101 |   the screen.
 | 
| 8743 |    102 | \item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to
 | 
|  |    103 |   print only the first $n$ subgoals from now on and redisplays the current
 | 
|  |    104 |   proof state. This is helpful when there are many subgoals.
 | 
|  |    105 | \item[Modifying the order of subgoals:]
 | 
| 11419 |    106 | \commdx{defer} moves the first subgoal to the end and
 | 
|  |    107 | \commdx{prefer}~$n$ moves subgoal $n$ to the front.
 | 
| 8743 |    108 | \item[Printing theorems:]
 | 
| 11419 |    109 |   \commdx{thm}~\textit{name}$@1$~\dots~\textit{name}$@n$
 | 
| 8743 |    110 |   prints the named theorems.
 | 
|  |    111 | \item[Displaying types:] We have already mentioned the flag
 | 
| 11456 |    112 |   \texttt{show_types} above.\index{*show_types (flag)}
 | 
|  |    113 |   It can also be useful for detecting misspellings in
 | 
|  |    114 |   formulae. For example, if \texttt{show_types} is set and the goal
 | 
| 8743 |    115 |   \isa{rev(rev xs) = xs} is started, Isabelle prints the additional output
 | 
|  |    116 | \par\noindent
 | 
|  |    117 | \begin{isabelle}%
 | 
| 11646 |    118 | variables:\isanewline
 | 
| 8743 |    119 | ~~xs~::~'a~list
 | 
|  |    120 | \end{isabelle}%
 | 
|  |    121 | \par\noindent
 | 
|  |    122 | which tells us that Isabelle has correctly inferred that
 | 
|  |    123 | \isa{xs} is a variable of list type. On the other hand, had we
 | 
|  |    124 | made a typo as in \isa{rev(re xs) = xs}, the response
 | 
|  |    125 | \par\noindent
 | 
|  |    126 | \begin{isabelle}%
 | 
| 11646 |    127 | variables:\isanewline
 | 
| 8743 |    128 | ~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline
 | 
|  |    129 | ~~xs~::~'a~list%
 | 
|  |    130 | \end{isabelle}%
 | 
|  |    131 | \par\noindent
 | 
|  |    132 | would have alerted us because of the unexpected variable \isa{re}.
 | 
| 11419 |    133 | \item[Reading terms and types:] \commdx{term}
 | 
| 8743 |    134 |   \textit{string} reads, type-checks and prints the given string as a term in
 | 
|  |    135 |   the current context; the inferred type is output as well.
 | 
| 11419 |    136 |   \commdx{typ} \textit{string} reads and prints the given
 | 
| 8743 |    137 |   string as a type in the current context.
 | 
|  |    138 | \item[(Re)loading theories:] When you start your interaction you must open a
 | 
| 11456 |    139 |   named theory with the line \commdx{theory}~\isa{T~=~\dots~:}. Isabelle
 | 
| 8771 |    140 |   automatically loads all the required parent theories from their respective
 | 
|  |    141 |   files (which may take a moment, unless the theories are already loaded and
 | 
| 9541 |    142 |   the files have not been modified).
 | 
| 8743 |    143 |   
 | 
|  |    144 |   If you suddenly discover that you need to modify a parent theory of your
 | 
| 11428 |    145 |   current theory, you must first abandon your current theory%
 | 
|  |    146 |   \indexbold{abandoning a theory}\indexbold{theories!abandoning} 
 | 
|  |    147 |   (at the shell
 | 
| 11419 |    148 |   level type \commdx{kill}). After the parent theory has
 | 
| 10971 |    149 |   been modified, you go back to your original theory. When its first line
 | 
|  |    150 |   \isa{\isacommand{theory}~T~=~\dots~:} is processed, the
 | 
| 8743 |    151 |   modified parent is reloaded automatically.
 | 
| 9541 |    152 |   
 | 
| 10978 |    153 | %  The only time when you need to load a theory by hand is when you simply
 | 
|  |    154 | %  want to check if it loads successfully without wanting to make use of the
 | 
|  |    155 | %  theory itself. This you can do by typing
 | 
| 11419 |    156 | %  \isa{\commdx{use\_thy}~"T"}.
 | 
| 8743 |    157 | \end{description}
 | 
| 12327 |    158 | Further commands are found in the Isabelle/Isar Reference
 | 
|  |    159 | Manual~\cite{isabelle-isar-ref}.
 | 
| 8743 |    160 | 
 | 
| 8771 |    161 | We now examine Isabelle's functional programming constructs systematically,
 | 
|  |    162 | starting with inductive datatypes.
 | 
|  |    163 | 
 | 
| 8743 |    164 | 
 | 
|  |    165 | \section{Datatypes}
 | 
|  |    166 | \label{sec:datatype}
 | 
|  |    167 | 
 | 
| 11456 |    168 | \index{datatypes|(}%
 | 
| 8743 |    169 | Inductive datatypes are part of almost every non-trivial application of HOL.
 | 
| 11458 |    170 | First we take another look at an important example, the datatype of
 | 
| 8743 |    171 | lists, before we turn to datatypes in general. The section closes with a
 | 
|  |    172 | case study.
 | 
|  |    173 | 
 | 
|  |    174 | 
 | 
|  |    175 | \subsection{Lists}
 | 
|  |    176 | 
 | 
| 11428 |    177 | \index{*list (type)}%
 | 
| 11457 |    178 | Lists are one of the essential datatypes in computing.  We expect that you
 | 
|  |    179 | are already familiar with their basic operations.
 | 
| 8771 |    180 | Theory \isa{ToyList} is only a small fragment of HOL's predefined theory
 | 
| 11428 |    181 | \thydx{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.
 | 
| 8743 |    182 | The latter contains many further operations. For example, the functions
 | 
| 11419 |    183 | \cdx{hd} (``head'') and \cdx{tl} (``tail'') return the first
 | 
| 8743 |    184 | element and the remainder of a list. (However, pattern-matching is usually
 | 
| 10795 |    185 | preferable to \isa{hd} and \isa{tl}.)  
 | 
| 10800 |    186 | Also available are higher-order functions like \isa{map} and \isa{filter}.
 | 
| 10795 |    187 | Theory \isa{List} also contains
 | 
| 8743 |    188 | more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates
 | 
|  |    189 | $x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}.  In the rest of the tutorial we
 | 
| 12327 |    190 | always use HOL's predefined lists by building on theory \isa{Main}.
 | 
| 8743 |    191 | 
 | 
|  |    192 | 
 | 
| 10885 |    193 | \subsection{The General Format}
 | 
| 8743 |    194 | \label{sec:general-datatype}
 | 
|  |    195 | 
 | 
|  |    196 | The general HOL \isacommand{datatype} definition is of the form
 | 
|  |    197 | \[
 | 
|  |    198 | \isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~
 | 
|  |    199 | C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~
 | 
|  |    200 | C@m~\tau@{m1}~\dots~\tau@{mk@m}
 | 
|  |    201 | \]
 | 
| 8771 |    202 | where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct
 | 
| 8743 |    203 | constructor names and $\tau@{ij}$ are types; it is customary to capitalize
 | 
|  |    204 | the first letter in constructor names. There are a number of
 | 
|  |    205 | restrictions (such as that the type should not be empty) detailed
 | 
|  |    206 | elsewhere~\cite{isabelle-HOL}. Isabelle notifies you if you violate them.
 | 
|  |    207 | 
 | 
|  |    208 | Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and
 | 
|  |    209 | \isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically
 | 
|  |    210 | during proofs by simplification.  The same is true for the equations in
 | 
|  |    211 | primitive recursive function definitions.
 | 
|  |    212 | 
 | 
| 12327 |    213 | Every\footnote{Except for advanced datatypes where the recursion involves
 | 
|  |    214 | ``\isasymRightarrow'' as in {\S}\ref{sec:nested-fun-datatype}.} datatype $t$
 | 
|  |    215 | comes equipped with a \isa{size} function from $t$ into the natural numbers
 | 
|  |    216 | (see~{\S}\ref{sec:nat} below). For lists, \isa{size} is just the length, i.e.\
 | 
|  |    217 | \isa{size [] = 0} and \isa{size(x \# xs) = size xs + 1}.  In general,
 | 
|  |    218 | \cdx{size} returns
 | 
| 11456 |    219 | \begin{itemize}
 | 
| 11457 |    220 | \item zero for all constructors
 | 
| 11456 |    221 | that do not have an argument of type $t$\\
 | 
| 11457 |    222 | \item one plus the sum of the sizes of all arguments of type~$t$,
 | 
|  |    223 | for all other constructors
 | 
| 11456 |    224 | \end{itemize}
 | 
|  |    225 | Note that because
 | 
| 9644 |    226 | \isa{size} is defined on every datatype, it is overloaded; on lists
 | 
| 11419 |    227 | \isa{size} is also called \sdx{length}, which is not overloaded.
 | 
| 10795 |    228 | Isabelle will always show \isa{size} on lists as \isa{length}.
 | 
| 9644 |    229 | 
 | 
|  |    230 | 
 | 
| 10885 |    231 | \subsection{Primitive Recursion}
 | 
| 8743 |    232 | 
 | 
| 11456 |    233 | \index{recursion!primitive}%
 | 
| 8743 |    234 | Functions on datatypes are usually defined by recursion. In fact, most of the
 | 
| 11428 |    235 | time they are defined by what is called \textbf{primitive recursion}.
 | 
|  |    236 | The keyword \commdx{primrec} is followed by a list of
 | 
| 8743 |    237 | equations
 | 
|  |    238 | \[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]
 | 
|  |    239 | such that $C$ is a constructor of the datatype $t$ and all recursive calls of
 | 
|  |    240 | $f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus
 | 
|  |    241 | Isabelle immediately sees that $f$ terminates because one (fixed!) argument
 | 
| 10654 |    242 | becomes smaller with every recursive call. There must be at most one equation
 | 
| 8743 |    243 | for each constructor.  Their order is immaterial.
 | 
| 8771 |    244 | A more general method for defining total recursive functions is introduced in
 | 
| 10538 |    245 | {\S}\ref{sec:recdef}.
 | 
| 8743 |    246 | 
 | 
| 9493 |    247 | \begin{exercise}\label{ex:Tree}
 | 
| 8743 |    248 | \input{Misc/document/Tree.tex}%
 | 
|  |    249 | \end{exercise}
 | 
|  |    250 | 
 | 
| 9721 |    251 | \input{Misc/document/case_exprs.tex}
 | 
| 8743 |    252 | 
 | 
|  |    253 | \input{Ifexpr/document/Ifexpr.tex}
 | 
| 11456 |    254 | \index{datatypes|)}
 | 
|  |    255 | 
 | 
| 8743 |    256 | 
 | 
| 10885 |    257 | \section{Some Basic Types}
 | 
| 8743 |    258 | 
 | 
| 11457 |    259 | This section introduces the types of natural numbers and ordered pairs.  Also
 | 
|  |    260 | described is type \isa{option}, which is useful for modelling exceptional
 | 
|  |    261 | cases. 
 | 
| 8743 |    262 | 
 | 
| 10885 |    263 | \subsection{Natural Numbers}
 | 
| 11456 |    264 | \label{sec:nat}\index{natural numbers}%
 | 
| 11428 |    265 | \index{linear arithmetic|(}
 | 
| 8743 |    266 | 
 | 
|  |    267 | \input{Misc/document/fakenat.tex}
 | 
|  |    268 | \input{Misc/document/natsum.tex}
 | 
|  |    269 | 
 | 
| 11428 |    270 | \index{linear arithmetic|)}
 | 
| 8743 |    271 | 
 | 
|  |    272 | 
 | 
| 10396 |    273 | \subsection{Pairs}
 | 
| 9541 |    274 | \input{Misc/document/pairs.tex}
 | 
| 8743 |    275 | 
 | 
| 10608 |    276 | \subsection{Datatype {\tt\slshape option}}
 | 
| 10543 |    277 | \label{sec:option}
 | 
|  |    278 | \input{Misc/document/Option2.tex}
 | 
|  |    279 | 
 | 
| 8743 |    280 | \section{Definitions}
 | 
|  |    281 | \label{sec:Definitions}
 | 
|  |    282 | 
 | 
|  |    283 | A definition is simply an abbreviation, i.e.\ a new name for an existing
 | 
|  |    284 | construction. In particular, definitions cannot be recursive. Isabelle offers
 | 
|  |    285 | definitions on the level of types and terms. Those on the type level are
 | 
| 11456 |    286 | called \textbf{type synonyms}; those on the term level are simply called 
 | 
| 8743 |    287 | definitions.
 | 
|  |    288 | 
 | 
|  |    289 | 
 | 
| 10885 |    290 | \subsection{Type Synonyms}
 | 
| 8743 |    291 | 
 | 
| 12327 |    292 | \index{type synonyms}%
 | 
| 11456 |    293 | Type synonyms are similar to those found in ML\@. They are created by a 
 | 
| 11428 |    294 | \commdx{types} command:
 | 
| 8743 |    295 | 
 | 
| 12327 |    296 | \input{Misc/document/types.tex}
 | 
| 8743 |    297 | 
 | 
| 9844 |    298 | \input{Misc/document/prime_def.tex}
 | 
| 8743 |    299 | 
 | 
|  |    300 | 
 | 
| 11201 |    301 | \section{The Definitional Approach}
 | 
|  |    302 | \label{sec:definitional}
 | 
|  |    303 | 
 | 
| 11457 |    304 | \index{Definitional Approach}%
 | 
| 11201 |    305 | As we pointed out at the beginning of the chapter, asserting arbitrary
 | 
| 11456 |    306 | axioms such as $f(n) = f(n) + 1$ can easily lead to contradictions. In order
 | 
| 11457 |    307 | to avoid this danger, we advocate the definitional rather than
 | 
| 11456 |    308 | the axiomatic approach: introduce new concepts by definitions. However,  Isabelle/HOL seems to
 | 
|  |    309 | support many richer definitional constructs, such as
 | 
|  |    310 | \isacommand{primrec}. The point is that Isabelle reduces such constructs to first principles. For example, each
 | 
|  |    311 | \isacommand{primrec} function definition is turned into a proper
 | 
|  |    312 | (nonrecursive!) definition from which the user-supplied recursion equations are
 | 
| 11457 |    313 | automatically proved.  This process is
 | 
| 11456 |    314 | hidden from the user, who does not have to understand the details.  Other commands described
 | 
| 11457 |    315 | later, like \isacommand{recdef} and \isacommand{inductive}, work similarly.  
 | 
|  |    316 | This strict adherence to the definitional approach reduces the risk of 
 | 
|  |    317 | soundness errors.
 | 
| 11201 |    318 | 
 | 
| 8743 |    319 | \chapter{More Functional Programming}
 | 
|  |    320 | 
 | 
| 11458 |    321 | The purpose of this chapter is to deepen your understanding of the
 | 
| 8771 |    322 | concepts encountered so far and to introduce advanced forms of datatypes and
 | 
|  |    323 | recursive functions. The first two sections give a structured presentation of
 | 
| 10538 |    324 | theorem proving by simplification ({\S}\ref{sec:Simplification}) and discuss
 | 
| 11458 |    325 | important heuristics for induction ({\S}\ref{sec:InductionHeuristics}).  You can
 | 
|  |    326 | skip them if you are not planning to perform proofs yourself.
 | 
|  |    327 | We then present a case
 | 
|  |    328 | study: a compiler for expressions ({\S}\ref{sec:ExprCompiler}). Advanced
 | 
| 8771 |    329 | datatypes, including those involving function spaces, are covered in
 | 
| 11458 |    330 | {\S}\ref{sec:advanced-datatypes}; it closes with another case study, search
 | 
|  |    331 | trees (``tries'').  Finally we introduce \isacommand{recdef}, a general
 | 
|  |    332 | form of recursive function definition that goes well beyond 
 | 
|  |    333 | \isacommand{primrec} ({\S}\ref{sec:recdef}).
 | 
| 8743 |    334 | 
 | 
|  |    335 | 
 | 
|  |    336 | \section{Simplification}
 | 
|  |    337 | \label{sec:Simplification}
 | 
|  |    338 | \index{simplification|(}
 | 
|  |    339 | 
 | 
| 10795 |    340 | So far we have proved our theorems by \isa{auto}, which simplifies
 | 
| 11458 |    341 | all subgoals. In fact, \isa{auto} can do much more than that. 
 | 
|  |    342 | To go beyond toy examples, you
 | 
| 9541 |    343 | need to understand the ingredients of \isa{auto}.  This section covers the
 | 
| 10971 |    344 | method that \isa{auto} always applies first, simplification.
 | 
| 8743 |    345 | 
 | 
|  |    346 | Simplification is one of the central theorem proving tools in Isabelle and
 | 
| 11458 |    347 | many other systems. The tool itself is called the \textbf{simplifier}. 
 | 
|  |    348 | This section introduces the many features of the simplifier
 | 
|  |    349 | and is required reading if you intend to perform proofs.  Later on,
 | 
|  |    350 | {\S}\ref{sec:simplification-II} explains some more advanced features and a
 | 
| 9754 |    351 | little bit of how the simplifier works. The serious student should read that
 | 
| 11458 |    352 | section as well, in particular to understand why the simplifier did
 | 
|  |    353 | something unexpected.
 | 
| 8743 |    354 | 
 | 
| 11458 |    355 | \subsection{What is Simplification?}
 | 
| 9458 |    356 | 
 | 
| 8743 |    357 | In its most basic form, simplification means repeated application of
 | 
|  |    358 | equations from left to right. For example, taking the rules for \isa{\at}
 | 
|  |    359 | and applying them to the term \isa{[0,1] \at\ []} results in a sequence of
 | 
|  |    360 | simplification steps:
 | 
|  |    361 | \begin{ttbox}\makeatother
 | 
|  |    362 | (0#1#[]) @ []  \(\leadsto\)  0#((1#[]) @ [])  \(\leadsto\)  0#(1#([] @ []))  \(\leadsto\)  0#1#[]
 | 
|  |    363 | \end{ttbox}
 | 
| 9933 |    364 | This is also known as \bfindex{term rewriting}\indexbold{rewriting} and the
 | 
| 11458 |    365 | equations are referred to as \bfindex{rewrite rules}.
 | 
| 9933 |    366 | ``Rewriting'' is more honest than ``simplification'' because the terms do not
 | 
|  |    367 | necessarily become simpler in the process.
 | 
| 8743 |    368 | 
 | 
| 11458 |    369 | The simplifier proves arithmetic goals as described in
 | 
|  |    370 | {\S}\ref{sec:nat} above.  Arithmetic expressions are simplified using built-in
 | 
|  |    371 | procedures that go beyond mere rewrite rules.  New simplification procedures
 | 
|  |    372 | can be coded and installed, but they are definitely not a matter for this
 | 
|  |    373 | tutorial. 
 | 
|  |    374 | 
 | 
| 9844 |    375 | \input{Misc/document/simp.tex}
 | 
| 8743 |    376 | 
 | 
|  |    377 | \index{simplification|)}
 | 
|  |    378 | 
 | 
| 9844 |    379 | \input{Misc/document/Itrev.tex}
 | 
| 13305 |    380 | \begin{exercise}
 | 
|  |    381 | \input{Misc/document/Plus.tex}%
 | 
|  |    382 | \end{exercise}
 | 
| 9493 |    383 | \begin{exercise}
 | 
|  |    384 | \input{Misc/document/Tree2.tex}%
 | 
|  |    385 | \end{exercise}
 | 
| 8743 |    386 | 
 | 
| 9844 |    387 | \input{CodeGen/document/CodeGen.tex}
 | 
| 8743 |    388 | 
 | 
|  |    389 | 
 | 
| 10885 |    390 | \section{Advanced Datatypes}
 | 
| 8743 |    391 | \label{sec:advanced-datatypes}
 | 
| 11428 |    392 | \index{datatype@\isacommand {datatype} (command)|(}
 | 
|  |    393 | \index{primrec@\isacommand {primrec} (command)|(}
 | 
| 8743 |    394 | %|)
 | 
|  |    395 | 
 | 
| 11428 |    396 | This section presents advanced forms of datatypes: mutual and nested
 | 
|  |    397 | recursion.  A series of examples will culminate in a treatment of the trie
 | 
|  |    398 | data structure.
 | 
|  |    399 | 
 | 
| 8743 |    400 | 
 | 
| 10885 |    401 | \subsection{Mutual Recursion}
 | 
| 8743 |    402 | \label{sec:datatype-mut-rec}
 | 
|  |    403 | 
 | 
|  |    404 | \input{Datatype/document/ABexpr.tex}
 | 
|  |    405 | 
 | 
| 10885 |    406 | \subsection{Nested Recursion}
 | 
| 9644 |    407 | \label{sec:nested-datatype}
 | 
| 8743 |    408 | 
 | 
| 9644 |    409 | {\makeatother\input{Datatype/document/Nested.tex}}
 | 
| 8743 |    410 | 
 | 
|  |    411 | 
 | 
| 11419 |    412 | \subsection{The Limits of Nested Recursion}
 | 
| 12327 |    413 | \label{sec:nested-fun-datatype}
 | 
| 11419 |    414 | 
 | 
|  |    415 | How far can we push nested recursion? By the unfolding argument above, we can
 | 
|  |    416 | reduce nested to mutual recursion provided the nested recursion only involves
 | 
|  |    417 | previously defined datatypes. This does not include functions:
 | 
|  |    418 | \begin{isabelle}
 | 
|  |    419 | \isacommand{datatype} t = C "t \isasymRightarrow\ bool"
 | 
|  |    420 | \end{isabelle}
 | 
|  |    421 | This declaration is a real can of worms.
 | 
|  |    422 | In HOL it must be ruled out because it requires a type
 | 
|  |    423 | \isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the
 | 
|  |    424 | same cardinality --- an impossibility. For the same reason it is not possible
 | 
| 12328 |    425 | to allow recursion involving the type \isa{t set}, which is isomorphic to
 | 
| 11419 |    426 | \isa{t \isasymFun\ bool}.
 | 
|  |    427 | 
 | 
|  |    428 | Fortunately, a limited form of recursion
 | 
|  |    429 | involving function spaces is permitted: the recursive type may occur on the
 | 
|  |    430 | right of a function arrow, but never on the left. Hence the above can of worms
 | 
| 11458 |    431 | is ruled out but the following example of a potentially 
 | 
|  |    432 | \index{infinitely branching trees}%
 | 
|  |    433 | infinitely branching tree is accepted:
 | 
| 11419 |    434 | \smallskip
 | 
|  |    435 | 
 | 
|  |    436 | \input{Datatype/document/Fundata.tex}
 | 
|  |    437 | 
 | 
|  |    438 | If you need nested recursion on the left of a function arrow, there are
 | 
|  |    439 | alternatives to pure HOL\@.  In the Logic for Computable Functions 
 | 
| 11458 |    440 | (\rmindex{LCF}), types like
 | 
| 11419 |    441 | \begin{isabelle}
 | 
|  |    442 | \isacommand{datatype} lam = C "lam \isasymrightarrow\ lam"
 | 
|  |    443 | \end{isabelle}
 | 
|  |    444 | do indeed make sense~\cite{paulson87}.  Note the different arrow,
 | 
|  |    445 | \isa{\isasymrightarrow} instead of \isa{\isasymRightarrow},
 | 
|  |    446 | expressing the type of \emph{continuous} functions. 
 | 
|  |    447 | There is even a version of LCF on top of HOL,
 | 
| 11458 |    448 | called \rmindex{HOLCF}~\cite{MuellerNvOS99}.
 | 
| 11428 |    449 | \index{datatype@\isacommand {datatype} (command)|)}
 | 
|  |    450 | \index{primrec@\isacommand {primrec} (command)|)}
 | 
| 11419 |    451 | 
 | 
|  |    452 | 
 | 
|  |    453 | \subsection{Case Study: Tries}
 | 
|  |    454 | \label{sec:Trie}
 | 
|  |    455 | 
 | 
| 11458 |    456 | \index{tries|(}%
 | 
| 11419 |    457 | Tries are a classic search tree data structure~\cite{Knuth3-75} for fast
 | 
|  |    458 | indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a
 | 
|  |    459 | trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and
 | 
|  |    460 | ``cat''.  When searching a string in a trie, the letters of the string are
 | 
|  |    461 | examined sequentially. Each letter determines which subtrie to search next.
 | 
|  |    462 | In this case study we model tries as a datatype, define a lookup and an
 | 
|  |    463 | update function, and prove that they behave as expected.
 | 
|  |    464 | 
 | 
|  |    465 | \begin{figure}[htbp]
 | 
|  |    466 | \begin{center}
 | 
| 8743 |    467 | \unitlength1mm
 | 
|  |    468 | \begin{picture}(60,30)
 | 
|  |    469 | \put( 5, 0){\makebox(0,0)[b]{l}}
 | 
|  |    470 | \put(25, 0){\makebox(0,0)[b]{e}}
 | 
|  |    471 | \put(35, 0){\makebox(0,0)[b]{n}}
 | 
|  |    472 | \put(45, 0){\makebox(0,0)[b]{r}}
 | 
|  |    473 | \put(55, 0){\makebox(0,0)[b]{t}}
 | 
|  |    474 | %
 | 
|  |    475 | \put( 5, 9){\line(0,-1){5}}
 | 
|  |    476 | \put(25, 9){\line(0,-1){5}}
 | 
|  |    477 | \put(44, 9){\line(-3,-2){9}}
 | 
|  |    478 | \put(45, 9){\line(0,-1){5}}
 | 
|  |    479 | \put(46, 9){\line(3,-2){9}}
 | 
|  |    480 | %
 | 
|  |    481 | \put( 5,10){\makebox(0,0)[b]{l}}
 | 
|  |    482 | \put(15,10){\makebox(0,0)[b]{n}}
 | 
| 11419 |    483 | \put(25,10){\makebox(0,0)[b]{p}}
 | 
|  |    484 | \put(45,10){\makebox(0,0)[b]{a}}
 | 
|  |    485 | %
 | 
|  |    486 | \put(14,19){\line(-3,-2){9}}
 | 
|  |    487 | \put(15,19){\line(0,-1){5}}
 | 
|  |    488 | \put(16,19){\line(3,-2){9}}
 | 
|  |    489 | \put(45,19){\line(0,-1){5}}
 | 
|  |    490 | %
 | 
|  |    491 | \put(15,20){\makebox(0,0)[b]{a}}
 | 
|  |    492 | \put(45,20){\makebox(0,0)[b]{c}}
 | 
|  |    493 | %
 | 
|  |    494 | \put(30,30){\line(-3,-2){13}}
 | 
|  |    495 | \put(30,30){\line(3,-2){13}}
 | 
|  |    496 | \end{picture}
 | 
|  |    497 | \end{center}
 | 
| 11450 |    498 | \caption{A Sample Trie}
 | 
| 11419 |    499 | \label{fig:trie}
 | 
|  |    500 | \end{figure}
 | 
|  |    501 | 
 | 
|  |    502 | Proper tries associate some value with each string. Since the
 | 
|  |    503 | information is stored only in the final node associated with the string, many
 | 
|  |    504 | nodes do not carry any value. This distinction is modeled with the help
 | 
|  |    505 | of the predefined datatype \isa{option} (see {\S}\ref{sec:option}).
 | 
|  |    506 | \input{Trie/document/Trie.tex}
 | 
| 11458 |    507 | \index{tries|)}
 | 
| 11419 |    508 | 
 | 
|  |    509 | \section{Total Recursive Functions}
 | 
|  |    510 | \label{sec:recdef}
 | 
| 11428 |    511 | \index{recdef@\isacommand {recdef} (command)|(}\index{functions!total|(}
 | 
| 11419 |    512 | 
 | 
|  |    513 | Although many total functions have a natural primitive recursive definition,
 | 
|  |    514 | this is not always the case. Arbitrary total recursive functions can be
 | 
|  |    515 | defined by means of \isacommand{recdef}: you can use full pattern-matching,
 | 
|  |    516 | recursion need not involve datatypes, and termination is proved by showing
 | 
|  |    517 | that the arguments of all recursive calls are smaller in a suitable (user
 | 
|  |    518 | supplied) sense. In this section we restrict ourselves to measure functions;
 | 
|  |    519 | more advanced termination proofs are discussed in {\S}\ref{sec:beyond-measure}.
 | 
|  |    520 | 
 | 
|  |    521 | \subsection{Defining Recursive Functions}
 | 
|  |    522 | \label{sec:recdef-examples}
 | 
|  |    523 | \input{Recdef/document/examples.tex}
 | 
|  |    524 | 
 | 
|  |    525 | \subsection{Proving Termination}
 | 
|  |    526 | \input{Recdef/document/termination.tex}
 | 
|  |    527 | 
 | 
| 11458 |    528 | \subsection{Simplification and Recursive Functions}
 | 
| 11419 |    529 | \label{sec:recdef-simplification}
 | 
|  |    530 | \input{Recdef/document/simplification.tex}
 | 
|  |    531 | 
 | 
| 11458 |    532 | \subsection{Induction and Recursive Functions}
 | 
| 11419 |    533 | \index{induction!recursion|(}
 | 
| 8743 |    534 | \index{recursion induction|(}
 | 
|  |    535 | 
 | 
|  |    536 | \input{Recdef/document/Induction.tex}
 | 
| 9644 |    537 | \label{sec:recdef-induction}
 | 
| 8743 |    538 | 
 | 
|  |    539 | \index{induction!recursion|)}
 | 
|  |    540 | \index{recursion induction|)}
 | 
| 11428 |    541 | \index{recdef@\isacommand {recdef} (command)|)}\index{functions!total|)}
 |