8743

1 
\chapter{Functional Programming in HOL}


2 


3 
Although on the surface this chapter is mainly concerned with how to write


4 
functional programs in HOL and how to verify them, most of the


5 
constructs and proof procedures introduced are general purpose and recur in


6 
any specification or verification task.


7 


8 
The dedicated functional programmer should be warned: HOL offers only {\em


9 
total functional programming}  all functions in HOL must be total; lazy


10 
data structures are not directly available. On the positive side, functions


11 
in HOL need not be computable: HOL is a specification language that goes well


12 
beyond what can be expressed as a program. However, for the time being we


13 
concentrate on the computable.


14 


15 
\section{An introductory theory}


16 
\label{sec:introtheory}


17 


18 
Functional programming needs datatypes and functions. Both of them can be


19 
defined in a theory with a syntax reminiscent of languages like ML or


20 
Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.


21 
We will now examine it line by line.


22 


23 
\begin{figure}[htbp]


24 
\begin{ttbox}\makeatother


25 
\input{ToyList2/ToyList1}\end{ttbox}


26 
\caption{A theory of lists}


27 
\label{fig:ToyList}


28 
\end{figure}


29 


30 
{\makeatother\input{ToyList/document/ToyList.tex}}


31 


32 
The complete proof script is shown in figure~\ref{fig:ToyListproofs}. The


33 
concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyListproofs}


34 
constitutes the complete theory \texttt{ToyList} and should reside in file


35 
\texttt{ToyList.thy}. It is good practice to present all declarations and


36 
definitions at the beginning of a theory to facilitate browsing.


37 


38 
\begin{figure}[htbp]


39 
\begin{ttbox}\makeatother


40 
\input{ToyList2/ToyList2}\end{ttbox}


41 
\caption{Proofs about lists}


42 
\label{fig:ToyListproofs}


43 
\end{figure}


44 


45 
\subsubsection*{Review}


46 


47 
This is the end of our toy proof. It should have familiarized you with


48 
\begin{itemize}


49 
\item the standard theorem proving procedure:


50 
state a goal (lemma or theorem); proceed with proof until a separate lemma is


51 
required; prove that lemma; come back to the original goal.


52 
\item a specific procedure that works well for functional programs:


53 
induction followed by allout simplification via \isa{auto}.


54 
\item a basic repertoire of proof commands.


55 
\end{itemize}


56 


57 


58 
\section{Some helpful commands}


59 
\label{sec:commandsandhints}


60 


61 
This section discusses a few basic commands for manipulating the proof state


62 
and can be skipped by casual readers.


63 


64 
There are two kinds of commands used during a proof: the actual proof


65 
commands and auxiliary commands for examining the proof state and controlling


66 
the display. Simple proof commands are of the form


67 
\isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a


68 
synonym for ``theorem proving function''. Typical examples are


69 
\isa{induct_tac} and \isa{auto}. Further methods are introduced throughout


70 
the tutorial. Unless stated otherwise you may assume that a method attacks


71 
merely the first subgoal. An exception is \isa{auto} which tries to solve all


72 
subgoals.


73 


74 
The most useful auxiliary commands are:


75 
\begin{description}


76 
\item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the


77 
last command; \isacommand{undo} can be undone by


78 
\isacommand{redo}\indexbold{*redo}. Both are only needed at the shell


79 
level and should not occur in the final theory.


80 
\item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays


81 
the current proof state, for example when it has disappeared off the


82 
screen.


83 
\item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to


84 
print only the first $n$ subgoals from now on and redisplays the current


85 
proof state. This is helpful when there are many subgoals.


86 
\item[Modifying the order of subgoals:]


87 
\isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and


88 
\isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.


89 
\item[Printing theorems:]


90 
\isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$


91 
prints the named theorems.


92 
\item[Displaying types:] We have already mentioned the flag


93 
\ttindex{show_types} above. It can also be useful for detecting typos in


94 
formulae early on. For example, if \texttt{show_types} is set and the goal


95 
\isa{rev(rev xs) = xs} is started, Isabelle prints the additional output


96 
\par\noindent


97 
\begin{isabelle}%


98 
Variables:\isanewline


99 
~~xs~::~'a~list


100 
\end{isabelle}%


101 
\par\noindent


102 
which tells us that Isabelle has correctly inferred that


103 
\isa{xs} is a variable of list type. On the other hand, had we


104 
made a typo as in \isa{rev(re xs) = xs}, the response


105 
\par\noindent


106 
\begin{isabelle}%


107 
Variables:\isanewline


108 
~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline


109 
~~xs~::~'a~list%


110 
\end{isabelle}%


111 
\par\noindent


112 
would have alerted us because of the unexpected variable \isa{re}.


113 
\item[Reading terms and types:] \isacommand{term}\indexbold{*term}


114 
\textit{string} reads, typechecks and prints the given string as a term in


115 
the current context; the inferred type is output as well.


116 
\isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given


117 
string as a type in the current context.


118 
\item[(Re)loading theories:] When you start your interaction you must open a

8771

119 
named theory with the line \isa{\isacommand{theory}~T~=~\dots~:}. Isabelle


120 
automatically loads all the required parent theories from their respective


121 
files (which may take a moment, unless the theories are already loaded and


122 
the files have not been modified). The only time when you need to load a


123 
theory by hand is when you simply want to check if it loads successfully


124 
without wanting to make use of the theory itself. This you can do by typing


125 
\isa{\isacommand{use\_thy}\indexbold{*use_thy}~"T"}.

8743

126 


127 
If you suddenly discover that you need to modify a parent theory of your


128 
current theory you must first abandon your current theory (at the shell


129 
level type \isacommand{kill}\indexbold{*kill}). After the parent theory has


130 
been modified you go back to your original theory. When its first line


131 
\isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the


132 
modified parent is reloaded automatically.


133 
\end{description}


134 
Further commands are found in the Isabelle/Isar Reference Manual.


135 

8771

136 
We now examine Isabelle's functional programming constructs systematically,


137 
starting with inductive datatypes.


138 

8743

139 


140 
\section{Datatypes}


141 
\label{sec:datatype}


142 


143 
Inductive datatypes are part of almost every nontrivial application of HOL.


144 
First we take another look at a very important example, the datatype of


145 
lists, before we turn to datatypes in general. The section closes with a


146 
case study.


147 


148 


149 
\subsection{Lists}


150 
\indexbold{*list}


151 


152 
Lists are one of the essential datatypes in computing. Readers of this


153 
tutorial and users of HOL need to be familiar with their basic operations.

8771

154 
Theory \isa{ToyList} is only a small fragment of HOL's predefined theory


155 
\isa{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.

8743

156 
The latter contains many further operations. For example, the functions

8771

157 
\isaindexbold{hd} (``head'') and \isaindexbold{tl} (``tail'') return the first

8743

158 
element and the remainder of a list. (However, patternmatching is usually

8771

159 
preferable to \isa{hd} and \isa{tl}.) Theory \isa{List} also contains

8743

160 
more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates


161 
$x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we


162 
always use HOL's predefined lists.


163 


164 


165 
\subsection{The general format}


166 
\label{sec:generaldatatype}


167 


168 
The general HOL \isacommand{datatype} definition is of the form


169 
\[


170 
\isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~


171 
C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~


172 
C@m~\tau@{m1}~\dots~\tau@{mk@m}


173 
\]

8771

174 
where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct

8743

175 
constructor names and $\tau@{ij}$ are types; it is customary to capitalize


176 
the first letter in constructor names. There are a number of


177 
restrictions (such as that the type should not be empty) detailed


178 
elsewhere~\cite{isabelleHOL}. Isabelle notifies you if you violate them.


179 


180 
Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and


181 
\isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically


182 
during proofs by simplification. The same is true for the equations in


183 
primitive recursive function definitions.


184 


185 
\subsection{Primitive recursion}


186 


187 
Functions on datatypes are usually defined by recursion. In fact, most of the


188 
time they are defined by what is called \bfindex{primitive recursion}.


189 
The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of


190 
equations


191 
\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]


192 
such that $C$ is a constructor of the datatype $t$ and all recursive calls of


193 
$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus


194 
Isabelle immediately sees that $f$ terminates because one (fixed!) argument


195 
becomes smaller with every recursive call. There must be exactly one equation


196 
for each constructor. Their order is immaterial.

8771

197 
A more general method for defining total recursive functions is introduced in

8743

198 
\S\ref{sec:recdef}.


199 


200 
\begin{exercise}


201 
\input{Misc/document/Tree.tex}%


202 
\end{exercise}


203 


204 
\subsection{Case expressions}


205 
\label{sec:caseexpressions}


206 


207 
HOL also features \isaindexbold{case}expressions for analyzing


208 
elements of a datatype. For example,


209 
% case xs of [] => 0  y#ys => y


210 
\begin{isabellepar}%


211 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~y~\#~ys~{\isasymRightarrow}~y


212 
\end{isabellepar}%


213 
evaluates to \isa{0} if \isa{xs} is \isa{[]} and to \isa{y} if


214 
\isa{xs} is \isa{y\#ys}. (Since the result in both branches must be of


215 
the same type, it follows that \isa{y::nat} and hence


216 
\isa{xs::(nat)list}.)


217 


218 
In general, if $e$ is a term of the datatype $t$ defined in


219 
\S\ref{sec:generaldatatype} above, the corresponding


220 
\isa{case}expression analyzing $e$ is


221 
\[


222 
\begin{array}{rrcl}


223 
\isa{case}~e~\isa{of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\


224 
\vdots \\


225 
\mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m


226 
\end{array}


227 
\]


228 


229 
\begin{warn}


230 
{\em All} constructors must be present, their order is fixed, and nested


231 
patterns are not supported. Violating these restrictions results in strange


232 
error messages.


233 
\end{warn}


234 
\noindent


235 
Nested patterns can be simulated by nested \isa{case}expressions: instead


236 
of


237 
% case xs of [] => 0  [x] => x  x#(y#zs) => y


238 
\begin{isabellepar}%


239 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~[x]~{\isasymRightarrow}~x~~x~\#~y~\#~zs~{\isasymRightarrow}~y


240 
\end{isabellepar}%


241 
write


242 
% term(latex xsymbols symbols)"case xs of [] => 0  x#ys => (case ys of [] => x  y#zs => y)";


243 
\begin{isabellepar}%


244 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~x~\#~ys~{\isasymRightarrow}~(case~ys~of~[]~{\isasymRightarrow}~x~~y~\#~zs~{\isasymRightarrow}~y)%


245 
\end{isabellepar}%


246 
Note that \isa{case}expressions should be enclosed in parentheses to


247 
indicate their scope.


248 


249 
\subsection{Structural induction and case distinction}


250 
\indexbold{structural induction}


251 
\indexbold{induction!structural}


252 
\indexbold{case distinction}


253 


254 
Almost all the basic laws about a datatype are applied automatically during


255 
simplification. Only induction is invoked by hand via \isaindex{induct_tac},


256 
which works for any datatype. In some cases, induction is overkill and a case


257 
distinction over all constructors of the datatype suffices. This is performed


258 
by \isaindexbold{case_tac}. A trivial example:


259 


260 
\input{Misc/document/cases.tex}%


261 


262 
Note that we do not need to give a lemma a name if we do not intend to refer


263 
to it explicitly in the future.


264 


265 
\begin{warn}


266 
Induction is only allowed on free (or \isasymAndbound) variables that


267 
should not occur among the assumptions of the subgoal. Case distinction


268 
(\isa{case_tac}) works for arbitrary terms, which need to be


269 
quoted if they are nonatomic.


270 
\end{warn}


271 


272 


273 
\subsection{Case study: boolean expressions}


274 
\label{sec:boolex}


275 


276 
The aim of this case study is twofold: it shows how to model boolean


277 
expressions and some algorithms for manipulating them, and it demonstrates


278 
the constructs introduced above.


279 


280 
\input{Ifexpr/document/Ifexpr.tex}

8771

281 
\medskip

8743

282 


283 
How does one come up with the required lemmas? Try to prove the main theorems


284 
without them and study carefully what \isa{auto} leaves unproved. This has


285 
to provide the clue. The necessity of universal quantification


286 
(\isa{\isasymforall{}t e}) in the two lemmas is explained in


287 
\S\ref{sec:InductionHeuristics}


288 


289 
\begin{exercise}


290 
We strengthen the definition of a \isa{normal} Ifexpression as follows:


291 
the first argument of all \isa{IF}s must be a variable. Adapt the above


292 
development to this changed requirement. (Hint: you may need to formulate


293 
some of the goals as implications (\isasymimp) rather than equalities


294 
(\isa{=}).)


295 
\end{exercise}


296 


297 
\section{Some basic types}


298 


299 


300 
\subsection{Natural numbers}


301 
\index{arithmetic(}


302 


303 
\input{Misc/document/fakenat.tex}


304 
\input{Misc/document/natsum.tex}


305 


306 
The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun},


307 
\ttindexboldpos{}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},


308 
\isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and


309 
\isaindexbold{max} are predefined, as are the relations


310 
\indexboldpos{\isasymle}{$HOL2arithrel} and


311 
\ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation


312 
\isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although


313 
Isabelle does not prove this completely automatically. Note that \isa{1} and


314 
\isa{2} are available as abbreviations for the corresponding


315 
\isa{Suc}expressions. If you need the full set of numerals,


316 
see~\S\ref{natnumerals}.


317 


318 
\begin{warn}


319 
The operations \ttindexboldpos{+}{$HOL2arithfun},


320 
\ttindexboldpos{}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},


321 
\isaindexbold{min}, \isaindexbold{max},


322 
\indexboldpos{\isasymle}{$HOL2arithrel} and


323 
\ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available


324 
not just for natural numbers but at other types as well (see


325 
\S\ref{sec:TypeClasses}). For example, given the goal \isa{x+y = y+x},


326 
there is nothing to indicate that you are talking about natural numbers.


327 
Hence Isabelle can only infer that \isa{x} and \isa{y} are of some


328 
arbitrary type where \isa{+} is declared. As a consequence, you will be


329 
unable to prove the goal (although it may take you some time to realize


330 
what has happened if \texttt{show_types} is not set). In this particular


331 
example, you need to include an explicit type constraint, for example


332 
\isa{x+y = y+(x::nat)}. If there is enough contextual information this may


333 
not be necessary: \isa{x+0 = x} automatically implies \isa{x::nat}.


334 
\end{warn}


335 


336 
Simple arithmetic goals are proved automatically by both \isa{auto}


337 
and the simplification methods introduced in \S\ref{sec:Simplification}. For


338 
example,


339 


340 
\input{Misc/document/arith1.tex}%


341 
is proved automatically. The main restriction is that only addition is taken


342 
into account; other arithmetic operations and quantified formulae are ignored.


343 


344 
For more complex goals, there is the special method


345 
\isaindexbold{arith} (which attacks the first subgoal). It proves


346 
arithmetic goals involving the usual logical connectives (\isasymnot,


347 
\isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and


348 
the operations \isa{+}, \isa{}, \isa{min} and \isa{max}. For example,


349 


350 
\input{Misc/document/arith2.tex}%


351 
succeeds because \isa{k*k} can be treated as atomic.


352 
In contrast,


353 


354 
\input{Misc/document/arith3.tex}%


355 
is not even proved by \isa{arith} because the proof relies essentially


356 
on properties of multiplication.


357 


358 
\begin{warn}


359 
The running time of \isa{arith} is exponential in the number of occurrences


360 
of \ttindexboldpos{}{$HOL2arithfun}, \isaindexbold{min} and


361 
\isaindexbold{max} because they are first eliminated by case distinctions.


362 


363 
\isa{arith} is incomplete even for the restricted class of formulae


364 
described above (known as ``linear arithmetic''). If divisibility plays a


365 
role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$.


366 
Fortunately, such examples are rare in practice.


367 
\end{warn}


368 


369 
\index{arithmetic)}


370 


371 


372 
\subsection{Products}


373 

8771

374 
HOL also has pairs: \isa{($a@1$,$a@2$)} is of type \isa{$\tau@1$ *

8743

375 
$\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair


376 
are extracted by \isa{fst} and \isa{snd}: \isa{fst($x$,$y$) = $x$} and


377 
\isa{snd($x$,$y$) = $y$}. Tuples are simulated by pairs nested to the right:


378 
\isa{($a@1$,$a@2$,$a@3$)} stands for \isa{($a@1$,($a@2$,$a@3$))} and


379 
\isa{$\tau@1$ * $\tau@2$ * $\tau@3$} for \isa{$\tau@1$ * ($\tau@2$ *


380 
$\tau@3$)}. Therefore we have \isa{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.


381 


382 
It is possible to use (nested) tuples as patterns in abstractions, for


383 
example \isa{\isasymlambda(x,y,z).x+y+z} and


384 
\isa{\isasymlambda((x,y),z).x+y+z}.


385 
In addition to explicit $\lambda$abstractions, tuple patterns can be used in


386 
most variable binding constructs. Typical examples are


387 


388 
\input{Misc/document/pairs.tex}%


389 
Further important examples are quantifiers and sets (see~\S\ref{quantpats}).


390 


391 
%FIXME move stuff below into section on proofs about products?


392 
\begin{warn}


393 
Abstraction over pairs and tuples is merely a convenient shorthand for a


394 
more complex internal representation. Thus the internal and external form


395 
of a term may differ, which can affect proofs. If you want to avoid this


396 
complication, use \isa{fst} and \isa{snd}, i.e.\ write


397 
\isa{\isasymlambda{}p.~fst p + snd p} instead of


398 
\isa{\isasymlambda(x,y).~x + y}. See~\S\ref{proofsaboutproducts} for


399 
theorem proving with tuple patterns.


400 
\end{warn}


401 


402 
Finally note that products, like natural numbers, are datatypes, which means


403 
in particular that \isa{induct_tac} and \isa{case_tac} are applicable to


404 
products (see \S\ref{proofsaboutproducts}).


405 


406 
\section{Definitions}


407 
\label{sec:Definitions}


408 


409 
A definition is simply an abbreviation, i.e.\ a new name for an existing


410 
construction. In particular, definitions cannot be recursive. Isabelle offers


411 
definitions on the level of types and terms. Those on the type level are


412 
called type synonyms, those on the term level are called (constant)


413 
definitions.


414 


415 


416 
\subsection{Type synonyms}

8771

417 
\indexbold{type synonym}

8743

418 


419 
Type synonyms are similar to those found in ML. Their syntax is fairly self


420 
explanatory:


421 


422 
\input{Misc/document/types.tex}%


423 


424 
Note that patternmatching is not allowed, i.e.\ each definition must be of


425 
the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$.


426 
Section~\S\ref{sec:Simplification} explains how definitions are used


427 
in proofs.


428 


429 
\begin{warn}%


430 
A common mistake when writing definitions is to introduce extra free


431 
variables on the righthand side as in the following fictitious definition:


432 
\input{Misc/document/prime_def.tex}%


433 
\end{warn}


434 


435 


436 
\chapter{More Functional Programming}


437 


438 
The purpose of this chapter is to deepen the reader's understanding of the

8771

439 
concepts encountered so far and to introduce advanced forms of datatypes and


440 
recursive functions. The first two sections give a structured presentation of


441 
theorem proving by simplification (\S\ref{sec:Simplification}) and discuss


442 
important heuristics for induction (\S\ref{sec:InductionHeuristics}). They can


443 
be skipped by readers less interested in proofs. They are followed by a case


444 
study, a compiler for expressions (\S\ref{sec:ExprCompiler}). Advanced


445 
datatypes, including those involving function spaces, are covered in


446 
\S\ref{sec:advanceddatatypes}, which closes with another case study, search


447 
trees (``tries''). Finally we introduce \isacommand{recdef}, a very general


448 
form of recursive function definition which goes well beyond what


449 
\isacommand{primrec} allows (\S\ref{sec:recdef}).

8743

450 


451 


452 
\section{Simplification}


453 
\label{sec:Simplification}


454 
\index{simplification(}


455 


456 
So far we have proved our theorems by \isa{auto}, which ``simplifies'' all


457 
subgoals. In fact, \isa{auto} can do much more than that, except that it


458 
did not need to so far. However, when you go beyond toy examples, you need to


459 
understand the ingredients of \isa{auto}. This section covers the method


460 
that \isa{auto} always applies first, namely simplification.


461 


462 
Simplification is one of the central theorem proving tools in Isabelle and


463 
many other systems. The tool itself is called the \bfindex{simplifier}. The


464 
purpose of this section is twofold: to introduce the many features of the


465 
simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the


466 
simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should


467 
read \S\ref{sec:SimpFeatures}, and the serious student should read


468 
\S\ref{sec:SimpHow} as well in order to understand what happened in case


469 
things do not simplify as expected.


470 


471 


472 
\subsection{Using the simplifier}


473 
\label{sec:SimpFeatures}


474 


475 
In its most basic form, simplification means repeated application of


476 
equations from left to right. For example, taking the rules for \isa{\at}


477 
and applying them to the term \isa{[0,1] \at\ []} results in a sequence of


478 
simplification steps:


479 
\begin{ttbox}\makeatother


480 
(0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[]


481 
\end{ttbox}


482 
This is also known as \bfindex{term rewriting} and the equations are referred

8771

483 
to as \bfindex{rewrite rules}. ``Rewriting'' is more honest than ``simplification''

8743

484 
because the terms do not necessarily become simpler in the process.


485 


486 
\subsubsection{Simplification rules}


487 
\indexbold{simplification rules}


488 


489 
To facilitate simplification, theorems can be declared to be simplification


490 
rules (with the help of the attribute \isa{[simp]}\index{*simp


491 
(attribute)}), in which case proofs by simplification make use of these

8771

492 
rules automatically. In addition the constructs \isacommand{datatype} and

8743

493 
\isacommand{primrec} (and a few others) invisibly declare useful


494 
simplification rules. Explicit definitions are \emph{not} declared


495 
simplification rules automatically!


496 


497 
Not merely equations but pretty much any theorem can become a simplification


498 
rule. The simplifier will try to make sense of it. For example, a theorem


499 
\isasymnot$P$ is automatically turned into \isa{$P$ = False}. The details


500 
are explained in \S\ref{sec:SimpHow}.


501 


502 
The simplification attribute of theorems can be turned on and off as follows:


503 
\begin{ttbox}


504 
theorems [simp] = \(list of theorem names\);


505 
theorems [simp del] = \(list of theorem names\);


506 
\end{ttbox}

8771

507 
As a rule of thumb, equations that really simplify (like \isa{rev(rev xs) =

8743

508 
xs} and \mbox{\isa{xs \at\ [] = xs}}) should be made simplification


509 
rules. Those of a more specific nature (e.g.\ distributivity laws, which


510 
alter the structure of terms considerably) should only be used selectively,


511 
i.e.\ they should not be default simplification rules. Conversely, it may


512 
also happen that a simplification rule needs to be disabled in certain


513 
proofs. Frequent changes in the simplification status of a theorem may

8771

514 
indicate a badly designed theory.

8743

515 
\begin{warn}


516 
Simplification may not terminate, for example if both $f(x) = g(x)$ and


517 
$g(x) = f(x)$ are simplification rules. It is the user's responsibility not


518 
to include simplification rules that can lead to nontermination, either on


519 
their own or in combination with other simplification rules.


520 
\end{warn}


521 


522 
\subsubsection{The simplification method}


523 
\index{*simp (method)bold}


524 


525 
The general format of the simplification method is


526 
\begin{ttbox}


527 
simp \(list of modifiers\)


528 
\end{ttbox}


529 
where the list of \emph{modifiers} helps to fine tune the behaviour and may


530 
be empty. Most if not all of the proofs seen so far could have been performed


531 
with \isa{simp} instead of \isa{auto}, except that \isa{simp} attacks


532 
only the first subgoal and may thus need to be repeated.


533 
Note that \isa{simp} fails if nothing changes.


534 


535 
\subsubsection{Adding and deleting simplification rules}


536 


537 
If a certain theorem is merely needed in a few proofs by simplification,


538 
we do not need to make it a global simplification rule. Instead we can modify


539 
the set of simplification rules used in a simplification step by adding rules


540 
to it and/or deleting rules from it. The two modifiers for this are


541 
\begin{ttbox}


542 
add: \(list of theorem names\)


543 
del: \(list of theorem names\)


544 
\end{ttbox}


545 
In case you want to use only a specific list of theorems and ignore all


546 
others:


547 
\begin{ttbox}


548 
only: \(list of theorem names\)


549 
\end{ttbox}


550 


551 


552 
\subsubsection{Assumptions}


553 
\index{simplification!with/of assumptions}


554 


555 
{\makeatother\input{Misc/document/asm_simp.tex}}


556 


557 
\subsubsection{Rewriting with definitions}


558 
\index{simplification!with definitions}


559 


560 
\input{Misc/document/def_rewr.tex}


561 


562 
\begin{warn}


563 
If you have defined $f\,x\,y~\isasymequiv~t$ then you can only expand


564 
occurrences of $f$ with at least two arguments. Thus it is safer to define


565 
$f$~\isasymequiv~\isasymlambda$x\,y.\;t$.


566 
\end{warn}


567 


568 
\subsubsection{Simplifying letexpressions}


569 
\index{simplification!of letexpressions}


570 


571 
Proving a goal containing \isaindex{let}expressions almost invariably


572 
requires the \isa{let}con\structs to be expanded at some point. Since

8771

573 
\isa{let}\isa{in} is just syntactic sugar for a predefined constant (called

8743

574 
\isa{Let}), expanding \isa{let}constructs means rewriting with


575 
\isa{Let_def}:


576 


577 
{\makeatother\input{Misc/document/let_rewr.tex}}


578 


579 
\subsubsection{Conditional equations}


580 


581 
\input{Misc/document/cond_rewr.tex}


582 


583 


584 
\subsubsection{Automatic case splits}


585 
\indexbold{case splits}


586 
\index{*split(}


587 


588 
{\makeatother\input{Misc/document/case_splits.tex}}


589 


590 
\index{*split)}


591 


592 
\begin{warn}


593 
The simplifier merely simplifies the condition of an \isa{if} but not the


594 
\isa{then} or \isa{else} parts. The latter are simplified only after the


595 
condition reduces to \isa{True} or \isa{False}, or after splitting. The


596 
same is true for \isaindex{case}expressions: only the selector is


597 
simplified at first, until either the expression reduces to one of the


598 
cases or it is split.


599 
\end{warn}


600 


601 
\subsubsection{Arithmetic}


602 
\index{arithmetic}


603 


604 
The simplifier routinely solves a small class of linear arithmetic formulae

8771

605 
(over type \isa{nat} and other numeric types): it only takes into account

8743

606 
assumptions and conclusions that are (possibly negated) (in)equalities


607 
(\isa{=}, \isasymle, \isa{<}) and it only knows about addition. Thus


608 


609 
\input{Misc/document/arith1.tex}%


610 
is proved by simplification, whereas the only slightly more complex


611 


612 
\input{Misc/document/arith4.tex}%


613 
is not proved by simplification and requires \isa{arith}.


614 


615 
\subsubsection{Permutative rewrite rules}


616 


617 
A rewrite rule is {\bf permutative} if the lefthand side and righthand side


618 
are the same up to renaming of variables. The most common permutative rule


619 
is commutativity: $x+y = y+x$. Another example is $(xy)z = (xz)y$. Such


620 
rules are problematic because once they apply, they can be used forever.


621 
The simplifier is aware of this danger and treats permutative rules


622 
separately. For details see~\cite{isabelleref}.


623 


624 
\subsubsection{Tracing}


625 
\indexbold{tracing the simplifier}


626 


627 
{\makeatother\input{Misc/document/trace_simp.tex}}


628 


629 
\subsection{How it works}


630 
\label{sec:SimpHow}


631 


632 
\subsubsection{Higherorder patterns}


633 


634 
\subsubsection{Local assumptions}


635 


636 
\subsubsection{The preprocessor}


637 


638 
\index{simplification)}


639 


640 
\section{Induction heuristics}


641 
\label{sec:InductionHeuristics}


642 


643 
The purpose of this section is to illustrate some simple heuristics for


644 
inductive proofs. The first one we have already mentioned in our initial


645 
example:


646 
\begin{quote}


647 
{\em 1. Theorems about recursive functions are proved by induction.}


648 
\end{quote}


649 
In case the function has more than one argument


650 
\begin{quote}


651 
{\em 2. Do induction on argument number $i$ if the function is defined by


652 
recursion in argument number $i$.}


653 
\end{quote}


654 
When we look at the proof of {\makeatother\isa{(xs @ ys) @ zs = xs @ (ys @


655 
zs)}} in \S\ref{sec:introproof} we find (a) \isa{\at} is recursive in


656 
the first argument, (b) \isa{xs} occurs only as the first argument of


657 
\isa{\at}, and (c) both \isa{ys} and \isa{zs} occur at least once as


658 
the second argument of \isa{\at}. Hence it is natural to perform induction


659 
on \isa{xs}.


660 


661 
The key heuristic, and the main point of this section, is to


662 
generalize the goal before induction. The reason is simple: if the goal is


663 
too specific, the induction hypothesis is too weak to allow the induction


664 
step to go through. Let us now illustrate the idea with an example.


665 


666 
{\makeatother\input{Misc/document/Itrev.tex}}


667 


668 
A final point worth mentioning is the orientation of the equation we just


669 
proved: the more complex notion (\isa{itrev}) is on the lefthand


670 
side, the simpler \isa{rev} on the righthand side. This constitutes


671 
another, albeit weak heuristic that is not restricted to induction:


672 
\begin{quote}


673 
{\em 5. The righthand side of an equation should (in some sense) be


674 
simpler than the lefthand side.}


675 
\end{quote}


676 
The heuristic is tricky to apply because it is not obvious that


677 
\isa{rev xs \at\ ys} is simpler than \isa{itrev xs ys}. But see what

8771

678 
happens if you try to prove \isa{rev xs \at\ ys = itrev xs ys}!

8743

679 


680 


681 
\section{Case study: compiling expressions}


682 
\label{sec:ExprCompiler}


683 


684 
{\makeatother\input{CodeGen/document/CodeGen.tex}}


685 


686 


687 
\section{Advanced datatypes}


688 
\label{sec:advanceddatatypes}


689 
\index{*datatype(}


690 
\index{*primrec(}


691 
%)


692 


693 
This section presents advanced forms of \isacommand{datatype}s.


694 


695 
\subsection{Mutual recursion}


696 
\label{sec:datatypemutrec}


697 


698 
\input{Datatype/document/ABexpr.tex}


699 


700 
\subsection{Nested recursion}


701 


702 
\input{Datatype/document/Nested.tex}


703 


704 


705 
\subsection{The limits of nested recursion}


706 


707 
How far can we push nested recursion? By the unfolding argument above, we can


708 
reduce nested to mutual recursion provided the nested recursion only involves


709 
previously defined datatypes. This does not include functions:


710 
\begin{ttbox}


711 
datatype t = C (t => bool)


712 
\end{ttbox}


713 
is a real can of worms: in HOL it must be ruled out because it requires a type


714 
\isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the


715 
same cardinalityan impossibility. For the same reason it is not possible


716 
to allow recursion involving the type \isa{set}, which is isomorphic to


717 
\isa{t \isasymFun\ bool}.


718 


719 
Fortunately, a limited form of recursion


720 
involving function spaces is permitted: the recursive type may occur on the


721 
right of a function arrow, but never on the left. Hence the above can of worms


722 
is ruled out but the following example of a potentially infinitely branching tree is


723 
accepted:

8771

724 
\smallskip

8743

725 


726 
\input{Datatype/document/Fundata.tex}


727 
\bigskip


728 


729 
If you need nested recursion on the left of a function arrow,


730 
there are alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like


731 
\begin{ttbox}


732 
datatype lam = C (lam > lam)


733 
\end{ttbox}

8771

734 
do indeed make sense (note the intentionally different arrow \isa{>}).

8743

735 
There is even a version of LCF on top of HOL, called


736 
HOLCF~\cite{MuellerNvOS99}.


737 


738 
\index{*primrec)}


739 
\index{*datatype)}


740 


741 
\subsection{Case study: Tries}


742 


743 
Tries are a classic search tree data structure~\cite{Knuth375} for fast


744 
indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a


745 
trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and


746 
``cat''. When searching a string in a trie, the letters of the string are


747 
examined sequentially. Each letter determines which subtrie to search next.


748 
In this case study we model tries as a datatype, define a lookup and an


749 
update function, and prove that they behave as expected.


750 


751 
\begin{figure}[htbp]


752 
\begin{center}


753 
\unitlength1mm


754 
\begin{picture}(60,30)


755 
\put( 5, 0){\makebox(0,0)[b]{l}}


756 
\put(25, 0){\makebox(0,0)[b]{e}}


757 
\put(35, 0){\makebox(0,0)[b]{n}}


758 
\put(45, 0){\makebox(0,0)[b]{r}}


759 
\put(55, 0){\makebox(0,0)[b]{t}}


760 
%


761 
\put( 5, 9){\line(0,1){5}}


762 
\put(25, 9){\line(0,1){5}}


763 
\put(44, 9){\line(3,2){9}}


764 
\put(45, 9){\line(0,1){5}}


765 
\put(46, 9){\line(3,2){9}}


766 
%


767 
\put( 5,10){\makebox(0,0)[b]{l}}


768 
\put(15,10){\makebox(0,0)[b]{n}}


769 
\put(25,10){\makebox(0,0)[b]{p}}


770 
\put(45,10){\makebox(0,0)[b]{a}}


771 
%


772 
\put(14,19){\line(3,2){9}}


773 
\put(15,19){\line(0,1){5}}


774 
\put(16,19){\line(3,2){9}}


775 
\put(45,19){\line(0,1){5}}


776 
%


777 
\put(15,20){\makebox(0,0)[b]{a}}


778 
\put(45,20){\makebox(0,0)[b]{c}}


779 
%


780 
\put(30,30){\line(3,2){13}}


781 
\put(30,30){\line(3,2){13}}


782 
\end{picture}


783 
\end{center}


784 
\caption{A sample trie}


785 
\label{fig:trie}


786 
\end{figure}


787 


788 
Proper tries associate some value with each string. Since the


789 
information is stored only in the final node associated with the string, many


790 
nodes do not carry any value. This distinction is captured by the

8771

791 
following predefined datatype (from theory \isa{Option}, which is a parent


792 
of \isa{Main}):


793 
\smallskip

8743

794 
\input{Trie/document/Option2.tex}

8771

795 
\indexbold{*option}\indexbold{*None}\indexbold{*Some}%

8743

796 
\input{Trie/document/Trie.tex}


797 


798 
\begin{exercise}


799 
Write an improved version of \isa{update} that does not suffer from the


800 
space leak in the version above. Prove the main theorem for your improved


801 
\isa{update}.


802 
\end{exercise}


803 


804 
\begin{exercise}


805 
Write a function to \emph{delete} entries from a trie. An easy solution is


806 
a clever modification of \isa{update} which allows both insertion and


807 
deletion with a single function. Prove (a modified version of) the main


808 
theorem above. Optimize you function such that it shrinks tries after


809 
deletion, if possible.


810 
\end{exercise}


811 


812 
\section{Total recursive functions}


813 
\label{sec:recdef}


814 
\index{*recdef(}


815 


816 
Although many total functions have a natural primitive recursive definition,


817 
this is not always the case. Arbitrary total recursive functions can be


818 
defined by means of \isacommand{recdef}: you can use full patternmatching,


819 
recursion need not involve datatypes, and termination is proved by showing


820 
that the arguments of all recursive calls are smaller in a suitable (user


821 
supplied) sense.


822 


823 
\subsection{Defining recursive functions}


824 


825 
\input{Recdef/document/examples.tex}


826 


827 
\subsection{Proving termination}


828 


829 
\input{Recdef/document/termination.tex}


830 


831 
\subsection{Simplification with recdef}


832 


833 
\input{Recdef/document/simplification.tex}


834 


835 


836 
\subsection{Induction}


837 
\index{induction!recursion(}


838 
\index{recursion induction(}


839 


840 
\input{Recdef/document/Induction.tex}


841 


842 
\index{induction!recursion)}


843 
\index{recursion induction)}


844 
\index{*recdef)}
