8743

1 
\chapter{Functional Programming in HOL}


2 


3 
Although on the surface this chapter is mainly concerned with how to write


4 
functional programs in HOL and how to verify them, most of the


5 
constructs and proof procedures introduced are general purpose and recur in


6 
any specification or verification task.


7 


8 
The dedicated functional programmer should be warned: HOL offers only {\em


9 
total functional programming}  all functions in HOL must be total; lazy


10 
data structures are not directly available. On the positive side, functions


11 
in HOL need not be computable: HOL is a specification language that goes well


12 
beyond what can be expressed as a program. However, for the time being we


13 
concentrate on the computable.


14 


15 
\section{An introductory theory}


16 
\label{sec:introtheory}


17 


18 
Functional programming needs datatypes and functions. Both of them can be


19 
defined in a theory with a syntax reminiscent of languages like ML or


20 
Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.


21 
We will now examine it line by line.


22 


23 
\begin{figure}[htbp]


24 
\begin{ttbox}\makeatother


25 
\input{ToyList2/ToyList1}\end{ttbox}


26 
\caption{A theory of lists}


27 
\label{fig:ToyList}


28 
\end{figure}


29 


30 
{\makeatother\input{ToyList/document/ToyList.tex}}


31 


32 
The complete proof script is shown in figure~\ref{fig:ToyListproofs}. The


33 
concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyListproofs}


34 
constitutes the complete theory \texttt{ToyList} and should reside in file


35 
\texttt{ToyList.thy}. It is good practice to present all declarations and


36 
definitions at the beginning of a theory to facilitate browsing.


37 


38 
\begin{figure}[htbp]


39 
\begin{ttbox}\makeatother


40 
\input{ToyList2/ToyList2}\end{ttbox}


41 
\caption{Proofs about lists}


42 
\label{fig:ToyListproofs}


43 
\end{figure}


44 


45 
\subsubsection*{Review}


46 


47 
This is the end of our toy proof. It should have familiarized you with


48 
\begin{itemize}


49 
\item the standard theorem proving procedure:


50 
state a goal (lemma or theorem); proceed with proof until a separate lemma is


51 
required; prove that lemma; come back to the original goal.


52 
\item a specific procedure that works well for functional programs:


53 
induction followed by allout simplification via \isa{auto}.


54 
\item a basic repertoire of proof commands.


55 
\end{itemize}


56 


57 


58 
\section{Some helpful commands}


59 
\label{sec:commandsandhints}


60 


61 
This section discusses a few basic commands for manipulating the proof state


62 
and can be skipped by casual readers.


63 


64 
There are two kinds of commands used during a proof: the actual proof


65 
commands and auxiliary commands for examining the proof state and controlling


66 
the display. Simple proof commands are of the form


67 
\isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a


68 
synonym for ``theorem proving function''. Typical examples are


69 
\isa{induct_tac} and \isa{auto}. Further methods are introduced throughout


70 
the tutorial. Unless stated otherwise you may assume that a method attacks


71 
merely the first subgoal. An exception is \isa{auto} which tries to solve all


72 
subgoals.


73 


74 
The most useful auxiliary commands are:


75 
\begin{description}


76 
\item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the


77 
last command; \isacommand{undo} can be undone by


78 
\isacommand{redo}\indexbold{*redo}. Both are only needed at the shell


79 
level and should not occur in the final theory.


80 
\item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays


81 
the current proof state, for example when it has disappeared off the


82 
screen.


83 
\item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to


84 
print only the first $n$ subgoals from now on and redisplays the current


85 
proof state. This is helpful when there are many subgoals.


86 
\item[Modifying the order of subgoals:]


87 
\isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and


88 
\isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.


89 
\item[Printing theorems:]


90 
\isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$


91 
prints the named theorems.


92 
\item[Displaying types:] We have already mentioned the flag


93 
\ttindex{show_types} above. It can also be useful for detecting typos in


94 
formulae early on. For example, if \texttt{show_types} is set and the goal


95 
\isa{rev(rev xs) = xs} is started, Isabelle prints the additional output


96 
\par\noindent


97 
\begin{isabelle}%


98 
Variables:\isanewline


99 
~~xs~::~'a~list


100 
\end{isabelle}%


101 
\par\noindent


102 
which tells us that Isabelle has correctly inferred that


103 
\isa{xs} is a variable of list type. On the other hand, had we


104 
made a typo as in \isa{rev(re xs) = xs}, the response


105 
\par\noindent


106 
\begin{isabelle}%


107 
Variables:\isanewline


108 
~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline


109 
~~xs~::~'a~list%


110 
\end{isabelle}%


111 
\par\noindent


112 
would have alerted us because of the unexpected variable \isa{re}.


113 
\item[Reading terms and types:] \isacommand{term}\indexbold{*term}


114 
\textit{string} reads, typechecks and prints the given string as a term in


115 
the current context; the inferred type is output as well.


116 
\isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given


117 
string as a type in the current context.


118 
\item[(Re)loading theories:] When you start your interaction you must open a

8771

119 
named theory with the line \isa{\isacommand{theory}~T~=~\dots~:}. Isabelle


120 
automatically loads all the required parent theories from their respective


121 
files (which may take a moment, unless the theories are already loaded and


122 
the files have not been modified). The only time when you need to load a


123 
theory by hand is when you simply want to check if it loads successfully


124 
without wanting to make use of the theory itself. This you can do by typing


125 
\isa{\isacommand{use\_thy}\indexbold{*use_thy}~"T"}.

8743

126 


127 
If you suddenly discover that you need to modify a parent theory of your


128 
current theory you must first abandon your current theory (at the shell


129 
level type \isacommand{kill}\indexbold{*kill}). After the parent theory has


130 
been modified you go back to your original theory. When its first line


131 
\isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the


132 
modified parent is reloaded automatically.


133 
\end{description}


134 
Further commands are found in the Isabelle/Isar Reference Manual.


135 

8771

136 
We now examine Isabelle's functional programming constructs systematically,


137 
starting with inductive datatypes.


138 

8743

139 


140 
\section{Datatypes}


141 
\label{sec:datatype}


142 


143 
Inductive datatypes are part of almost every nontrivial application of HOL.


144 
First we take another look at a very important example, the datatype of


145 
lists, before we turn to datatypes in general. The section closes with a


146 
case study.


147 


148 


149 
\subsection{Lists}


150 
\indexbold{*list}


151 


152 
Lists are one of the essential datatypes in computing. Readers of this


153 
tutorial and users of HOL need to be familiar with their basic operations.

8771

154 
Theory \isa{ToyList} is only a small fragment of HOL's predefined theory


155 
\isa{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.

8743

156 
The latter contains many further operations. For example, the functions

8771

157 
\isaindexbold{hd} (``head'') and \isaindexbold{tl} (``tail'') return the first

8743

158 
element and the remainder of a list. (However, patternmatching is usually

8771

159 
preferable to \isa{hd} and \isa{tl}.) Theory \isa{List} also contains

8743

160 
more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates


161 
$x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we


162 
always use HOL's predefined lists.


163 


164 


165 
\subsection{The general format}


166 
\label{sec:generaldatatype}


167 


168 
The general HOL \isacommand{datatype} definition is of the form


169 
\[


170 
\isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~


171 
C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~


172 
C@m~\tau@{m1}~\dots~\tau@{mk@m}


173 
\]

8771

174 
where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct

8743

175 
constructor names and $\tau@{ij}$ are types; it is customary to capitalize


176 
the first letter in constructor names. There are a number of


177 
restrictions (such as that the type should not be empty) detailed


178 
elsewhere~\cite{isabelleHOL}. Isabelle notifies you if you violate them.


179 


180 
Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and


181 
\isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically


182 
during proofs by simplification. The same is true for the equations in


183 
primitive recursive function definitions.


184 


185 
\subsection{Primitive recursion}


186 


187 
Functions on datatypes are usually defined by recursion. In fact, most of the


188 
time they are defined by what is called \bfindex{primitive recursion}.


189 
The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of


190 
equations


191 
\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]


192 
such that $C$ is a constructor of the datatype $t$ and all recursive calls of


193 
$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus


194 
Isabelle immediately sees that $f$ terminates because one (fixed!) argument


195 
becomes smaller with every recursive call. There must be exactly one equation


196 
for each constructor. Their order is immaterial.

8771

197 
A more general method for defining total recursive functions is introduced in

8743

198 
\S\ref{sec:recdef}.


199 

9493

200 
\begin{exercise}\label{ex:Tree}

8743

201 
\input{Misc/document/Tree.tex}%


202 
\end{exercise}


203 


204 
\subsection{Case expressions}


205 
\label{sec:caseexpressions}


206 


207 
HOL also features \isaindexbold{case}expressions for analyzing


208 
elements of a datatype. For example,


209 
% case xs of [] => 0  y#ys => y


210 
\begin{isabellepar}%


211 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~y~\#~ys~{\isasymRightarrow}~y


212 
\end{isabellepar}%


213 
evaluates to \isa{0} if \isa{xs} is \isa{[]} and to \isa{y} if


214 
\isa{xs} is \isa{y\#ys}. (Since the result in both branches must be of


215 
the same type, it follows that \isa{y::nat} and hence


216 
\isa{xs::(nat)list}.)


217 


218 
In general, if $e$ is a term of the datatype $t$ defined in


219 
\S\ref{sec:generaldatatype} above, the corresponding


220 
\isa{case}expression analyzing $e$ is


221 
\[


222 
\begin{array}{rrcl}


223 
\isa{case}~e~\isa{of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\


224 
\vdots \\


225 
\mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m


226 
\end{array}


227 
\]


228 


229 
\begin{warn}


230 
{\em All} constructors must be present, their order is fixed, and nested


231 
patterns are not supported. Violating these restrictions results in strange


232 
error messages.


233 
\end{warn}


234 
\noindent


235 
Nested patterns can be simulated by nested \isa{case}expressions: instead


236 
of


237 
% case xs of [] => 0  [x] => x  x#(y#zs) => y


238 
\begin{isabellepar}%


239 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~[x]~{\isasymRightarrow}~x~~x~\#~y~\#~zs~{\isasymRightarrow}~y


240 
\end{isabellepar}%


241 
write


242 
% term(latex xsymbols symbols)"case xs of [] => 0  x#ys => (case ys of [] => x  y#zs => y)";


243 
\begin{isabellepar}%


244 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~x~\#~ys~{\isasymRightarrow}~(case~ys~of~[]~{\isasymRightarrow}~x~~y~\#~zs~{\isasymRightarrow}~y)%


245 
\end{isabellepar}%


246 
Note that \isa{case}expressions should be enclosed in parentheses to


247 
indicate their scope.


248 


249 
\subsection{Structural induction and case distinction}


250 
\indexbold{structural induction}


251 
\indexbold{induction!structural}


252 
\indexbold{case distinction}


253 


254 
Almost all the basic laws about a datatype are applied automatically during


255 
simplification. Only induction is invoked by hand via \isaindex{induct_tac},


256 
which works for any datatype. In some cases, induction is overkill and a case


257 
distinction over all constructors of the datatype suffices. This is performed


258 
by \isaindexbold{case_tac}. A trivial example:


259 


260 
\input{Misc/document/cases.tex}%


261 


262 
Note that we do not need to give a lemma a name if we do not intend to refer


263 
to it explicitly in the future.


264 


265 
\begin{warn}


266 
Induction is only allowed on free (or \isasymAndbound) variables that


267 
should not occur among the assumptions of the subgoal. Case distinction


268 
(\isa{case_tac}) works for arbitrary terms, which need to be


269 
quoted if they are nonatomic.


270 
\end{warn}


271 


272 


273 
\subsection{Case study: boolean expressions}


274 
\label{sec:boolex}


275 


276 
The aim of this case study is twofold: it shows how to model boolean


277 
expressions and some algorithms for manipulating them, and it demonstrates


278 
the constructs introduced above.


279 


280 
\input{Ifexpr/document/Ifexpr.tex}

8771

281 
\medskip

8743

282 


283 
How does one come up with the required lemmas? Try to prove the main theorems


284 
without them and study carefully what \isa{auto} leaves unproved. This has


285 
to provide the clue. The necessity of universal quantification


286 
(\isa{\isasymforall{}t e}) in the two lemmas is explained in


287 
\S\ref{sec:InductionHeuristics}


288 


289 
\begin{exercise}


290 
We strengthen the definition of a \isa{normal} Ifexpression as follows:


291 
the first argument of all \isa{IF}s must be a variable. Adapt the above


292 
development to this changed requirement. (Hint: you may need to formulate


293 
some of the goals as implications (\isasymimp) rather than equalities


294 
(\isa{=}).)


295 
\end{exercise}


296 


297 
\section{Some basic types}


298 


299 


300 
\subsection{Natural numbers}


301 
\index{arithmetic(}


302 


303 
\input{Misc/document/fakenat.tex}


304 
\input{Misc/document/natsum.tex}


305 


306 
The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun},


307 
\ttindexboldpos{}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},


308 
\isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and


309 
\isaindexbold{max} are predefined, as are the relations


310 
\indexboldpos{\isasymle}{$HOL2arithrel} and


311 
\ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation


312 
\isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although


313 
Isabelle does not prove this completely automatically. Note that \isa{1} and


314 
\isa{2} are available as abbreviations for the corresponding


315 
\isa{Suc}expressions. If you need the full set of numerals,


316 
see~\S\ref{natnumerals}.


317 


318 
\begin{warn}


319 
The operations \ttindexboldpos{+}{$HOL2arithfun},


320 
\ttindexboldpos{}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},


321 
\isaindexbold{min}, \isaindexbold{max},


322 
\indexboldpos{\isasymle}{$HOL2arithrel} and


323 
\ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available


324 
not just for natural numbers but at other types as well (see


325 
\S\ref{sec:TypeClasses}). For example, given the goal \isa{x+y = y+x},


326 
there is nothing to indicate that you are talking about natural numbers.


327 
Hence Isabelle can only infer that \isa{x} and \isa{y} are of some


328 
arbitrary type where \isa{+} is declared. As a consequence, you will be


329 
unable to prove the goal (although it may take you some time to realize


330 
what has happened if \texttt{show_types} is not set). In this particular


331 
example, you need to include an explicit type constraint, for example


332 
\isa{x+y = y+(x::nat)}. If there is enough contextual information this may


333 
not be necessary: \isa{x+0 = x} automatically implies \isa{x::nat}.


334 
\end{warn}


335 


336 
Simple arithmetic goals are proved automatically by both \isa{auto}


337 
and the simplification methods introduced in \S\ref{sec:Simplification}. For


338 
example,


339 


340 
\input{Misc/document/arith1.tex}%


341 
is proved automatically. The main restriction is that only addition is taken


342 
into account; other arithmetic operations and quantified formulae are ignored.


343 


344 
For more complex goals, there is the special method


345 
\isaindexbold{arith} (which attacks the first subgoal). It proves


346 
arithmetic goals involving the usual logical connectives (\isasymnot,


347 
\isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and


348 
the operations \isa{+}, \isa{}, \isa{min} and \isa{max}. For example,


349 


350 
\input{Misc/document/arith2.tex}%


351 
succeeds because \isa{k*k} can be treated as atomic.


352 
In contrast,


353 


354 
\input{Misc/document/arith3.tex}%


355 
is not even proved by \isa{arith} because the proof relies essentially


356 
on properties of multiplication.


357 


358 
\begin{warn}


359 
The running time of \isa{arith} is exponential in the number of occurrences


360 
of \ttindexboldpos{}{$HOL2arithfun}, \isaindexbold{min} and


361 
\isaindexbold{max} because they are first eliminated by case distinctions.


362 


363 
\isa{arith} is incomplete even for the restricted class of formulae


364 
described above (known as ``linear arithmetic''). If divisibility plays a


365 
role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$.


366 
Fortunately, such examples are rare in practice.


367 
\end{warn}


368 


369 
\index{arithmetic)}


370 


371 


372 
\subsection{Products}


373 

8771

374 
HOL also has pairs: \isa{($a@1$,$a@2$)} is of type \isa{$\tau@1$ *

8743

375 
$\tau@2$} provided each $a@i$ is of type $\tau@i$. The components of a pair


376 
are extracted by \isa{fst} and \isa{snd}: \isa{fst($x$,$y$) = $x$} and


377 
\isa{snd($x$,$y$) = $y$}. Tuples are simulated by pairs nested to the right:


378 
\isa{($a@1$,$a@2$,$a@3$)} stands for \isa{($a@1$,($a@2$,$a@3$))} and


379 
\isa{$\tau@1$ * $\tau@2$ * $\tau@3$} for \isa{$\tau@1$ * ($\tau@2$ *


380 
$\tau@3$)}. Therefore we have \isa{fst(snd($a@1$,$a@2$,$a@3$)) = $a@2$}.


381 


382 
It is possible to use (nested) tuples as patterns in abstractions, for


383 
example \isa{\isasymlambda(x,y,z).x+y+z} and


384 
\isa{\isasymlambda((x,y),z).x+y+z}.


385 
In addition to explicit $\lambda$abstractions, tuple patterns can be used in


386 
most variable binding constructs. Typical examples are


387 


388 
\input{Misc/document/pairs.tex}%


389 
Further important examples are quantifiers and sets (see~\S\ref{quantpats}).


390 


391 
%FIXME move stuff below into section on proofs about products?


392 
\begin{warn}


393 
Abstraction over pairs and tuples is merely a convenient shorthand for a


394 
more complex internal representation. Thus the internal and external form


395 
of a term may differ, which can affect proofs. If you want to avoid this


396 
complication, use \isa{fst} and \isa{snd}, i.e.\ write


397 
\isa{\isasymlambda{}p.~fst p + snd p} instead of


398 
\isa{\isasymlambda(x,y).~x + y}. See~\S\ref{proofsaboutproducts} for


399 
theorem proving with tuple patterns.


400 
\end{warn}


401 


402 
Finally note that products, like natural numbers, are datatypes, which means


403 
in particular that \isa{induct_tac} and \isa{case_tac} are applicable to


404 
products (see \S\ref{proofsaboutproducts}).


405 


406 
\section{Definitions}


407 
\label{sec:Definitions}


408 


409 
A definition is simply an abbreviation, i.e.\ a new name for an existing


410 
construction. In particular, definitions cannot be recursive. Isabelle offers


411 
definitions on the level of types and terms. Those on the type level are


412 
called type synonyms, those on the term level are called (constant)


413 
definitions.


414 


415 


416 
\subsection{Type synonyms}

8771

417 
\indexbold{type synonym}

8743

418 


419 
Type synonyms are similar to those found in ML. Their syntax is fairly self


420 
explanatory:


421 


422 
\input{Misc/document/types.tex}%


423 


424 
Note that patternmatching is not allowed, i.e.\ each definition must be of


425 
the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$.


426 
Section~\S\ref{sec:Simplification} explains how definitions are used


427 
in proofs.


428 


429 
\begin{warn}%


430 
A common mistake when writing definitions is to introduce extra free


431 
variables on the righthand side as in the following fictitious definition:


432 
\input{Misc/document/prime_def.tex}%


433 
\end{warn}


434 


435 


436 
\chapter{More Functional Programming}


437 


438 
The purpose of this chapter is to deepen the reader's understanding of the

8771

439 
concepts encountered so far and to introduce advanced forms of datatypes and


440 
recursive functions. The first two sections give a structured presentation of


441 
theorem proving by simplification (\S\ref{sec:Simplification}) and discuss


442 
important heuristics for induction (\S\ref{sec:InductionHeuristics}). They can


443 
be skipped by readers less interested in proofs. They are followed by a case


444 
study, a compiler for expressions (\S\ref{sec:ExprCompiler}). Advanced


445 
datatypes, including those involving function spaces, are covered in


446 
\S\ref{sec:advanceddatatypes}, which closes with another case study, search


447 
trees (``tries''). Finally we introduce \isacommand{recdef}, a very general


448 
form of recursive function definition which goes well beyond what


449 
\isacommand{primrec} allows (\S\ref{sec:recdef}).

8743

450 


451 


452 
\section{Simplification}


453 
\label{sec:Simplification}


454 
\index{simplification(}


455 


456 
So far we have proved our theorems by \isa{auto}, which ``simplifies'' all


457 
subgoals. In fact, \isa{auto} can do much more than that, except that it


458 
did not need to so far. However, when you go beyond toy examples, you need to


459 
understand the ingredients of \isa{auto}. This section covers the method


460 
that \isa{auto} always applies first, namely simplification.


461 


462 
Simplification is one of the central theorem proving tools in Isabelle and


463 
many other systems. The tool itself is called the \bfindex{simplifier}. The


464 
purpose of this section is twofold: to introduce the many features of the


465 
simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the


466 
simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should


467 
read \S\ref{sec:SimpFeatures}, and the serious student should read


468 
\S\ref{sec:SimpHow} as well in order to understand what happened in case


469 
things do not simplify as expected.


470 


471 


472 
\subsection{Using the simplifier}


473 
\label{sec:SimpFeatures}


474 

9458

475 
\subsubsection{What is simplification}


476 

8743

477 
In its most basic form, simplification means repeated application of


478 
equations from left to right. For example, taking the rules for \isa{\at}


479 
and applying them to the term \isa{[0,1] \at\ []} results in a sequence of


480 
simplification steps:


481 
\begin{ttbox}\makeatother


482 
(0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[]


483 
\end{ttbox}


484 
This is also known as \bfindex{term rewriting} and the equations are referred

8771

485 
to as \bfindex{rewrite rules}. ``Rewriting'' is more honest than ``simplification''

8743

486 
because the terms do not necessarily become simpler in the process.


487 


488 
\subsubsection{Simplification rules}


489 
\indexbold{simplification rules}


490 


491 
To facilitate simplification, theorems can be declared to be simplification


492 
rules (with the help of the attribute \isa{[simp]}\index{*simp


493 
(attribute)}), in which case proofs by simplification make use of these

8771

494 
rules automatically. In addition the constructs \isacommand{datatype} and

8743

495 
\isacommand{primrec} (and a few others) invisibly declare useful


496 
simplification rules. Explicit definitions are \emph{not} declared


497 
simplification rules automatically!


498 


499 
Not merely equations but pretty much any theorem can become a simplification


500 
rule. The simplifier will try to make sense of it. For example, a theorem


501 
\isasymnot$P$ is automatically turned into \isa{$P$ = False}. The details


502 
are explained in \S\ref{sec:SimpHow}.


503 


504 
The simplification attribute of theorems can be turned on and off as follows:


505 
\begin{ttbox}


506 
theorems [simp] = \(list of theorem names\);


507 
theorems [simp del] = \(list of theorem names\);


508 
\end{ttbox}

8771

509 
As a rule of thumb, equations that really simplify (like \isa{rev(rev xs) =

8743

510 
xs} and \mbox{\isa{xs \at\ [] = xs}}) should be made simplification


511 
rules. Those of a more specific nature (e.g.\ distributivity laws, which


512 
alter the structure of terms considerably) should only be used selectively,


513 
i.e.\ they should not be default simplification rules. Conversely, it may


514 
also happen that a simplification rule needs to be disabled in certain


515 
proofs. Frequent changes in the simplification status of a theorem may

8771

516 
indicate a badly designed theory.

8743

517 
\begin{warn}


518 
Simplification may not terminate, for example if both $f(x) = g(x)$ and


519 
$g(x) = f(x)$ are simplification rules. It is the user's responsibility not


520 
to include simplification rules that can lead to nontermination, either on


521 
their own or in combination with other simplification rules.


522 
\end{warn}


523 


524 
\subsubsection{The simplification method}


525 
\index{*simp (method)bold}


526 


527 
The general format of the simplification method is


528 
\begin{ttbox}


529 
simp \(list of modifiers\)


530 
\end{ttbox}


531 
where the list of \emph{modifiers} helps to fine tune the behaviour and may


532 
be empty. Most if not all of the proofs seen so far could have been performed


533 
with \isa{simp} instead of \isa{auto}, except that \isa{simp} attacks


534 
only the first subgoal and may thus need to be repeated.


535 
Note that \isa{simp} fails if nothing changes.


536 


537 
\subsubsection{Adding and deleting simplification rules}


538 


539 
If a certain theorem is merely needed in a few proofs by simplification,


540 
we do not need to make it a global simplification rule. Instead we can modify


541 
the set of simplification rules used in a simplification step by adding rules


542 
to it and/or deleting rules from it. The two modifiers for this are


543 
\begin{ttbox}


544 
add: \(list of theorem names\)


545 
del: \(list of theorem names\)


546 
\end{ttbox}


547 
In case you want to use only a specific list of theorems and ignore all


548 
others:


549 
\begin{ttbox}


550 
only: \(list of theorem names\)


551 
\end{ttbox}


552 


553 


554 
\subsubsection{Assumptions}


555 
\index{simplification!with/of assumptions}


556 


557 
{\makeatother\input{Misc/document/asm_simp.tex}}


558 


559 
\subsubsection{Rewriting with definitions}


560 
\index{simplification!with definitions}


561 


562 
\input{Misc/document/def_rewr.tex}


563 


564 
\begin{warn}


565 
If you have defined $f\,x\,y~\isasymequiv~t$ then you can only expand


566 
occurrences of $f$ with at least two arguments. Thus it is safer to define


567 
$f$~\isasymequiv~\isasymlambda$x\,y.\;t$.


568 
\end{warn}


569 


570 
\subsubsection{Simplifying letexpressions}


571 
\index{simplification!of letexpressions}


572 


573 
Proving a goal containing \isaindex{let}expressions almost invariably


574 
requires the \isa{let}con\structs to be expanded at some point. Since

8771

575 
\isa{let}\isa{in} is just syntactic sugar for a predefined constant (called

8743

576 
\isa{Let}), expanding \isa{let}constructs means rewriting with


577 
\isa{Let_def}:


578 


579 
{\makeatother\input{Misc/document/let_rewr.tex}}


580 


581 
\subsubsection{Conditional equations}


582 


583 
\input{Misc/document/cond_rewr.tex}


584 


585 


586 
\subsubsection{Automatic case splits}


587 
\indexbold{case splits}


588 
\index{*split(}


589 


590 
{\makeatother\input{Misc/document/case_splits.tex}}


591 


592 
\index{*split)}


593 


594 
\begin{warn}


595 
The simplifier merely simplifies the condition of an \isa{if} but not the


596 
\isa{then} or \isa{else} parts. The latter are simplified only after the


597 
condition reduces to \isa{True} or \isa{False}, or after splitting. The


598 
same is true for \isaindex{case}expressions: only the selector is


599 
simplified at first, until either the expression reduces to one of the


600 
cases or it is split.


601 
\end{warn}


602 


603 
\subsubsection{Arithmetic}


604 
\index{arithmetic}


605 


606 
The simplifier routinely solves a small class of linear arithmetic formulae

8771

607 
(over type \isa{nat} and other numeric types): it only takes into account

8743

608 
assumptions and conclusions that are (possibly negated) (in)equalities


609 
(\isa{=}, \isasymle, \isa{<}) and it only knows about addition. Thus


610 


611 
\input{Misc/document/arith1.tex}%


612 
is proved by simplification, whereas the only slightly more complex


613 


614 
\input{Misc/document/arith4.tex}%


615 
is not proved by simplification and requires \isa{arith}.


616 


617 
\subsubsection{Permutative rewrite rules}


618 


619 
A rewrite rule is {\bf permutative} if the lefthand side and righthand side


620 
are the same up to renaming of variables. The most common permutative rule


621 
is commutativity: $x+y = y+x$. Another example is $(xy)z = (xz)y$. Such


622 
rules are problematic because once they apply, they can be used forever.


623 
The simplifier is aware of this danger and treats permutative rules


624 
separately. For details see~\cite{isabelleref}.


625 


626 
\subsubsection{Tracing}


627 
\indexbold{tracing the simplifier}


628 


629 
{\makeatother\input{Misc/document/trace_simp.tex}}


630 


631 
\subsection{How it works}


632 
\label{sec:SimpHow}


633 


634 
\subsubsection{Higherorder patterns}


635 


636 
\subsubsection{Local assumptions}


637 


638 
\subsubsection{The preprocessor}


639 


640 
\index{simplification)}


641 


642 
\section{Induction heuristics}


643 
\label{sec:InductionHeuristics}


644 


645 
The purpose of this section is to illustrate some simple heuristics for


646 
inductive proofs. The first one we have already mentioned in our initial


647 
example:


648 
\begin{quote}


649 
{\em 1. Theorems about recursive functions are proved by induction.}


650 
\end{quote}


651 
In case the function has more than one argument


652 
\begin{quote}


653 
{\em 2. Do induction on argument number $i$ if the function is defined by


654 
recursion in argument number $i$.}


655 
\end{quote}


656 
When we look at the proof of {\makeatother\isa{(xs @ ys) @ zs = xs @ (ys @


657 
zs)}} in \S\ref{sec:introproof} we find (a) \isa{\at} is recursive in


658 
the first argument, (b) \isa{xs} occurs only as the first argument of


659 
\isa{\at}, and (c) both \isa{ys} and \isa{zs} occur at least once as


660 
the second argument of \isa{\at}. Hence it is natural to perform induction


661 
on \isa{xs}.


662 


663 
The key heuristic, and the main point of this section, is to


664 
generalize the goal before induction. The reason is simple: if the goal is


665 
too specific, the induction hypothesis is too weak to allow the induction


666 
step to go through. Let us now illustrate the idea with an example.


667 


668 
{\makeatother\input{Misc/document/Itrev.tex}}


669 


670 
A final point worth mentioning is the orientation of the equation we just


671 
proved: the more complex notion (\isa{itrev}) is on the lefthand


672 
side, the simpler \isa{rev} on the righthand side. This constitutes


673 
another, albeit weak heuristic that is not restricted to induction:


674 
\begin{quote}


675 
{\em 5. The righthand side of an equation should (in some sense) be


676 
simpler than the lefthand side.}


677 
\end{quote}


678 
The heuristic is tricky to apply because it is not obvious that


679 
\isa{rev xs \at\ ys} is simpler than \isa{itrev xs ys}. But see what

8771

680 
happens if you try to prove \isa{rev xs \at\ ys = itrev xs ys}!

8743

681 

9493

682 
\begin{exercise}


683 
\input{Misc/document/Tree2.tex}%


684 
\end{exercise}

8743

685 


686 
\section{Case study: compiling expressions}


687 
\label{sec:ExprCompiler}


688 


689 
{\makeatother\input{CodeGen/document/CodeGen.tex}}


690 


691 


692 
\section{Advanced datatypes}


693 
\label{sec:advanceddatatypes}


694 
\index{*datatype(}


695 
\index{*primrec(}


696 
%)


697 


698 
This section presents advanced forms of \isacommand{datatype}s.


699 


700 
\subsection{Mutual recursion}


701 
\label{sec:datatypemutrec}


702 


703 
\input{Datatype/document/ABexpr.tex}


704 


705 
\subsection{Nested recursion}


706 


707 
\input{Datatype/document/Nested.tex}


708 


709 


710 
\subsection{The limits of nested recursion}


711 


712 
How far can we push nested recursion? By the unfolding argument above, we can


713 
reduce nested to mutual recursion provided the nested recursion only involves


714 
previously defined datatypes. This does not include functions:


715 
\begin{ttbox}


716 
datatype t = C (t => bool)


717 
\end{ttbox}


718 
is a real can of worms: in HOL it must be ruled out because it requires a type


719 
\isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the


720 
same cardinalityan impossibility. For the same reason it is not possible


721 
to allow recursion involving the type \isa{set}, which is isomorphic to


722 
\isa{t \isasymFun\ bool}.


723 


724 
Fortunately, a limited form of recursion


725 
involving function spaces is permitted: the recursive type may occur on the


726 
right of a function arrow, but never on the left. Hence the above can of worms


727 
is ruled out but the following example of a potentially infinitely branching tree is


728 
accepted:

8771

729 
\smallskip

8743

730 


731 
\input{Datatype/document/Fundata.tex}


732 
\bigskip


733 


734 
If you need nested recursion on the left of a function arrow,


735 
there are alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like


736 
\begin{ttbox}


737 
datatype lam = C (lam > lam)


738 
\end{ttbox}

8771

739 
do indeed make sense (note the intentionally different arrow \isa{>}).

8743

740 
There is even a version of LCF on top of HOL, called


741 
HOLCF~\cite{MuellerNvOS99}.


742 


743 
\index{*primrec)}


744 
\index{*datatype)}


745 


746 
\subsection{Case study: Tries}


747 


748 
Tries are a classic search tree data structure~\cite{Knuth375} for fast


749 
indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a


750 
trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and


751 
``cat''. When searching a string in a trie, the letters of the string are


752 
examined sequentially. Each letter determines which subtrie to search next.


753 
In this case study we model tries as a datatype, define a lookup and an


754 
update function, and prove that they behave as expected.


755 


756 
\begin{figure}[htbp]


757 
\begin{center}


758 
\unitlength1mm


759 
\begin{picture}(60,30)


760 
\put( 5, 0){\makebox(0,0)[b]{l}}


761 
\put(25, 0){\makebox(0,0)[b]{e}}


762 
\put(35, 0){\makebox(0,0)[b]{n}}


763 
\put(45, 0){\makebox(0,0)[b]{r}}


764 
\put(55, 0){\makebox(0,0)[b]{t}}


765 
%


766 
\put( 5, 9){\line(0,1){5}}


767 
\put(25, 9){\line(0,1){5}}


768 
\put(44, 9){\line(3,2){9}}


769 
\put(45, 9){\line(0,1){5}}


770 
\put(46, 9){\line(3,2){9}}


771 
%


772 
\put( 5,10){\makebox(0,0)[b]{l}}


773 
\put(15,10){\makebox(0,0)[b]{n}}


774 
\put(25,10){\makebox(0,0)[b]{p}}


775 
\put(45,10){\makebox(0,0)[b]{a}}


776 
%


777 
\put(14,19){\line(3,2){9}}


778 
\put(15,19){\line(0,1){5}}


779 
\put(16,19){\line(3,2){9}}


780 
\put(45,19){\line(0,1){5}}


781 
%


782 
\put(15,20){\makebox(0,0)[b]{a}}


783 
\put(45,20){\makebox(0,0)[b]{c}}


784 
%


785 
\put(30,30){\line(3,2){13}}


786 
\put(30,30){\line(3,2){13}}


787 
\end{picture}


788 
\end{center}


789 
\caption{A sample trie}


790 
\label{fig:trie}


791 
\end{figure}


792 


793 
Proper tries associate some value with each string. Since the


794 
information is stored only in the final node associated with the string, many


795 
nodes do not carry any value. This distinction is captured by the

8771

796 
following predefined datatype (from theory \isa{Option}, which is a parent


797 
of \isa{Main}):


798 
\smallskip

8743

799 
\input{Trie/document/Option2.tex}

8771

800 
\indexbold{*option}\indexbold{*None}\indexbold{*Some}%

8743

801 
\input{Trie/document/Trie.tex}


802 


803 
\begin{exercise}


804 
Write an improved version of \isa{update} that does not suffer from the


805 
space leak in the version above. Prove the main theorem for your improved


806 
\isa{update}.


807 
\end{exercise}


808 


809 
\begin{exercise}


810 
Write a function to \emph{delete} entries from a trie. An easy solution is


811 
a clever modification of \isa{update} which allows both insertion and


812 
deletion with a single function. Prove (a modified version of) the main


813 
theorem above. Optimize you function such that it shrinks tries after


814 
deletion, if possible.


815 
\end{exercise}


816 


817 
\section{Total recursive functions}


818 
\label{sec:recdef}


819 
\index{*recdef(}


820 


821 
Although many total functions have a natural primitive recursive definition,


822 
this is not always the case. Arbitrary total recursive functions can be


823 
defined by means of \isacommand{recdef}: you can use full patternmatching,


824 
recursion need not involve datatypes, and termination is proved by showing


825 
that the arguments of all recursive calls are smaller in a suitable (user


826 
supplied) sense.


827 


828 
\subsection{Defining recursive functions}


829 


830 
\input{Recdef/document/examples.tex}


831 


832 
\subsection{Proving termination}


833 


834 
\input{Recdef/document/termination.tex}


835 


836 
\subsection{Simplification with recdef}


837 


838 
\input{Recdef/document/simplification.tex}


839 


840 


841 
\subsection{Induction}


842 
\index{induction!recursion(}


843 
\index{recursion induction(}


844 


845 
\input{Recdef/document/Induction.tex}


846 


847 
\index{induction!recursion)}


848 
\index{recursion induction)}


849 
\index{*recdef)}
