8743

1 
\chapter{Functional Programming in HOL}


2 


3 
Although on the surface this chapter is mainly concerned with how to write


4 
functional programs in HOL and how to verify them, most of the


5 
constructs and proof procedures introduced are general purpose and recur in


6 
any specification or verification task.


7 

9541

8 
The dedicated functional programmer should be warned: HOL offers only


9 
\emph{total functional programming}  all functions in HOL must be total;


10 
lazy data structures are not directly available. On the positive side,


11 
functions in HOL need not be computable: HOL is a specification language that


12 
goes well beyond what can be expressed as a program. However, for the time


13 
being we concentrate on the computable.

8743

14 


15 
\section{An introductory theory}


16 
\label{sec:introtheory}


17 


18 
Functional programming needs datatypes and functions. Both of them can be


19 
defined in a theory with a syntax reminiscent of languages like ML or


20 
Haskell. As an example consider the theory in figure~\ref{fig:ToyList}.


21 
We will now examine it line by line.


22 


23 
\begin{figure}[htbp]


24 
\begin{ttbox}\makeatother


25 
\input{ToyList2/ToyList1}\end{ttbox}


26 
\caption{A theory of lists}


27 
\label{fig:ToyList}


28 
\end{figure}


29 


30 
{\makeatother\input{ToyList/document/ToyList.tex}}


31 


32 
The complete proof script is shown in figure~\ref{fig:ToyListproofs}. The


33 
concatenation of figures \ref{fig:ToyList} and \ref{fig:ToyListproofs}


34 
constitutes the complete theory \texttt{ToyList} and should reside in file


35 
\texttt{ToyList.thy}. It is good practice to present all declarations and


36 
definitions at the beginning of a theory to facilitate browsing.


37 


38 
\begin{figure}[htbp]


39 
\begin{ttbox}\makeatother


40 
\input{ToyList2/ToyList2}\end{ttbox}


41 
\caption{Proofs about lists}


42 
\label{fig:ToyListproofs}


43 
\end{figure}


44 


45 
\subsubsection*{Review}


46 


47 
This is the end of our toy proof. It should have familiarized you with


48 
\begin{itemize}


49 
\item the standard theorem proving procedure:


50 
state a goal (lemma or theorem); proceed with proof until a separate lemma is


51 
required; prove that lemma; come back to the original goal.


52 
\item a specific procedure that works well for functional programs:


53 
induction followed by allout simplification via \isa{auto}.


54 
\item a basic repertoire of proof commands.


55 
\end{itemize}


56 


57 


58 
\section{Some helpful commands}


59 
\label{sec:commandsandhints}


60 


61 
This section discusses a few basic commands for manipulating the proof state


62 
and can be skipped by casual readers.


63 


64 
There are two kinds of commands used during a proof: the actual proof


65 
commands and auxiliary commands for examining the proof state and controlling


66 
the display. Simple proof commands are of the form


67 
\isacommand{apply}\isa{(method)}\indexbold{apply} where \bfindex{method} is a


68 
synonym for ``theorem proving function''. Typical examples are


69 
\isa{induct_tac} and \isa{auto}. Further methods are introduced throughout


70 
the tutorial. Unless stated otherwise you may assume that a method attacks


71 
merely the first subgoal. An exception is \isa{auto} which tries to solve all


72 
subgoals.


73 


74 
The most useful auxiliary commands are:


75 
\begin{description}


76 
\item[Undoing:] \isacommand{undo}\indexbold{*undo} undoes the effect of the


77 
last command; \isacommand{undo} can be undone by


78 
\isacommand{redo}\indexbold{*redo}. Both are only needed at the shell


79 
level and should not occur in the final theory.


80 
\item[Printing the current state:] \isacommand{pr}\indexbold{*pr} redisplays


81 
the current proof state, for example when it has disappeared off the


82 
screen.


83 
\item[Limiting the number of subgoals:] \isacommand{pr}~$n$ tells Isabelle to


84 
print only the first $n$ subgoals from now on and redisplays the current


85 
proof state. This is helpful when there are many subgoals.


86 
\item[Modifying the order of subgoals:]


87 
\isacommand{defer}\indexbold{*defer} moves the first subgoal to the end and


88 
\isacommand{prefer}\indexbold{*prefer}~$n$ moves subgoal $n$ to the front.


89 
\item[Printing theorems:]


90 
\isacommand{thm}\indexbold{*thm}~\textit{name}$@1$~\dots~\textit{name}$@n$


91 
prints the named theorems.


92 
\item[Displaying types:] We have already mentioned the flag


93 
\ttindex{show_types} above. It can also be useful for detecting typos in


94 
formulae early on. For example, if \texttt{show_types} is set and the goal


95 
\isa{rev(rev xs) = xs} is started, Isabelle prints the additional output


96 
\par\noindent


97 
\begin{isabelle}%


98 
Variables:\isanewline


99 
~~xs~::~'a~list


100 
\end{isabelle}%


101 
\par\noindent


102 
which tells us that Isabelle has correctly inferred that


103 
\isa{xs} is a variable of list type. On the other hand, had we


104 
made a typo as in \isa{rev(re xs) = xs}, the response


105 
\par\noindent


106 
\begin{isabelle}%


107 
Variables:\isanewline


108 
~~re~::~'a~list~{\isasymRightarrow}~'a~list\isanewline


109 
~~xs~::~'a~list%


110 
\end{isabelle}%


111 
\par\noindent


112 
would have alerted us because of the unexpected variable \isa{re}.


113 
\item[Reading terms and types:] \isacommand{term}\indexbold{*term}


114 
\textit{string} reads, typechecks and prints the given string as a term in


115 
the current context; the inferred type is output as well.


116 
\isacommand{typ}\indexbold{*typ} \textit{string} reads and prints the given


117 
string as a type in the current context.


118 
\item[(Re)loading theories:] When you start your interaction you must open a

8771

119 
named theory with the line \isa{\isacommand{theory}~T~=~\dots~:}. Isabelle


120 
automatically loads all the required parent theories from their respective


121 
files (which may take a moment, unless the theories are already loaded and

9541

122 
the files have not been modified).

8743

123 


124 
If you suddenly discover that you need to modify a parent theory of your

9494

125 
current theory you must first abandon your current theory\indexbold{abandon


126 
theory}\indexbold{theory!abandon} (at the shell

8743

127 
level type \isacommand{kill}\indexbold{*kill}). After the parent theory has


128 
been modified you go back to your original theory. When its first line


129 
\isacommand{theory}\texttt{~T~=}~\dots~\texttt{:} is processed, the


130 
modified parent is reloaded automatically.

9541

131 


132 
The only time when you need to load a theory by hand is when you simply


133 
want to check if it loads successfully without wanting to make use of the


134 
theory itself. This you can do by typing


135 
\isa{\isacommand{use\_thy}\indexbold{*use_thy}~"T"}.

8743

136 
\end{description}


137 
Further commands are found in the Isabelle/Isar Reference Manual.


138 

8771

139 
We now examine Isabelle's functional programming constructs systematically,


140 
starting with inductive datatypes.


141 

8743

142 


143 
\section{Datatypes}


144 
\label{sec:datatype}


145 


146 
Inductive datatypes are part of almost every nontrivial application of HOL.


147 
First we take another look at a very important example, the datatype of


148 
lists, before we turn to datatypes in general. The section closes with a


149 
case study.


150 


151 


152 
\subsection{Lists}


153 
\indexbold{*list}


154 


155 
Lists are one of the essential datatypes in computing. Readers of this


156 
tutorial and users of HOL need to be familiar with their basic operations.

8771

157 
Theory \isa{ToyList} is only a small fragment of HOL's predefined theory


158 
\isa{List}\footnote{\url{http://isabelle.in.tum.de/library/HOL/List.html}}.

8743

159 
The latter contains many further operations. For example, the functions

8771

160 
\isaindexbold{hd} (``head'') and \isaindexbold{tl} (``tail'') return the first

8743

161 
element and the remainder of a list. (However, patternmatching is usually

8771

162 
preferable to \isa{hd} and \isa{tl}.) Theory \isa{List} also contains

8743

163 
more syntactic sugar: \isa{[}$x@1$\isa{,}\dots\isa{,}$x@n$\isa{]} abbreviates


164 
$x@1$\isa{\#}\dots\isa{\#}$x@n$\isa{\#[]}. In the rest of the tutorial we


165 
always use HOL's predefined lists.


166 


167 


168 
\subsection{The general format}


169 
\label{sec:generaldatatype}


170 


171 
The general HOL \isacommand{datatype} definition is of the form


172 
\[


173 
\isacommand{datatype}~(\alpha@1, \dots, \alpha@n) \, t ~=~


174 
C@1~\tau@{11}~\dots~\tau@{1k@1} ~\mid~ \dots ~\mid~


175 
C@m~\tau@{m1}~\dots~\tau@{mk@m}


176 
\]

8771

177 
where $\alpha@i$ are distinct type variables (the parameters), $C@i$ are distinct

8743

178 
constructor names and $\tau@{ij}$ are types; it is customary to capitalize


179 
the first letter in constructor names. There are a number of


180 
restrictions (such as that the type should not be empty) detailed


181 
elsewhere~\cite{isabelleHOL}. Isabelle notifies you if you violate them.


182 


183 
Laws about datatypes, such as \isa{[] \isasymnoteq~x\#xs} and


184 
\isa{(x\#xs = y\#ys) = (x=y \isasymand~xs=ys)}, are used automatically


185 
during proofs by simplification. The same is true for the equations in


186 
primitive recursive function definitions.


187 

9644

188 
Every datatype $t$ comes equipped with a \isa{size} function from $t$ into


189 
the natural numbers (see~\S\ref{sec:nat} below). For lists, \isa{size} is


190 
just the length, i.e.\ \isa{size [] = 0} and \isa{size(x \# xs) = size xs +


191 
1}. In general, \isa{size} returns \isa{0} for all constructors that do


192 
not have an argument of type $t$, and for all other constructors \isa{1 +}


193 
the sum of the sizes of all arguments of type $t$. Note that because


194 
\isa{size} is defined on every datatype, it is overloaded; on lists


195 
\isa{size} is also called \isa{length}, which is not overloaded.


196 


197 

8743

198 
\subsection{Primitive recursion}


199 


200 
Functions on datatypes are usually defined by recursion. In fact, most of the


201 
time they are defined by what is called \bfindex{primitive recursion}.


202 
The keyword \isacommand{primrec}\indexbold{*primrec} is followed by a list of


203 
equations


204 
\[ f \, x@1 \, \dots \, (C \, y@1 \, \dots \, y@k)\, \dots \, x@n = r \]


205 
such that $C$ is a constructor of the datatype $t$ and all recursive calls of


206 
$f$ in $r$ are of the form $f \, \dots \, y@i \, \dots$ for some $i$. Thus


207 
Isabelle immediately sees that $f$ terminates because one (fixed!) argument


208 
becomes smaller with every recursive call. There must be exactly one equation


209 
for each constructor. Their order is immaterial.

8771

210 
A more general method for defining total recursive functions is introduced in

8743

211 
\S\ref{sec:recdef}.


212 

9493

213 
\begin{exercise}\label{ex:Tree}

8743

214 
\input{Misc/document/Tree.tex}%


215 
\end{exercise}


216 


217 
\subsection{Case expressions}


218 
\label{sec:caseexpressions}


219 


220 
HOL also features \isaindexbold{case}expressions for analyzing


221 
elements of a datatype. For example,


222 
% case xs of [] => 0  y#ys => y


223 
\begin{isabellepar}%


224 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~y~\#~ys~{\isasymRightarrow}~y


225 
\end{isabellepar}%


226 
evaluates to \isa{0} if \isa{xs} is \isa{[]} and to \isa{y} if


227 
\isa{xs} is \isa{y\#ys}. (Since the result in both branches must be of


228 
the same type, it follows that \isa{y::nat} and hence


229 
\isa{xs::(nat)list}.)


230 


231 
In general, if $e$ is a term of the datatype $t$ defined in


232 
\S\ref{sec:generaldatatype} above, the corresponding


233 
\isa{case}expression analyzing $e$ is


234 
\[


235 
\begin{array}{rrcl}


236 
\isa{case}~e~\isa{of} & C@1~x@{11}~\dots~x@{1k@1} & \To & e@1 \\


237 
\vdots \\


238 
\mid & C@m~x@{m1}~\dots~x@{mk@m} & \To & e@m


239 
\end{array}


240 
\]


241 


242 
\begin{warn}


243 
{\em All} constructors must be present, their order is fixed, and nested


244 
patterns are not supported. Violating these restrictions results in strange


245 
error messages.


246 
\end{warn}


247 
\noindent


248 
Nested patterns can be simulated by nested \isa{case}expressions: instead


249 
of


250 
% case xs of [] => 0  [x] => x  x#(y#zs) => y


251 
\begin{isabellepar}%


252 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~[x]~{\isasymRightarrow}~x~~x~\#~y~\#~zs~{\isasymRightarrow}~y


253 
\end{isabellepar}%


254 
write


255 
% term(latex xsymbols symbols)"case xs of [] => 0  x#ys => (case ys of [] => x  y#zs => y)";


256 
\begin{isabellepar}%


257 
~~~case~xs~of~[]~{\isasymRightarrow}~0~~x~\#~ys~{\isasymRightarrow}~(case~ys~of~[]~{\isasymRightarrow}~x~~y~\#~zs~{\isasymRightarrow}~y)%


258 
\end{isabellepar}%


259 
Note that \isa{case}expressions should be enclosed in parentheses to


260 
indicate their scope.


261 


262 
\subsection{Structural induction and case distinction}


263 
\indexbold{structural induction}


264 
\indexbold{induction!structural}


265 
\indexbold{case distinction}


266 


267 
Almost all the basic laws about a datatype are applied automatically during


268 
simplification. Only induction is invoked by hand via \isaindex{induct_tac},


269 
which works for any datatype. In some cases, induction is overkill and a case


270 
distinction over all constructors of the datatype suffices. This is performed


271 
by \isaindexbold{case_tac}. A trivial example:


272 


273 
\input{Misc/document/cases.tex}%


274 


275 
Note that we do not need to give a lemma a name if we do not intend to refer


276 
to it explicitly in the future.


277 


278 
\begin{warn}


279 
Induction is only allowed on free (or \isasymAndbound) variables that

9644

280 
should not occur among the assumptions of the subgoal; see


281 
\S\ref{sec:indvarinprems} for details. Case distinction

8743

282 
(\isa{case_tac}) works for arbitrary terms, which need to be


283 
quoted if they are nonatomic.


284 
\end{warn}


285 


286 


287 
\subsection{Case study: boolean expressions}


288 
\label{sec:boolex}


289 


290 
The aim of this case study is twofold: it shows how to model boolean


291 
expressions and some algorithms for manipulating them, and it demonstrates


292 
the constructs introduced above.


293 


294 
\input{Ifexpr/document/Ifexpr.tex}

8771

295 
\medskip

8743

296 


297 
How does one come up with the required lemmas? Try to prove the main theorems


298 
without them and study carefully what \isa{auto} leaves unproved. This has


299 
to provide the clue. The necessity of universal quantification


300 
(\isa{\isasymforall{}t e}) in the two lemmas is explained in


301 
\S\ref{sec:InductionHeuristics}


302 


303 
\begin{exercise}


304 
We strengthen the definition of a \isa{normal} Ifexpression as follows:


305 
the first argument of all \isa{IF}s must be a variable. Adapt the above


306 
development to this changed requirement. (Hint: you may need to formulate


307 
some of the goals as implications (\isasymimp) rather than equalities


308 
(\isa{=}).)


309 
\end{exercise}


310 


311 
\section{Some basic types}


312 


313 


314 
\subsection{Natural numbers}

9644

315 
\label{sec:nat}

8743

316 
\index{arithmetic(}


317 


318 
\input{Misc/document/fakenat.tex}


319 
\input{Misc/document/natsum.tex}


320 


321 
The usual arithmetic operations \ttindexboldpos{+}{$HOL2arithfun},


322 
\ttindexboldpos{}{$HOL2arithfun}, \ttindexboldpos{*}{$HOL2arithfun},


323 
\isaindexbold{div}, \isaindexbold{mod}, \isaindexbold{min} and


324 
\isaindexbold{max} are predefined, as are the relations


325 
\indexboldpos{\isasymle}{$HOL2arithrel} and


326 
\ttindexboldpos{<}{$HOL2arithrel}. There is even a least number operation


327 
\isaindexbold{LEAST}. For example, \isa{(LEAST n.$\,$1 < n) = 2}, although


328 
Isabelle does not prove this completely automatically. Note that \isa{1} and


329 
\isa{2} are available as abbreviations for the corresponding


330 
\isa{Suc}expressions. If you need the full set of numerals,


331 
see~\S\ref{natnumerals}.


332 


333 
\begin{warn}

9494

334 
The constant \ttindexbold{0} and the operations


335 
\ttindexboldpos{+}{$HOL2arithfun}, \ttindexboldpos{}{$HOL2arithfun},


336 
\ttindexboldpos{*}{$HOL2arithfun}, \isaindexbold{min}, \isaindexbold{max},

8743

337 
\indexboldpos{\isasymle}{$HOL2arithrel} and


338 
\ttindexboldpos{<}{$HOL2arithrel} are overloaded, i.e.\ they are available


339 
not just for natural numbers but at other types as well (see

9494

340 
\S\ref{sec:TypeClasses}). For example, given the goal \isa{x+0 = x}, there


341 
is nothing to indicate that you are talking about natural numbers. Hence


342 
Isabelle can only infer that \isa{x} is of some arbitrary type where


343 
\isa{0} and \isa{+} are declared. As a consequence, you will be unable to


344 
prove the goal (although it may take you some time to realize what has


345 
happened if \texttt{show_types} is not set). In this particular example,


346 
you need to include an explicit type constraint, for example \isa{x+0 =


347 
(x::nat)}. If there is enough contextual information this may not be


348 
necessary: \isa{Suc x = x} automatically implies \isa{x::nat} because


349 
\isa{Suc} is not overloaded.

8743

350 
\end{warn}


351 


352 
Simple arithmetic goals are proved automatically by both \isa{auto}


353 
and the simplification methods introduced in \S\ref{sec:Simplification}. For


354 
example,


355 


356 
\input{Misc/document/arith1.tex}%


357 
is proved automatically. The main restriction is that only addition is taken


358 
into account; other arithmetic operations and quantified formulae are ignored.


359 


360 
For more complex goals, there is the special method


361 
\isaindexbold{arith} (which attacks the first subgoal). It proves


362 
arithmetic goals involving the usual logical connectives (\isasymnot,


363 
\isasymand, \isasymor, \isasymimp), the relations \isasymle\ and \isa{<}, and


364 
the operations \isa{+}, \isa{}, \isa{min} and \isa{max}. For example,


365 


366 
\input{Misc/document/arith2.tex}%


367 
succeeds because \isa{k*k} can be treated as atomic.


368 
In contrast,


369 


370 
\input{Misc/document/arith3.tex}%


371 
is not even proved by \isa{arith} because the proof relies essentially


372 
on properties of multiplication.


373 


374 
\begin{warn}


375 
The running time of \isa{arith} is exponential in the number of occurrences


376 
of \ttindexboldpos{}{$HOL2arithfun}, \isaindexbold{min} and


377 
\isaindexbold{max} because they are first eliminated by case distinctions.


378 


379 
\isa{arith} is incomplete even for the restricted class of formulae


380 
described above (known as ``linear arithmetic''). If divisibility plays a


381 
role, it may fail to prove a valid formula, for example $m+m \neq n+n+1$.


382 
Fortunately, such examples are rare in practice.


383 
\end{warn}


384 


385 
\index{arithmetic)}


386 


387 


388 
\subsection{Products}

9541

389 
\input{Misc/document/pairs.tex}

8743

390 


391 
%FIXME move stuff below into section on proofs about products?


392 
\begin{warn}


393 
Abstraction over pairs and tuples is merely a convenient shorthand for a


394 
more complex internal representation. Thus the internal and external form


395 
of a term may differ, which can affect proofs. If you want to avoid this


396 
complication, use \isa{fst} and \isa{snd}, i.e.\ write


397 
\isa{\isasymlambda{}p.~fst p + snd p} instead of


398 
\isa{\isasymlambda(x,y).~x + y}. See~\S\ref{proofsaboutproducts} for


399 
theorem proving with tuple patterns.


400 
\end{warn}


401 

9541

402 
Note that products, like natural numbers, are datatypes, which means

8743

403 
in particular that \isa{induct_tac} and \isa{case_tac} are applicable to


404 
products (see \S\ref{proofsaboutproducts}).


405 

9541

406 
Instead of tuples with many components (where ``many'' is not much above 2),


407 
it is far preferable to use record types (see \S\ref{sec:records}).


408 

8743

409 
\section{Definitions}


410 
\label{sec:Definitions}


411 


412 
A definition is simply an abbreviation, i.e.\ a new name for an existing


413 
construction. In particular, definitions cannot be recursive. Isabelle offers


414 
definitions on the level of types and terms. Those on the type level are


415 
called type synonyms, those on the term level are called (constant)


416 
definitions.


417 


418 


419 
\subsection{Type synonyms}

8771

420 
\indexbold{type synonym}

8743

421 


422 
Type synonyms are similar to those found in ML. Their syntax is fairly self


423 
explanatory:


424 


425 
\input{Misc/document/types.tex}%


426 


427 
Note that patternmatching is not allowed, i.e.\ each definition must be of


428 
the form $f\,x@1\,\dots\,x@n~\isasymequiv~t$.


429 
Section~\S\ref{sec:Simplification} explains how definitions are used


430 
in proofs.


431 


432 
\begin{warn}%


433 
A common mistake when writing definitions is to introduce extra free


434 
variables on the righthand side as in the following fictitious definition:


435 
\input{Misc/document/prime_def.tex}%


436 
\end{warn}


437 


438 


439 
\chapter{More Functional Programming}


440 


441 
The purpose of this chapter is to deepen the reader's understanding of the

8771

442 
concepts encountered so far and to introduce advanced forms of datatypes and


443 
recursive functions. The first two sections give a structured presentation of


444 
theorem proving by simplification (\S\ref{sec:Simplification}) and discuss


445 
important heuristics for induction (\S\ref{sec:InductionHeuristics}). They can


446 
be skipped by readers less interested in proofs. They are followed by a case


447 
study, a compiler for expressions (\S\ref{sec:ExprCompiler}). Advanced


448 
datatypes, including those involving function spaces, are covered in


449 
\S\ref{sec:advanceddatatypes}, which closes with another case study, search


450 
trees (``tries''). Finally we introduce \isacommand{recdef}, a very general


451 
form of recursive function definition which goes well beyond what


452 
\isacommand{primrec} allows (\S\ref{sec:recdef}).

8743

453 


454 


455 
\section{Simplification}


456 
\label{sec:Simplification}


457 
\index{simplification(}


458 

9541

459 
So far we have proved our theorems by \isa{auto}, which ``simplifies''


460 
\emph{all} subgoals. In fact, \isa{auto} can do much more than that, except


461 
that it did not need to so far. However, when you go beyond toy examples, you


462 
need to understand the ingredients of \isa{auto}. This section covers the


463 
method that \isa{auto} always applies first, namely simplification.

8743

464 


465 
Simplification is one of the central theorem proving tools in Isabelle and


466 
many other systems. The tool itself is called the \bfindex{simplifier}. The


467 
purpose of this section is twofold: to introduce the many features of the


468 
simplifier (\S\ref{sec:SimpFeatures}) and to explain a little bit how the


469 
simplifier works (\S\ref{sec:SimpHow}). Anybody intending to use HOL should


470 
read \S\ref{sec:SimpFeatures}, and the serious student should read


471 
\S\ref{sec:SimpHow} as well in order to understand what happened in case


472 
things do not simplify as expected.


473 


474 


475 
\subsection{Using the simplifier}


476 
\label{sec:SimpFeatures}


477 

9458

478 
\subsubsection{What is simplification}


479 

8743

480 
In its most basic form, simplification means repeated application of


481 
equations from left to right. For example, taking the rules for \isa{\at}


482 
and applying them to the term \isa{[0,1] \at\ []} results in a sequence of


483 
simplification steps:


484 
\begin{ttbox}\makeatother


485 
(0#1#[]) @ [] \(\leadsto\) 0#((1#[]) @ []) \(\leadsto\) 0#(1#([] @ [])) \(\leadsto\) 0#1#[]


486 
\end{ttbox}


487 
This is also known as \bfindex{term rewriting} and the equations are referred

8771

488 
to as \bfindex{rewrite rules}. ``Rewriting'' is more honest than ``simplification''

8743

489 
because the terms do not necessarily become simpler in the process.


490 


491 
\subsubsection{Simplification rules}


492 
\indexbold{simplification rules}


493 


494 
To facilitate simplification, theorems can be declared to be simplification


495 
rules (with the help of the attribute \isa{[simp]}\index{*simp


496 
(attribute)}), in which case proofs by simplification make use of these

8771

497 
rules automatically. In addition the constructs \isacommand{datatype} and

8743

498 
\isacommand{primrec} (and a few others) invisibly declare useful


499 
simplification rules. Explicit definitions are \emph{not} declared


500 
simplification rules automatically!


501 


502 
Not merely equations but pretty much any theorem can become a simplification


503 
rule. The simplifier will try to make sense of it. For example, a theorem


504 
\isasymnot$P$ is automatically turned into \isa{$P$ = False}. The details


505 
are explained in \S\ref{sec:SimpHow}.


506 


507 
The simplification attribute of theorems can be turned on and off as follows:


508 
\begin{ttbox}

9541

509 
lemmas [simp] = \(list of theorem names\);


510 
lemmas [simp del] = \(list of theorem names\);

8743

511 
\end{ttbox}

8771

512 
As a rule of thumb, equations that really simplify (like \isa{rev(rev xs) =

8743

513 
xs} and \mbox{\isa{xs \at\ [] = xs}}) should be made simplification


514 
rules. Those of a more specific nature (e.g.\ distributivity laws, which


515 
alter the structure of terms considerably) should only be used selectively,


516 
i.e.\ they should not be default simplification rules. Conversely, it may


517 
also happen that a simplification rule needs to be disabled in certain


518 
proofs. Frequent changes in the simplification status of a theorem may

8771

519 
indicate a badly designed theory.

8743

520 
\begin{warn}


521 
Simplification may not terminate, for example if both $f(x) = g(x)$ and


522 
$g(x) = f(x)$ are simplification rules. It is the user's responsibility not


523 
to include simplification rules that can lead to nontermination, either on


524 
their own or in combination with other simplification rules.


525 
\end{warn}


526 


527 
\subsubsection{The simplification method}


528 
\index{*simp (method)bold}


529 


530 
The general format of the simplification method is


531 
\begin{ttbox}


532 
simp \(list of modifiers\)


533 
\end{ttbox}


534 
where the list of \emph{modifiers} helps to fine tune the behaviour and may


535 
be empty. Most if not all of the proofs seen so far could have been performed


536 
with \isa{simp} instead of \isa{auto}, except that \isa{simp} attacks

9689

537 
only the first subgoal and may thus need to be repeateduse \isaindex{simp_all}


538 
to simplify all subgoals.

8743

539 
Note that \isa{simp} fails if nothing changes.


540 


541 
\subsubsection{Adding and deleting simplification rules}


542 


543 
If a certain theorem is merely needed in a few proofs by simplification,


544 
we do not need to make it a global simplification rule. Instead we can modify


545 
the set of simplification rules used in a simplification step by adding rules


546 
to it and/or deleting rules from it. The two modifiers for this are


547 
\begin{ttbox}


548 
add: \(list of theorem names\)


549 
del: \(list of theorem names\)


550 
\end{ttbox}


551 
In case you want to use only a specific list of theorems and ignore all


552 
others:


553 
\begin{ttbox}


554 
only: \(list of theorem names\)


555 
\end{ttbox}


556 


557 


558 
\subsubsection{Assumptions}


559 
\index{simplification!with/of assumptions}


560 


561 
{\makeatother\input{Misc/document/asm_simp.tex}}


562 


563 
\subsubsection{Rewriting with definitions}


564 
\index{simplification!with definitions}


565 


566 
\input{Misc/document/def_rewr.tex}


567 


568 
\begin{warn}


569 
If you have defined $f\,x\,y~\isasymequiv~t$ then you can only expand


570 
occurrences of $f$ with at least two arguments. Thus it is safer to define


571 
$f$~\isasymequiv~\isasymlambda$x\,y.\;t$.


572 
\end{warn}


573 


574 
\subsubsection{Simplifying letexpressions}


575 
\index{simplification!of letexpressions}


576 


577 
Proving a goal containing \isaindex{let}expressions almost invariably


578 
requires the \isa{let}con\structs to be expanded at some point. Since

8771

579 
\isa{let}\isa{in} is just syntactic sugar for a predefined constant (called

8743

580 
\isa{Let}), expanding \isa{let}constructs means rewriting with


581 
\isa{Let_def}:


582 


583 
{\makeatother\input{Misc/document/let_rewr.tex}}


584 


585 
\subsubsection{Conditional equations}


586 


587 
\input{Misc/document/cond_rewr.tex}


588 


589 


590 
\subsubsection{Automatic case splits}


591 
\indexbold{case splits}


592 
\index{*split(}


593 


594 
{\makeatother\input{Misc/document/case_splits.tex}}


595 


596 
\index{*split)}


597 


598 
\begin{warn}


599 
The simplifier merely simplifies the condition of an \isa{if} but not the


600 
\isa{then} or \isa{else} parts. The latter are simplified only after the


601 
condition reduces to \isa{True} or \isa{False}, or after splitting. The


602 
same is true for \isaindex{case}expressions: only the selector is


603 
simplified at first, until either the expression reduces to one of the


604 
cases or it is split.


605 
\end{warn}


606 


607 
\subsubsection{Arithmetic}


608 
\index{arithmetic}


609 


610 
The simplifier routinely solves a small class of linear arithmetic formulae

8771

611 
(over type \isa{nat} and other numeric types): it only takes into account

8743

612 
assumptions and conclusions that are (possibly negated) (in)equalities


613 
(\isa{=}, \isasymle, \isa{<}) and it only knows about addition. Thus


614 


615 
\input{Misc/document/arith1.tex}%


616 
is proved by simplification, whereas the only slightly more complex


617 


618 
\input{Misc/document/arith4.tex}%


619 
is not proved by simplification and requires \isa{arith}.


620 


621 
\subsubsection{Permutative rewrite rules}


622 


623 
A rewrite rule is {\bf permutative} if the lefthand side and righthand side


624 
are the same up to renaming of variables. The most common permutative rule


625 
is commutativity: $x+y = y+x$. Another example is $(xy)z = (xz)y$. Such


626 
rules are problematic because once they apply, they can be used forever.


627 
The simplifier is aware of this danger and treats permutative rules


628 
separately. For details see~\cite{isabelleref}.


629 


630 
\subsubsection{Tracing}


631 
\indexbold{tracing the simplifier}


632 


633 
{\makeatother\input{Misc/document/trace_simp.tex}}


634 


635 
\subsection{How it works}


636 
\label{sec:SimpHow}


637 


638 
\subsubsection{Higherorder patterns}


639 


640 
\subsubsection{Local assumptions}


641 


642 
\subsubsection{The preprocessor}


643 


644 
\index{simplification)}


645 


646 
\section{Induction heuristics}


647 
\label{sec:InductionHeuristics}


648 


649 
The purpose of this section is to illustrate some simple heuristics for


650 
inductive proofs. The first one we have already mentioned in our initial


651 
example:


652 
\begin{quote}


653 
{\em 1. Theorems about recursive functions are proved by induction.}


654 
\end{quote}


655 
In case the function has more than one argument


656 
\begin{quote}


657 
{\em 2. Do induction on argument number $i$ if the function is defined by


658 
recursion in argument number $i$.}


659 
\end{quote}


660 
When we look at the proof of {\makeatother\isa{(xs @ ys) @ zs = xs @ (ys @


661 
zs)}} in \S\ref{sec:introproof} we find (a) \isa{\at} is recursive in


662 
the first argument, (b) \isa{xs} occurs only as the first argument of


663 
\isa{\at}, and (c) both \isa{ys} and \isa{zs} occur at least once as


664 
the second argument of \isa{\at}. Hence it is natural to perform induction


665 
on \isa{xs}.


666 


667 
The key heuristic, and the main point of this section, is to


668 
generalize the goal before induction. The reason is simple: if the goal is


669 
too specific, the induction hypothesis is too weak to allow the induction


670 
step to go through. Let us now illustrate the idea with an example.


671 


672 
{\makeatother\input{Misc/document/Itrev.tex}}


673 


674 
A final point worth mentioning is the orientation of the equation we just


675 
proved: the more complex notion (\isa{itrev}) is on the lefthand


676 
side, the simpler \isa{rev} on the righthand side. This constitutes


677 
another, albeit weak heuristic that is not restricted to induction:


678 
\begin{quote}


679 
{\em 5. The righthand side of an equation should (in some sense) be


680 
simpler than the lefthand side.}


681 
\end{quote}


682 
The heuristic is tricky to apply because it is not obvious that


683 
\isa{rev xs \at\ ys} is simpler than \isa{itrev xs ys}. But see what

8771

684 
happens if you try to prove \isa{rev xs \at\ ys = itrev xs ys}!

8743

685 

9493

686 
\begin{exercise}


687 
\input{Misc/document/Tree2.tex}%


688 
\end{exercise}

8743

689 


690 
\section{Case study: compiling expressions}


691 
\label{sec:ExprCompiler}


692 


693 
{\makeatother\input{CodeGen/document/CodeGen.tex}}


694 


695 


696 
\section{Advanced datatypes}


697 
\label{sec:advanceddatatypes}


698 
\index{*datatype(}


699 
\index{*primrec(}


700 
%)


701 


702 
This section presents advanced forms of \isacommand{datatype}s.


703 


704 
\subsection{Mutual recursion}


705 
\label{sec:datatypemutrec}


706 


707 
\input{Datatype/document/ABexpr.tex}


708 


709 
\subsection{Nested recursion}

9644

710 
\label{sec:nesteddatatype}

8743

711 

9644

712 
{\makeatother\input{Datatype/document/Nested.tex}}

8743

713 


714 


715 
\subsection{The limits of nested recursion}


716 


717 
How far can we push nested recursion? By the unfolding argument above, we can


718 
reduce nested to mutual recursion provided the nested recursion only involves


719 
previously defined datatypes. This does not include functions:


720 
\begin{ttbox}


721 
datatype t = C (t => bool)


722 
\end{ttbox}


723 
is a real can of worms: in HOL it must be ruled out because it requires a type


724 
\isa{t} such that \isa{t} and its power set \isa{t \isasymFun\ bool} have the


725 
same cardinalityan impossibility. For the same reason it is not possible


726 
to allow recursion involving the type \isa{set}, which is isomorphic to


727 
\isa{t \isasymFun\ bool}.


728 


729 
Fortunately, a limited form of recursion


730 
involving function spaces is permitted: the recursive type may occur on the


731 
right of a function arrow, but never on the left. Hence the above can of worms


732 
is ruled out but the following example of a potentially infinitely branching tree is


733 
accepted:

8771

734 
\smallskip

8743

735 


736 
\input{Datatype/document/Fundata.tex}


737 
\bigskip


738 


739 
If you need nested recursion on the left of a function arrow,


740 
there are alternatives to pure HOL: LCF~\cite{paulson87} is a logic where types like


741 
\begin{ttbox}


742 
datatype lam = C (lam > lam)


743 
\end{ttbox}

8771

744 
do indeed make sense (note the intentionally different arrow \isa{>}).

8743

745 
There is even a version of LCF on top of HOL, called


746 
HOLCF~\cite{MuellerNvOS99}.


747 


748 
\index{*primrec)}


749 
\index{*datatype)}


750 


751 
\subsection{Case study: Tries}


752 


753 
Tries are a classic search tree data structure~\cite{Knuth375} for fast


754 
indexing with strings. Figure~\ref{fig:trie} gives a graphical example of a


755 
trie containing the words ``all'', ``an'', ``ape'', ``can'', ``car'' and


756 
``cat''. When searching a string in a trie, the letters of the string are


757 
examined sequentially. Each letter determines which subtrie to search next.


758 
In this case study we model tries as a datatype, define a lookup and an


759 
update function, and prove that they behave as expected.


760 


761 
\begin{figure}[htbp]


762 
\begin{center}


763 
\unitlength1mm


764 
\begin{picture}(60,30)


765 
\put( 5, 0){\makebox(0,0)[b]{l}}


766 
\put(25, 0){\makebox(0,0)[b]{e}}


767 
\put(35, 0){\makebox(0,0)[b]{n}}


768 
\put(45, 0){\makebox(0,0)[b]{r}}


769 
\put(55, 0){\makebox(0,0)[b]{t}}


770 
%


771 
\put( 5, 9){\line(0,1){5}}


772 
\put(25, 9){\line(0,1){5}}


773 
\put(44, 9){\line(3,2){9}}


774 
\put(45, 9){\line(0,1){5}}


775 
\put(46, 9){\line(3,2){9}}


776 
%


777 
\put( 5,10){\makebox(0,0)[b]{l}}


778 
\put(15,10){\makebox(0,0)[b]{n}}


779 
\put(25,10){\makebox(0,0)[b]{p}}


780 
\put(45,10){\makebox(0,0)[b]{a}}


781 
%


782 
\put(14,19){\line(3,2){9}}


783 
\put(15,19){\line(0,1){5}}


784 
\put(16,19){\line(3,2){9}}


785 
\put(45,19){\line(0,1){5}}


786 
%


787 
\put(15,20){\makebox(0,0)[b]{a}}


788 
\put(45,20){\makebox(0,0)[b]{c}}


789 
%


790 
\put(30,30){\line(3,2){13}}


791 
\put(30,30){\line(3,2){13}}


792 
\end{picture}


793 
\end{center}


794 
\caption{A sample trie}


795 
\label{fig:trie}


796 
\end{figure}


797 


798 
Proper tries associate some value with each string. Since the


799 
information is stored only in the final node associated with the string, many


800 
nodes do not carry any value. This distinction is captured by the

8771

801 
following predefined datatype (from theory \isa{Option}, which is a parent


802 
of \isa{Main}):


803 
\smallskip

8743

804 
\input{Trie/document/Option2.tex}

8771

805 
\indexbold{*option}\indexbold{*None}\indexbold{*Some}%

8743

806 
\input{Trie/document/Trie.tex}


807 


808 
\begin{exercise}


809 
Write an improved version of \isa{update} that does not suffer from the


810 
space leak in the version above. Prove the main theorem for your improved


811 
\isa{update}.


812 
\end{exercise}


813 


814 
\begin{exercise}


815 
Write a function to \emph{delete} entries from a trie. An easy solution is


816 
a clever modification of \isa{update} which allows both insertion and


817 
deletion with a single function. Prove (a modified version of) the main


818 
theorem above. Optimize you function such that it shrinks tries after


819 
deletion, if possible.


820 
\end{exercise}


821 


822 
\section{Total recursive functions}


823 
\label{sec:recdef}


824 
\index{*recdef(}


825 


826 
Although many total functions have a natural primitive recursive definition,


827 
this is not always the case. Arbitrary total recursive functions can be


828 
defined by means of \isacommand{recdef}: you can use full patternmatching,


829 
recursion need not involve datatypes, and termination is proved by showing


830 
that the arguments of all recursive calls are smaller in a suitable (user


831 
supplied) sense.


832 


833 
\subsection{Defining recursive functions}


834 


835 
\input{Recdef/document/examples.tex}


836 


837 
\subsection{Proving termination}


838 


839 
\input{Recdef/document/termination.tex}


840 


841 
\subsection{Simplification with recdef}


842 


843 
\input{Recdef/document/simplification.tex}


844 


845 
\subsection{Induction}


846 
\index{induction!recursion(}


847 
\index{recursion induction(}


848 


849 
\input{Recdef/document/Induction.tex}

9644

850 
\label{sec:recdefinduction}

8743

851 


852 
\index{induction!recursion)}


853 
\index{recursion induction)}

9644

854 

9689

855 
\subsection{Advanced forms of recursion}


856 
\label{sec:advancedrecdef}

9644

857 

9689

858 
\input{Recdef/document/Nested0.tex}


859 
\input{Recdef/document/Nested1.tex}


860 
\input{Recdef/document/Nested2.tex}

9644

861 

8743

862 
\index{*recdef)}

9644

863 


864 
\section{Advanced induction techniques}


865 
\label{sec:advancedind}


866 
\input{Misc/document/AdvancedInd.tex}


867 

9673

868 
%\input{Datatype/document/Nested2.tex}
